actions / checkout

Action for checking out a repo
https://github.com/features/actions
MIT License
5.9k stars 1.74k forks source link

Windows - Line endings & conversion #135

Open MSP-Greg opened 4 years ago

MSP-Greg commented 4 years ago

Many repos, use a line ending of \n.

When using Windows, these seem to be converted. Often, this isn't an issue with actual code, but has messed up testing. The following seems to fix it, run before the checkout action:

git config --system core.autocrlf false
git config --system core.eol lf

If this could be an option with 'checkout', some might find it helpful...

ericsciple commented 4 years ago

would .gitattributes solve the problem?

MSP-Greg commented 4 years ago

Yes. I use them in my repos.

But, it's a change for people migrating from AppVeyor (along with UTF8 console encoding).

Also, for repos that are predominately *nix software that support Windows, may have binary files, etc...

Close if you'd like, just thought I'd mention it.

devsnek commented 4 years ago

With the temp fix in the OP, I'm getting a permissions error about trying to write to /etc/gitconfig. Any suggestions?

MSP-Greg commented 4 years ago

@devsnek

I apologize, I didn't state in the original post that I noticed the issue when working with ps1 or cmd shells. I've also had the issue locally. Also, locally, I almost never work in the Git bash (or MSYS2) shell.

If you're using a bash shell, maybe try:

git config --global ...

I don't know...

lydell commented 4 years ago

This worked for me:

    steps:
      - name: Set git to use LF
        run: |
          git config --global core.autocrlf false
          git config --global core.eol lf

      - uses: actions/checkout@v2
myitcv commented 4 years ago

Related is https://travis-ci.community/t/files-in-checkout-have-eol-changed-from-lf-to-crlf/349.

Per my comment in https://travis-ci.community/t/files-in-checkout-have-eol-changed-from-lf-to-crlf/349/3, I don't think the default config of core.autocrlf=true on a runner is correct, because otherwise there’s a considerable amount of work required to undo this behaviour.

I would suggest making the default on all runners false and then allowing people to configure it to true as required (not that this should really ever be required).

mvdan commented 4 years ago

I raised the same issue on the forum last year, and while it did get an initial reply from @ethomson, it hasn't seen any response since.

ethomson commented 4 years ago

If this could be an option with 'checkout', some might find it helpful...

Git already has a configuration option, it's the .gitattributes file, which has the added property of being universally supported by all systems, not just GitHub Actions.

Per my comment in https://travis-ci.community/t/files-in-checkout-have-eol-changed-from-lf-to-crlf/349/3, I don't think the default config of core.autocrlf=true on a runner is correct, because otherwise there’s a considerable amount of work required to undo this behaviour.

core.autocrlf's default is true because that's the default in Git for Windows.

Git for Windows installer options page

Most people use this unchanged, and so our runners are configured to match the most common setting. We see the majority of Windows users have configured this option. To change it to false would hurt most repositories that are missing a .gitattributes.

There's no "correct" choice here for users to make: it's purely a preference. And in the absence of correctness, we have to choose the option that supports the most users correctly.

But this is a guess. And we don't want to be in the business of guessing what to do about your line endings. We'd strongly prefer for us to tell us what you want to do. You can do that with a .gitattributes. This places the configuration about your repository where it belongs - in the repository itself.

To emulate the behavior of core.autocrlf=true, add a .gitattributes that is * text=auto. To emulate core.autocrlf=false, add a .gitattributes that is * -text. This will disable text translations (line ending changes).

More information is available in this blog post

myitcv commented 4 years ago

Thanks @ethomson. Doesn't setting * -text have the side effect that when files are added on Windows (for example) they will not have their line endings normalised to LF?

ethomson commented 4 years ago

@myitcv That's correct - sorry, I did sort of gloss over the many combinations of settings that are possible. 😰

Thanks @ethomson. Doesn't setting * -text have the side effect that when files are added on Windows (for example) they will not have their line endings normalised to LF?

Yes - that's correct - * -text will prevent any line ending translation. This should map to core.autocrlf=false in the absence of a core.eol setting.

There is an interesting bit to the translation of core.autocrlf/core.eol and .gitattributes - core.autocrlf=false is required for core.eol to have any effect - on the other hand, the eol attribute implies the text attribute (not -text).

I don't think that I had really realized this seeming mismatch until you asked - it's not that core.autocrlf precisely means "do text translation" when true, you can set core.autocrlf=false and still see text translation, depending on the setting of core.eol.

I think that all the options for core.autocrlf plus core.eol are expressible with attributes.

Configuration: core.autocrlf=true Attributes: * text=auto Notes: core.eol cannot be used when core.autocrlf=true

Configuration: core.autocrlf=input Attributes: * text eol=lf Effect: Files on disk will are LF (all platforms), files in repository are LF. (This is equivalent to core.autocrlf=false with core.eol=lf.) core.eol cannot be used when core.autocrlf=true

Configuration: core.autocrlf=false Attributes: * -text Effect: Files on disk are whatever they were created in, checked in to the repository literally

Configuration: core.autocrlf=false core.eol=lf Attributes: * text eol=lf Effect: Files on disk are LF (all platforms), files in repository are LF

Configuration: core.autocrlf=false core.eol=crlf Attributes: * text eol=crlf Effect: Files on disk are CRLF (all platforms), files in repository are LF

Configuration: core.autocrlf=false core.eol=native Attributes: * text=auto Effect: Files on disk are CRLF (Windows), LF (everywhere else), files in repository are LF

myitcv commented 4 years ago

Thanks @ethomson

There's no "correct" choice here for users to make:

This statement is the key here, and to your earlier point it makes sense for the default behaviour to be that one which will be most useful to the majority.

ericsciple commented 4 years ago

I'm going to add a troubleshooting doc. I'll add a section for this

spk121 commented 3 years ago

I also ran into this surprise. Using .gitattributes fixed it, but, I rather expected that there might be a hint about it in the README for checkout@v2

lydell commented 3 years ago

I’ve actually changed my mind on this. I now think it’s good that files are checked out with CRLF on Windows, because it’s helped me find two legit CRLF bugs in two projects.

My original issue was that Prettier failed on Windows because it mandates LF newlines. The real solution to that is to only run Prettier on one OS (like Ubuntu, since it’s fast) and one Node.js version (instead of on all combinations, like I did before). It’s totally worth having a separate workflow for linting – it’s much faster!

Though I’ll have to note that I’ve had a few really annoying test failures due to CRLF, too, that didn’t provide any value and that I had to code around. But it still felt good finding those actual CRLF bugs.

kriegaex commented 3 years ago

I agree that this should be configurable, because project conventions and user preferences are not the same everywhere. For example, I use "check out as-is, commit Unix-style line endings" whenever I install Git for Windows. I also have compelling reasons why I do this:

Therefore, duplicate caches (in my case for a local Maven repository) are falsely being created only because the checkout action does not support config options for the corresponding Git options. I noticed this accidentally after having used the cache action for two years already in my projects. Now I am supposed to add extra workflow steps to each of them as a workaround? Besides, all other checkout action users who wish to have efficient caching (saving storage, I/O, bandwidth and computation resources provided by GitHub) also have to be lucky enough to (a) detect this problem and (b) find the workaround in an open issue, instead of simply finding a hint in this action's documentation, which then also could be pointed to by the cache action's documentation, so as to educate users about how to avoid this problem.

ethomson commented 3 years ago

Therefore, duplicate caches (in my case for a local Maven repository) are falsely being created only because the checkout action does not support config options for the corresponding Git options

It does - those configuration options should be specified in .gitattributes, where they will be honored by all git clients, including actions/checkout.

kriegaex commented 3 years ago

I need an option for this action only, not for all clients.

kriegaex commented 3 years ago

BTW, I am requesting an option here, not a change of default behaviour. Like someone said earlier, there is no one right way to configure a Git client on Windows. It depends on the project, organisational conventions, personal preferences. Therefore, making it an option in this action is an adequate, user-friendly way to address the issue without forcing millions of projects to add a .gitattributes file affecting all users cloning the repository, just because they want to use the same cache on Windows as for the other platforms for GitHub actions.

kmilos commented 2 years ago

The workaround no longer seems to work:

Run git config --global core.autocrlf input
  git config --global core.autocrlf input
  Error: Second path fragment must not be a drive or UNC name. (Parameter 'expression')

Edit: Sorry for the noise, it was because I changed the default shell...

undergroundwires commented 2 years ago

Creating .gitattributes file as following solved the issue for me:

* text=auto eol=lf 

It prevents Git from auto-converting to CRLF on Windows, and convert to LF on checkin.

Part Meaning
* All files
text=auto If Git decides content is text, it converts CRLF to LF on checkin.
eol=lf Forces Git to normalize line endings to LF on checkin and prevents conversion to CRLF when the file is checked out.
kriegaex commented 2 years ago

@undergroundwires: Like I said before: I need an option for this action only. A .gitattributes file affects all Git clients, which is not what most projects would want. It is nice that it works for you, but for me it is not an option to add a global Git configuration just for the sake of the Java checkout option.

ilovelinux commented 1 year ago

Partially related to https://github.com/actions/runner/issues/2071

exFalso commented 5 months ago

There's no "correct" choice here for users to make: it's purely a preference. And in the absence of correctness, we have to choose the option that supports the most users correctly.

I would have to disagree, it's very clearly wrong to set this to anything but false. It can cause strange reproducibility issues, merge issues, hash mismatches etc etc. This is even more so true for git actions which are commonly used to build release artifacts. I don't even understand why there is a setting for this in git, what does the line ending have to do with revision control?

Also, just to demonstrate, see the list of issues linking this issue. And these are just the public issues.

MSP-Greg commented 5 months ago

Haven't looked at this for a while. Much of the discussion seems to assume changing checkout's defaults.

What I should have mentioned is something like adding an input option like win-force-lf. Don't change the defaults, but allow an easy fix that doesn't require knowledge of git config or access to a Windows system.

I have often helped repos that have no maintainers/owners that use Windows...

exFalso commented 5 months ago

It's not the default! The default is false (see here), and the windows installer (which is installing on a dev machine) picks a different default. Why is the installer used as a baseline for a headless github action?

ethomson commented 5 months ago

You shouldn't be using core.autocrlf. Line ending configuration is per repository - the configuration affects not only how things are placed on disk but also in the repository, so to rely on core.autocrlf would require every user of the repository to set it to the same value on their local machine.

This is impractical, as this discussion shows. Check in your line ending configuration with gitattributes. This ensures that all developers have the same settings for each repository they use.

exFalso commented 5 months ago

We have zero developers using windows. The only reason we ended up at this issue is because we had a nasty reproducibility bug that affected customers using windows, as our build artifacts had these random carriage returns due to this setting. It's infuriating, because this setting literally doesn't make sense in a github action context. In what situation is it useful? Do people open windows code editors in the git checkout of the action run?

ethomson commented 5 months ago

The only reason we ended up at this issue is because we had a nasty reproducibility bug that affected customers using windows, as our build artifacts had these random carriage returns due to this setting.

Then check in a gitattributes file so that that doesn't happen.

It's infuriating, because this setting literally doesn't make sense in a github action context. In what situation is it useful?

When you have 100% of developers using Windows using an actions workflow and they have not bothered to correctly configure gitattributes.

(Which is more likely? 100% of developers are writing code on Linux and are doing a build on Windows, or 100% of developers are writing code on Windows and then doing their build on Windows? It's the latter.)

There is no effective way to communicate what should be done to line endings except to use gitattributes.

core.autocrlf is an unmitigated disaster that should have never existed. git and Git for Windows give it different defaults and it's configured in the wrong place. gitattributes is the correct way to tell git how to behave in a cross platform environment.

exFalso commented 5 months ago

Agreed, the option's existence itself is bizarre.

But also, even if you have only windows developers, how does this setting help in a GitHub Action context? Those windows developers used the default installer setting during development, fine, but in what circumstance should the Action itself set it? The only possible scenario I can imagine is if the Action itself somehow modified files in the repository and pushed the changes.. am I missing something? Isn't the only reasonable behaviour for headless contexts to leave the characters as-is?

MSP-Greg commented 5 months ago

I wasn't clear when I opened this issue. I'm not referring to commits. For example, I've seen instances where CI was testing the construction of a string, and it was compared to the contents of a test file 'fixture'. This may lead to failures due to EOL settings.

I've got no problem with the current defaults, maybe a better input option would be force-eol, with options of crlf or lf?

ethomson commented 5 months ago

But also, even if you have only windows developers, how does this setting help in a GitHub Action context? Those windows developers used the default installer setting during development, fine, but in what circumstance should the Action itself set it? The only possible scenario I can imagine is if the Action itself somehow modified files in the repository and pushed the changes.. am I missing something? Isn't the only reasonable behaviour for headless contexts to leave the characters as-is?

core.autocrlf doesn't just control the data going into a repository (ie, adding to the index / committing), it also controls the data coming out. Assume that you have some developer who has installed Git for Windows, been (understandably) baffled by the multiple pages of dense configuration options you're prompted to deal with, and just (again, understandably) accepts the defaults. Now they have core.autocrlf = true set. Meaning that when they run git add, git will turn the CRLFs in that file into LFs.

Now, assume that you're a developer (or a CI system) that has core.autocrlf = false set (whether implicitly or explicitly). When you check that repository out, you'll get... LFs on disk. And, of course, for most applications this won't matter. In 2024, even notepad.exe can read files with CRLF. But cmd scripts still need to be \r\n, and I'm sure there are others. (Because meaningfully, the files aren't the same as what the original developer checked in.)

(Don't get me started about files with mixed line endings, that's just a whole separate problem.)

But anyway, core.autocrlf has to match the setting for every developer in a repository — whether they're creating or merely consuming content from it. This would be bad enough, except that this is usually set as a user-wide setting, which means that now every repository that a user uses also has to have the same settings.

(There's nothing good about this.)

The way around this is to put a gitattributes file in every repository you deal with — with the possible exception, I guess, maybe, of ones that will never, ever, never, absolutely never, you're really sure, be used by a Windows user, and even then, this problem is gross and thorny and terrible enough that I would do it anyway, just to make sure.

exFalso commented 5 months ago

cmd scripts. I have not considered that. My point was specifically that in a CI context there is no notepad.exe or vscode or similar being executed, but cmd scripts.. fair enough.

Re gitattributes: yes of course this works, but nobody's going to set this before encountering a related bug first.. I didn't even know about the existence of this option yesterday. There's just no good solution. Peace be upon all devs who encounter this mess.

Thank you for your patience

kriegaex commented 5 months ago

I am baffled, watching you guys philosophise here after the 4+ years the issue has been open. I just checked, I commenterd myself in 2021 already: https://github.com/actions/checkout/issues/135#issuecomment-889845025.

Just count the sheer number of related issues linking to or mentioning this one. By insisting that all projects world-wide should use .gitattributes instead of using a simple config option for this action, the problem does not go away. Reality is imperfect. Are you really expecting that millions of projects which did fine without .gitattributes so far will take the trouble to add such configuration just to avoid problems with a GitHub action? Sorry, that seems a wee bit unrealistic wishful thinking.

Look at the effort you expended debating this. Would it not have been better invested in just fulfilling this users' wish? I would think that the implementation would not be rocket science, if not trivial.

ethomson commented 5 months ago

It's not philosophical, it's practical.

Just count the sheer number of related issues linking to or mentioning this one. By insisting that all projects world-wide should use .gitattributes instead of using a simple config option for this action, the problem does not go away.

Every project world-wide that is touched by Windows users should use a .gitattributes file, period. That's how you define how line endings work on Git for Windows. Not just for this action, for all the Windows machines.

And yes - the problem does go away if you use .gitattributes. You could argue that making people configure .gitattributes is impractical (and that's a reasonable argument) but it does solve the problem at hand of distributing line-ending configuration about a repository in-band.

Reality is imperfect.

I agree — reality is imperfect. And git is imperfect. Git has saddled you with their historical misunderstanding of Windows, and how line endings should work in a version control system. They've moved the goalposts to requiring you to have a .gitattributes file, and then not actually told you this, sticking their fingers in their ears and pretending that there's no problem. There is a problem.

But — in reality, the solution to this problem is .gitattributes. This isn't philosophical. It's practical.

Are you really expecting that millions of projects which did fine without .gitattributes so far will take the trouble to add such configuration just to avoid problems with a GitHub action? Sorry, that seems a wee bit unrealistic wishful thinking.

They didn't do fine — that's why this issue is open. As you mentioned upthread:

For example, I use "check out as-is, commit Unix-style line endings" whenever I install Git for Windows

Okay, but how is this action supposed to know that? How is your colleague who clones this repo supposed to know that? What happens when you use a different git installation that looks in a different place for the configuration (Portable Git). What happens when you're accidentally in an Administrator terminal?

Yes, the answer to "how is this actions supposed to know" could be "we'll configure it in the YAML". Or it could be "we'll configure it in .gitattributes for this action, and every other thing that touches this repository".

Look at the effort you expended debating this. Would it not have been better invested in just fulfilling this users' wish? I would think that the implementation would not be rocket science, if not trivial.

I'm not sure who you're talking to — me? I used to work at GItHub, but I don't anymore. Now I'm just a helpful bystander who works on cross-platform git tooling and wants you to be successful despite git's obvious shortcomings in this area.

If you want to have an option for this in the action, it's obvious that GitHub isn't going to do anything here on their own, seeing as this issue is 4.5 years old. Someone will need to take it into their own hands to send a PR.

But my suggestion here is: don't. By making a change to this action, you're just putting your finger in the levee. You're going to keep running to every place that checks out a file, whether that's the other people that you write code with, or the build and deploy tools that interact with git. And you're going to keep telling them "SET core.autocrlf THIS WAY! NOTHING WORKS UNLESS YOU SET core.autocrlf JUST LIKE I DO!"

Or you can just check in a .gitattributes file and get on with your life, the way git wants you to (even if it has done a poor job of telling you that).

MichaelChirico commented 2 months ago

I'm totally lost here.

We have this in .gitattributes:

* -text

We also run

git config --global core.autocrlf false

Inside the GHA. But actions/checkout is still apparently converting the line endings for files.

What can I do to stop this behavior??

(context: https://github.com/Rdatatable/data.table/pull/6379/files)

ethomson commented 2 months ago

Inside the GHA. But actions/checkout is still apparently converting the line endings for files.

Hey @MichaelChirico, I'm having a little trouble understanding the problem; there are a lot of actions workflow runs right now so I don't see what the line ending problem is yet. (Feel free to @ me in the PR, it might make more sense to chat through this there.)

MichaelChirico commented 2 months ago

Thanks for reaching out @ethomson, after reaching my wit's end on that problem I took a step back and changed tack.

I was trying to set up a GHA to enforce \n line endings on files in the repo, but of course, just doing that directly through .gitattributes is the much better way to go about solving this problem. I still have no idea why actions/checkout was apparently ignoring my .gitattributes and core.autocrlf settings, but that's a problem for another day :)

(feel free to hide/remove these comments as a distraction from the thread)