The changes in #244 caused a slightly annoying regression for Unix-based users (Linux, macOS), especially when checking out the code and switching branches. Certain files always show ghost changes (due to the difference in line breaks: LF on Linux/macOS and CRLF on Windows). This blocks branch switching (git reports unstaged changes), makes stash appear not to work (the changes reappear immediately), and causes problems when reverting, as mentioned in the Telegram group.
The cause appears to be that incompatible line endings are being forced: git tries to "resolve" the mismatch, but the result is counter-intuitive and produces the issues described above.
As such, it is better to let git handle the normalisation of line endings and apply the default line ending for text files (which is LF) by specifying that in .gitattributes. This ensures that all code checked in to the repository is formatted uniformly with the same line breaks (i.e. LF).
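As a rough sketch of the kind of rule this refers to, a repository-wide .gitattributes entry that leaves text detection and normalisation to git could look like the following. It illustrates git's documented `text=auto` attribute and is not necessarily the exact contents of the file in this PR.

```gitattributes
# Let git detect text files and normalise their line endings to LF when they are committed
* text=auto
```

With a rule like this, the line endings stored in the repository do not depend on each contributor's local git configuration.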
Reasons for choosing this strategy instead of other solutions
Quoting from the blog post in [2]:
Git can actually be configured to automatically handle line endings using a setting called autocrlf. This automatically changes the line endings in files depending on the operating system. However, you shouldn't rely on people having correctly configured Git installations. If someone with an incorrect configuration checked in a file, it would not be easily visible in a pull request and you'd end up with a repository with inconsistent line endings.
The solution to this is to add a .gitattributes file at the root of your repository and set the line endings to be automatically normalised like so:
As previously mentioned by Cloud7050 (https://github.com/source-academy/modules/issues/209#issuecomment-1511643124), we should have something easy to implement (i.e. no big codebase changes/commits, and easy for new users to set up). Setting the text attribute in .gitattributes automatically normalises the line endings, even if a user does not have core.autocrlf set up yet (refer to the quote below from the StackOverflow post [3]):
meaning that all files (except specializations) that git auto-detects as non-binary (text), and which have LF in the git database[see note 1.], will get CRLF whenever:
• core.autocrlf is true, or
• core.eol is crlf, or
• core.eol is native (default) and you're on a Windows platform.
In all other cases, you get LF.
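The rule quoted above can be written out as a small Python sketch. The function name `checkout_line_ending` and its parameters are hypothetical, introduced only for illustration. It models the quoted rule as stated, for text files stored with LF in the repository, and is not git's actual implementation, which handles more cases.

```python
def checkout_line_ending(autocrlf: str = "false",
                         eol: str = "native",
                         platform: str = "linux") -> str:
    """Return the line ending a normalised text file gets in the working tree.

    Hypothetical helper mirroring the quoted rule only. The parameters stand in
    for git's core.autocrlf and core.eol settings and the host operating system.
    """
    if autocrlf == "true":
        return "CRLF"
    if eol == "crlf":
        return "CRLF"
    if eol == "native" and platform == "windows":
        return "CRLF"
    # "In all other cases, you get LF."
    return "LF"


if __name__ == "__main__":
    # Default settings on each platform
    for name in ("linux", "macos", "windows"):
        print(name, checkout_line_ending(platform=name))
```

Under this rule, default settings give LF on Linux and macOS and CRLF on Windows in the working tree, while the content stored in the repository stays LF.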
Another comment, by shenyih0ng (https://github.com/source-academy/modules/pull/210#issuecomment-1549708443), was that we should use an .editorconfig file to enforce the line endings. However, the article mentions this only as a bonus step, and even if it were implemented, the preferred line ending would be LF, which differs from our current configuration of CRLF. As quoted from the blog post [1]:
However, as we just saw, you may still see CRLF line endings on Windows locally because .gitattributes doesn’t tell Git to change the working copies of your files.
Again, this doesn’t mean that Git’s normalization process isn’t working; it’s just the expected behavior. However, this can get annoying if you’re also linting your code with ESLint and Prettier, in which case they’ll constantly throw errors and tell you to delete those extra CRs:
Since we will not be applying eslint rules locally as part of the formatting rules, and we do not really want to enforce line endings in the working directory (unless there is a reason to; the code checked in to the remote will be LF either way), there does not seem to be much reason to add it. Leaving it out keeps the highest compatibility with the current setup, irrespective of the operating system in use.
Type of change
[x] Bug fix (non-breaking change which fixes an issue)
[ ] New feature (non-breaking change which adds functionality)
[ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
[ ] This change requires a documentation update
The first commit restores .gitattributes to auto-detection mode without enforcing a specific line ending (it defaults to LF), whereas the second commit reverts all the files committed to the repo since #244 was merged back to LF line endings.
Merging this PR will close #209 as well.
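For illustration, the effect of the second commit on a single file's contents can be sketched in Python as follows. The helper `normalise_to_lf` is hypothetical: it only shows the byte-level change and makes no claim about how the commit itself was produced. The NUL-byte check is a simplifying assumption for skipping binary files, not git's exact text detection.

```python
def normalise_to_lf(data: bytes) -> bytes:
    """Convert CRLF line endings to LF, leaving likely-binary content untouched.

    Hypothetical helper for illustration only.
    """
    if b"\x00" in data:
        # Assumption: a NUL byte means the content is binary, so do not modify it
        return data
    return data.replace(b"\r\n", b"\n")


if __name__ == "__main__":
    sample = b"const a = 1;\r\nconst b = 2;\r\n"
    print(normalise_to_lf(sample))
```

Content that already uses LF passes through unchanged, so applying the conversion repeatedly is harmless.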
Checking Line Endings as a GitHub Action Task
After some consideration of how .gitattributes behaves, it appears that setting this default project-wide should ensure uniform line endings across the repo, even for new users/contributors who have not explicitly set up their git configuration to deal with line endings (it ensures that all code pushed to the remote will be LF).
Running the linting (via eslint) to check the line breaks also requires additional compute time. If there are line break issues in even a handful of files, the Node runtime can easily run out of memory, as eslint only prints the errors after all the files have been parsed. This could obscure the real issue (line breaks) behind memory errors and failing tests.
As such, I have decided against including a check in the GitHub Actions runner for new pull requests.
Related Issues/PRs & References
#210
#209
Some reference articles were used to evaluate the best solution moving forward, and they are definitely worth a read.