florianschanda / miss_hit

MATLAB Independent, Small & Safe, High Integrity Tools - code formatter and more
GNU General Public License v3.0
158 stars 21 forks source link

Set specific newline character for autofix #291

Open paco-sevilla opened 3 weeks ago

paco-sevilla commented 3 weeks ago

What kind of feature is this?

Your MATLAB/Octave environment

MISS_HIT component affected

Describe the solution you'd like It would be great if the user could set a specific newline character in the configuration file, which would be used when calling mh_style --fix. The options could be lf, crlf and platform_dependent, which would set newline= to "\n", "\r\n" or None respectively in the open function here: https://github.com/florianschanda/miss_hit/blob/f01fbf9cd43c52fdb8fe6dce162c9772637ec199/miss_hit_core/work_package.py#L114 This would be of great help to get reproducible results for teams developing in multiple different platforms.

florianschanda commented 2 weeks ago

It is my understanding that Python always does the platform_dependent thing, right?

The correct solution is to put into your .gitattributes file that your matlab sources should be also platform dependent (i.e. text=auto).

paco-sevilla commented 2 weeks ago

Hi @florianschanda, thanks for your reply.

It is my understanding that Python always does the platform_dependent thing, right?

Correct. As you see in the Python docs, when open is used with newline=None (the default) then os.linesep is used, which is platform dependent.

The correct solution is to put into your .gitattributes file that your matlab sources should be also platform dependent (i.e. text=auto).

I thought about this before creating this issue, but ultimately I disagree.

My expectation of a formatter is that, if a file has been formatted, rerunning the formatter on that file doesn't change its contents at all, regardless of the git configuration.

This means that the formatter should either:

  1. keep the line-endings that the file had before OR
  2. always normalize line-endings to CRLF or LF

For Python, both black and yapf keep the line-endings. For C/C++, for example, clang-format provides options to normalize line-endings: Clang-Format Stype Options - LineEnding

paco-sevilla commented 2 weeks ago

BTW, I have text=auto in my .gitattributes. However, I also have eol=lf as described here: VSCode -Resolving Git line ending issues in containers (resulting in many modified files).

I would need to add special handling for .m files, which feels to me like a workaround.

florianschanda commented 2 weeks ago

My expectation of a formatter is that, if a file has been formatted, rerunning the formatter on that file doesn't change its contents at all, regardless of the git configuration.

I am 99% with you here, except for the line endings.

If you're on Linux, you should have things set up so that when you check out your repo with m files you get LF files. Running MISS_HIT there modifies files to get you LF files.

Of you're on Windows, you should have things set up so that when you check out your repo with m files you get CRLF files. Running MISS_HIT there modifies files to get you CRLF files.

My understanding is that the git config core.autocrlf is doing that, and should be the correct solution to your problem.

https://docs.github.com/en/get-started/getting-started-with-git/configuring-git-to-handle-line-endings#global-settings-for-line-endings

florianschanda commented 2 weeks ago

Further, I think the advice to set eol=LF is bad advice. It will basically ensure that you have other surprises like this, in other tools. I see two use cases for it:

paco-sevilla commented 2 weeks ago

Further, I think the advice to set eol=LF is bad advice. It will basically ensure that you have other surprises like this, in other tools.

~That's not the case, but exactly the opposite. This makes sure that Git always stores files with LF line endings, regardless of the platform. This is the option that actually ensures that Git only shows relevant changes, even when used on different platforms. Otherwise, user A would commit a file with LF and then user B could make a new commit with CRLF and every line would show changes. With eol=lf, line endings get normalized, so when user B does a git add, the file only shows relevant changes. The autocrlf config can be used to checkout files with CRLF on Windows. See the Git docs: End-of-line conversion.~ EDIT: what I wrote here before was not correct.

When using WSL (and Docker) on Windows, it's recommended to use LF line endings, in order ensure compatibility with Linux. That's why I have eol=lf and core.autocrlf set to false.

I am 99% with you here, except for the line endings.

I don't quite understand why we don't agree 100%.

As I already mentioned, formatters for other languages allow the user to decide which line endings to use. Is there a reason why .m files should always have platform-dependent line endings? I can't think of any reason why line endings would cause any issue for Matlab or Octave, see this for example.

The point of this simple feature request is to give the user the opportunity to decide which line endings to use, leaving Git completely aside.

I think that using the same line endings as before the --autofix would be the simplest solution (it wouldn't even need a new config setting) and that IMO would give intuitive results. However the behavior would not be backwards compatible without a config setting.

BTW, I can create a PR for the feature once we agree on how to proceed 🙂

florianschanda commented 2 weeks ago

I believe you're mistaken in one part: if you set eol=LF, and use autocrlf, you still get LF line endings.

I am still opposed to this, because people should just use git correctly, but I am not so opposed to not do what my users want. :) I'll add a config flag for it, with the default behaviour being what it is right now.

paco-sevilla commented 2 weeks ago

Sorry about that, you're right, I was mistaken. I misread the Git documentation. I've now edited my comment above.

I'll add a config flag for it, with the default behaviour being what it is right now.

That would be awesome! Thank you