ffmpeginteropx / FFmpegInteropX

FFmpeg decoding library for Windows 10 UWP and WinUI 3 Apps
Apache License 2.0
205 stars 52 forks source link

Enforce respecting EditorConfig's EOLs #430

Closed JunielKatarn closed 2 weeks ago

JunielKatarn commented 4 weeks ago

Introduction

As discussed on a previous PR, proposing the "what you edit is what you get" approach so source files' line endings are identical to committed files' line endings.

Motivation

All modern code editors, including Visual Studio, support EditorConfig. This allows setting custom preferences for developer-edited sources, including EOL. At the same time, tools like NuGet, DotNet and Visual Studio itself enforce specific conventions for some file types (i.e. MSBuild project files, NuGet specs).

Forcing developers to use Git's AUTOCRLF dissociates the actual content in the repository from what the editor/IDE sees. There is no benefit or need for this with contemporary software. In other words, it is better to directly work on the "source of truth".

Description

softworkz commented 4 weeks ago

I really hate being so rejective, I know very well how annoying this can be (making contributions to ffmpeg is extremely unpleasant), but the above is my experience. I've gone through it three times and I know what is going to happen.

brabebhin commented 4 weeks ago

If my understanding is correct, we will still have CRLF files around, right? Just the majority of the files will be LF?

Personally I'm not too concerned about diffing, good diff tools can be configured to ignore line endings and white spaces.

softworkz commented 4 weeks ago

Personally I'm not too concerned about diffing, good diff tools can be configured to ignore line endings and white spaces.

It's not just about diff tools. When merging or cherry-picking, Git will treat all lines as changed when merging something between "pre-eol-change" and "post-eol-change".

softworkz commented 4 weeks ago

There's even more to it. Unfortunately there's a lack of understanding of how Git is working in this regard.

All the file changes in this PR (which are supposed to change the line endings) are completely pointless!

That's because autocrlf has always been on at all of the contributors to this project. And that means that text files with CRLF are stored with LF line endings.

All the files that this PR is supposed to be changing ARE ALREADY STORED WITH LF IN THE REPO!

That's how autocrlf is working. If you want to store files with CRLF, you have to set autocrlf to false.

Proof:

  1. Read the docs
  2. Check out this entire PR (leaving autocrlf=on) - the files are still having CRLF (due to auto-conversion)
  3. Set autocrlf=false, do a fresh checkout of this repo (not the pr) - the files are having LF (that's how they are stored)
brabebhin commented 4 weeks ago

It was also my suspicion that the line ending change would be pointless, however, i think @JunielKatarn does have a problem he is trying to fix, and I'm trying to understand where this is all coming from.

JunielKatarn commented 3 weeks ago

I really hate being so rejective, I know very well how annoying this can be (making contributions to ffmpeg is extremely unpleasant), but the above is my experience. I've gone through it three times and I know what is going to happen.

Let me start with this. I am aware this is a controversial PR. Of course, nobody enjoys hearing arguments against what they post, but by no means am I taking this personally nor feel attacked. I appreciate the time taken into reviewing. If you disagree 100 times with my PRs, I'm completely fine with addressing it 100 times. That said, I expect for future PRs to be way less controversial. This should be the "worst" of them.

  • This will break all Git merging and cherry-picking across the boundary of this commit, i.e. between branches before and after or local branches. Instead of seeing the merged/committed changes, you will see diffs where all lines in a file have changed

This can indeed cause merge conflicts from other branches that have not been updated to this eventual baseline. This would be a one-off inconvenience, when it happened, if it happened. This all depends on how many non-rebasable merges or cherry-picks would happen to the files that this PR updates.

I would argue benefit exceeds the cost.

  • All users which are using/checking out FFmpegInteropX now and in the future have autocrlf set to on - that's the default on Windows for very good reasons. If somebody has made changes and realizes later that he should have set autocrlf to off, then he's totally lost, because that setting needs to be made before checking out

I need to controvert this. There have been no very good reasons to default to AUTOCRLF on for many years. Even Notepad supports LF endings. As long as we keep the committed content consistent from now on, no developer will be "screwed": Turning the setting off and using an EditorConfig-compatible IDE (meaning, any modern IDE or text editor) will prevent the issue.

  • ... Line endings on Wiindows are CRLF and especially those Visual Studio files are always having CRLF endings. It makes no sense to deviate from that. There's no benefit but a lot of trouble in the aftermath

That is actually something this PR addresses. Using AUTOCRLF keeps the committed content for some VS-specific (not Windows) files as LF. It is more consistent to commit LF files as LF and CRLF as CRLF. Windows itself does not really care about line endings, but only some apps traditionally.

  • I can guarantee you - that even if it would get merged, it will be reverted shortly after - as soon as the consequences will become clear

The existence of this PR is essentially my statement of the opposite. In some contexts, a change that causes future merge conflicts would make development unfeasible. Based on the size of this project, its development community and the activity on the repository, my take is that such inconvenience would be very isolated and infrequent.

JunielKatarn commented 3 weeks ago

It was also my suspicion that the line ending change would be pointless, however, i think @JunielKatarn does have a problem he is trying to fix, and I'm trying to understand where this is all coming from.

I am essentially proposing to have consistent work directory and committed repository content. The AUTOCRLF usage is based on outdated and false assumptions such as:

While there can be merge conflicts if and only if there were subsequent pull requests, I am suggesting the consistency/correctness benefit outweighs the one-off inconvenience when updating to/from/ older branches on a small set of files.

Note, the inconvenience above is already bound to happen because somehow one of the PowerShell scripts is actually checked-in as CRLF. When it gets edited, it will be converted to LF and the corresponding merge will look like the whole file is being modified.

If this change is merged, subsequent PRs based on this checkpoint will not face line-ending compatibility issues as the committed content will stay consistent.

softworkz commented 3 weeks ago

Of course, nobody enjoys hearing arguments against what they post, but by no means am I taking this personally nor feel attacked. I appreciate the time taken into reviewing. If you disagree 100 times with my PRs, I'm completely fine with addressing it 100 times. That said, I expect for future PRs to be way less controversial. This should be the "worst" of them.

Sounds good, I just wanted to clarify that contributions are very welcome, including from my side 😄

This can indeed cause merge conflicts from other branches that have not been updated to this eventual baseline. This would be a one-off inconvenience, when it happened, if it happened. This all depends on how many non-rebasable merges or cherry-picks would happen to the files that this PR updates.

It would affect myself for example and prevent me from submitting PRs. I branched off some time ago and there are some larger differences I have (different nuget packages, different build configurations and a number of other things). Yet - so far, this didn't prevent me from submitting PRs for imrpoving subtitle handling for example. With this PR, I wouldn't be able to do so in the future. It also wouldn't work if I'd apply the same like this PR does in my repo, because it wouldn't be at the same baseline. (and I cannot rebase because we're using a custom ffmpeg 5.1 version, so I cannot take all changes)

  • All users which are using/checking out FFmpegInteropX now and in the future have autocrlf set to on - that's the default on Windows for very good reasons. If somebody has made changes and realizes later that he should have set autocrlf to off, then he's totally lost, because that setting needs to be made before checking out

I need to controvert this. There have been no very good reasons to default to AUTOCRLF on for many years.

First of all, this is not something you can argue about. autocrlf=on is the Git default on Windows. It's what 99% of developers have as the default on Windows. Secondly, there are very good reasons to use autocrlf. For example, when it's off, all files that have CRLF will be checked in with CRLF endings. This can cause trouble when you check out those files on Linux or Mac.

Turning the setting off and using an EditorConfig-compatible IDE (meaning, any modern IDE or text editor) will prevent the issue.

No it won't. All templates in Visual Studio are using CRLF, same goes for auto-generated files - an .editorconfig setting doesn't have an effect on those.

  • I can guarantee you - that even if it would get merged, it will be reverted shortly after - as soon as the consequences will become clear

The existence of this PR is essentially my statement of the opposite.

I understand that you believe it would work out well.

In some contexts, a change that causes future merge conflicts would make development unfeasible.

Great! Then it burns down to that I'm saying that this is such a context 😄

softworkz commented 3 weeks ago

Note, the inconvenience above is already bound to happen because somehow one of the PowerShell scripts is actually checked-in as CRLF. When it gets edited, it will be converted to LF and the corresponding merge will look like the whole file is being modified.

When the file is checked in with CRLF and autocrlf=false, then it won't be converted to LF (neither when it's on).

softworkz commented 3 weeks ago

Oh and what's the practical benefit of all this?

The only thing you said is that it would be "better" if the line endings inside Git would be same as when checked out - but really: who cares about this? I have created and worked with hundreds of Git repositories over the years - mostly from Windows with autocrlf=on - and never encountered any problem with it. And that's what 99% of all developers are doing on Windows. I don't see any good reason to deviate from that for FFmpegInteropX.

Also, you surely know that people are rarely reading instructions. Especially NOT BEFORE cloning and checking out. But then it's too late for setting autocrlf to false (you can't have this setting included in the repo). When they do it afterwards, it ends up in a mess, because then they have all files already with CRLF and when autocrlf is off, they will check in changes with CRLF, so that the files inside Git actually have CRLF (which they don't have right now).

So, we would need complicated instructions, telling them how to revert the checkout, change the setting and redo the checkout. But what if they've already made changes at that point when they're reading this?

softworkz commented 3 weeks ago

And it's not just about new developers who are trying out FFmpegInteropX. It's also about all existing developers, because:

Let's assume that this PR would get merged into the winui-build branch (on GitHub).

Now, locally, we would have to

To fix this, we would need to delete all repo files in the file system (how to do that without deleting any local-only files...?) - and then discard the deletion. Only then, all files will be LF. And when we want to switch to a different local branch (without this PR), then we would first change autocrlf to on and then switch the branch and then - once again delete all repo files in order to get them as having CRLF.

And that would need to be repeated each time you switch from a branch with and without this PR.

But that's not all: Sometimes, the change to autocrlf will cause Git to consider files as changed (line endings), so you will also need to discard those detected changes in order for being able to switch the branch.

Have you even ever done a migration from an existing multi-user(!), multi-branch, GitHub-hosted, non-trivial Git repo from autocrlf ON to OFF actually?

brabebhin commented 3 weeks ago

Tbh in my experience so far on Windows development, it is people who have autocrlf off that cause the most line ending problems, and we (not ffmpeginteropx but my teams) have had a fair share of these over the years.

I personally don't see the urgency in having this PR merged. I don't mind either way. i nuke my working directory every now and then, and rarely have I needed to do cherry picks. The final word stays with @lukasf (pls don't hate me).

Some historical background on this:we forked off Microsoft ffmpeginterop project. We inherited most of the settings from them. I don't think there was any discussion on line endings, in fact I'm pretty sure we never really thought too much about this.

The fact that we are now a mature project, with live users as both forks and packages makes it harder to deliver breaking changes and we need to really consider cost-benefit for those. I don't see much benefits on changing right now. Again, I'm open to having my mind changed. I personally never had a line ending problem in this project.

lukasf commented 3 weeks ago

I also fail to see the problem you are trying to solve @JunielKatarn. It does not matter which line ending GIT uses to store its files internally, as long as I have zero problems with it and anyone else has zero problems with. You would also have zero problems, if you'd just set autocrlf like anyone else (either for this repo or better set it globally for all repos).

This PR creates work for us and it will create problems for anyone who clones the repo and has autocrlf enabled - which means, it creates problems for 99% of new users of the repo. It forces these 99% of users to switch git to a non-default setting, to allow it to work properly with our repo (and then in turn their git will cause problems with 99% of other Windows repos). It just does not make sense. You are trying to solve a problem that does not exist, and you cause a lot of new problems on the way.

In my last job, we also tried to switch away from non-autocrlf, and it was a lot of trouble, and cost quite some nerves to get it right. Setting editorconfig is not sufficient. If you want it to work properly, you also need to maintain a .gitattributes file, to override the clrf setting a user might have and force git to store certain files with a certain line ending. But then new problems come, because git might unintentionally change line endings in binary files (and you will only notice later), etc etc. I've been down that road, and I can tell you it's benn a bumpy, unpleasant one. And as I said, it does not solve a single problem. Back then, I was one of those who wanted to do the change. It sounds reasonable. But I changed my mind about it. It's totally not worth the trouble when things work so fine, without any work, just using out-of-the-box settings.

If I was to start a new repo, I could consider setting it up from the beginning with proper .editorconfig and .gitattrbutes (I sure wouldn't do it just for myself, but it's not so completely off, someone might convince me if they insist). But with an existing repo, with multiple users and quite a bunch of forks, it just does not make sense to me. There is no point in it. No problem to solve.

JunielKatarn commented 2 weeks ago

I also fail to see the problem you are trying to solve...

In a gist, what I was trying to do is to align checked-out content with checked-in content. The argument about "Windows being CRLF and Unix being LF" is false and outdated. Its core premise was always a lie, but that topic is not central to this project. I also have been able to leave out AUTCRLF by just using EditorConfig, but I understand the appeal to also enforce settings in .gitattributes.

I thought of it as a source management enhancement rather than solving a problem, but it can be a matter of perspective.

If you think the sporadic potential merge conflicts that could arise while other branches/forks catch up outweigh the potential benefits, I understand an can forget about this change.

My following PRs will be both more meaningful and less disruptive.

softworkz commented 2 weeks ago

The argument about "Windows being CRLF and Unix being LF" is false and outdated. Its core premise was always a lie,

You still don't see the picture clearly. This is not about Windows vs. Unix. This is simply about Visual Studio and that's the tool we are working with in this repo.

I also have been able to leave out AUTCRLF by just using EditorConfig,

.editorconfig doesn't really help here in the way you are thinking.

Just try these steps:

The result will be:

This always ends up in a mess - no matter what you do.