go-gitea / gitea

Git with a cup of tea! Painless self-hosted all-in-one software development service, including Git hosting, code review, team collaboration, package registry and CI/CD
https://gitea.com
MIT License

Integrated editor converts line endings to LF #9108

Open LukeOwlclaw opened 5 years ago

LukeOwlclaw commented 5 years ago

Description

We are using * -text in .gitattributes to avoid any changes when checking in or out from git. This works very well and we do not intend to change this behavior.

We thus have many files with Windows line endings committed. When we now use the integrated Gitea editor, all line endings are replaced with LF.

See this commit: https://try.gitea.io/Luke/Repo2/commit/a3ef603d463101239513288eb026ed55869e2490

Expected behavior

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs during the next 2 weeks. Thank you for your contributions.

emngaiden commented 3 years ago

I'm having the same problem. I implemented a statistics screen on my local server and this problem keeps making my data unreliable.

emngaiden commented 3 years ago

After 1 minute of digging, I found this part of the code in "routers\repo\editor.go":

    ...
    if _, err := repofiles.CreateOrUpdateRepoFile(ctx.Repo.Repository, ctx.User, &repofiles.UpdateRepoFileOptions{
        LastCommitID: form.LastCommit,
        OldBranch:    ctx.Repo.BranchName,
        NewBranch:    branchName,
        FromTreePath: ctx.Repo.TreePath,
        TreePath:     form.TreePath,
        Message:      message,
        Content:      strings.ReplaceAll(form.Content, "\r", ""),
        IsNewFile:    isNewFile,
    }); err != nil {
    ...

If I replace that strings.ReplaceAll(form.Content, "\r", "") with just form.Content, the line endings don't change. However, I don't know for what purpose that line was added, so proceed with caution.
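To make the effect of that expression concrete, here is a small standalone sketch. It is not Gitea code; the helper name stripCR is made up for illustration and only wraps the strings.ReplaceAll call quoted above.

```go
package main

import (
	"fmt"
	"strings"
)

// stripCR is a hypothetical helper that mirrors the quoted expression:
// it deletes every carriage return, so CRLF becomes LF and a bare CR
// disappears entirely.
func stripCR(content string) string {
	return strings.ReplaceAll(content, "\r", "")
}

func main() {
	crlf := "first\r\nsecond\r\n"
	fmt.Printf("%q\n", stripCR(crlf)) // prints "first\nsecond\n"
}
```

So any file saved through a path that applies this call ends up with LF endings, regardless of what was stored before.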

zeripath commented 3 years ago

You are not supposed to have files in git repositories with non Unix file endings. So with this code we are essentially enforcing the git attribute:

* text=auto

Presumably you have set core.autocrlf=false on your local machines?

LukeOwlclaw commented 3 years ago

You are not supposed to have files in git repositories with non Unix file endings.

Where does this requirement come from? IMHO, a version control system should process files as they are, no matter what kind of line endings or encoding is used. And this is actually what Git and Gitea do just fine. It seems that it is just the editor which rewrites the line endings.

Are there any unwanted side effects by the change that @emngaiden proposed? Can the change be applied?

lunny commented 3 years ago

The web editor should follow the line-ending settings in .gitattributes when inserting a new line.

zeripath commented 3 years ago

@lunny:

The web editor should follow the line-ending settings in .gitattributes when inserting a new line.

That would be a potentially useful improvement; however, I suspect that this might actually be a problem due to the use of core.autocrlf=false, core.eol=native in the local git config without a .gitattributes.

@LukeOwlclaw:

Where does this requirement come from?

It's not a requirement but the recommendation comes from Git documentation itself. I'm not going to waste a lot of time looking for this but a text file is essentially defined as being LF ended. Lots of hacks have been added to handle Windows' particular proclivities on this matter, and similarly in SVN and CVS.

In Git's case, some of the complexities go back to the early days of GitForWindows and its decision to introduce core.autocrlf.

CRLF and LF conversion is extremely complex and every decision is wrong.

IMHO, a version control system should process files as they are, no matter what kind of line endings or encoding is used. And this is actually what Git and Gitea do just fine. It seems that it is just the editor which rewrites the line endings.

That's fine for binary files; however, if you want a file to be interpreted as text you really need to think carefully about how you store encodings and line endings. Git made a decision that it much prefers its text files to be UTF-8 with LF endings, and a lot of git code makes that presumption. In reality its handling of file encodings is pretty crap - try to do diffs etc. with a UTF-16 encoded file.

SVN had a much more complicated system which was better in regards to this kind of thing but it was actually equally infuriating to use. Git's mechanisms are a lot more slapdash and poorly handled. Gitea doesn't even attempt to handle them as they're so rarely used that it's not even worth attempting to do it, we just attempt to detect the encodings ourselves.

Are there any unwanted side effects by the change that @emngaiden proposed? Can the change be applied?

Yes, there are multiple side effects due to this proposed change. There is no simple solution regarding file encodings or line endings - except that everyone should use LF endings and UTF-8 encoding.


Anyway, I was trying to explain why we did the line-ending normalization in the first place.

It would be helpful to note that these CRLF issues are likely responsible for the weird conflict issues that some people suffer on merge. We have to do a line-ending normalization because editing in a browser on Windows will change the line endings to \r\n and editing on Linux the other way round.

I'm not suggesting that we can't change this, but the question is going to be how to determine what the line ending should be and whether anyone is going to be interested in doing it.

It would be useful to know what the values of the config settings are locally and on the server:

and if you have a .gitattributes.

LukeOwlclaw commented 3 years ago

The .gitattributes in the repository I created for this issue contains:

* -text

There are no local settings, as only Gitea's editor is being used.

A compromise might be to change editor.go so that it detects the type of line endings. Then it can enforce the same type of line endings for the whole file that was sent by the browser. Should be fairly easy to implement, shouldn't it?
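A rough sketch of what such detection could look like, in standalone Go. None of this is existing Gitea code: the function names (detectLineEnding, applyLineEnding, preserveLineEndings) are hypothetical, and it assumes the previously committed content is available to detect from, since the browser may already have rewritten the endings of the submitted text.

```go
package main

import (
	"fmt"
	"strings"
)

// detectLineEnding returns "\r\n" when CRLF endings outnumber bare LF
// endings in content, and "\n" otherwise (including ties and empty input).
func detectLineEnding(content string) string {
	crlf := strings.Count(content, "\r\n")
	lf := strings.Count(content, "\n") - crlf
	if crlf > lf {
		return "\r\n"
	}
	return "\n"
}

// applyLineEnding rewrites every CRLF or LF ending in content to eol.
// A bare CR that is not followed by LF is left untouched.
func applyLineEnding(content, eol string) string {
	normalized := strings.ReplaceAll(content, "\r\n", "\n")
	if eol == "\n" {
		return normalized
	}
	return strings.ReplaceAll(normalized, "\n", eol)
}

// preserveLineEndings gives the submitted text the dominant line ending
// of the previously stored version of the file.
func preserveLineEndings(previous, submitted string) string {
	return applyLineEnding(submitted, detectLineEnding(previous))
}

func main() {
	previous := "a\r\nb\r\n" // stored with Windows endings
	submitted := "a\nb\nc\n" // as sent by the browser
	fmt.Printf("%q\n", preserveLineEndings(previous, submitted)) // prints "a\r\nb\r\nc\r\n"
}
```

Files with mixed endings would still be rewritten to a single style under this approach, and a new file has no previous version to detect from, so a default would still be needed there.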

Supplement: This is the commit which introduced enforcing LF line endings. It belongs to PR #3516 and issue #3513 which concern UTF8 and MySql.