mawww / kakoune

mawww's experiment for a better code editor
http://kakoune.org
The Unlicense

kak always writes files with a newline at the end #2147

Open shachaf opened 6 years ago

shachaf commented 6 years ago
$ printf test > file
$ xxd file
00000000: 7465 7374                                test
$ kak -e 'wq' file
$ xxd file
00000000: 7465 7374 0a                             test.

Opening a file and writing immediately shouldn't modify the file.

Vim keeps track of whether a file ended with a newline on opening with an "endofline" flag. Kakoune could presumably do the same thing.
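
As a rough illustration of the flag-based approach (a sketch only; `Buffer`, `load` and `save` are hypothetical names, not Vim or Kakoune APIs), the buffer can always hold newline-terminated text while a boolean recorded at load time decides whether the final newline is written back:

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    text: str         # in-memory content, always newline-terminated
    eol_at_eof: bool  # did the file on disk end with "\n" when it was opened?


def load(data: bytes) -> Buffer:
    """Decode file bytes and remember whether the final newline was present."""
    text = data.decode("utf-8")
    had_eol = text.endswith("\n")
    if not had_eol:
        text += "\n"  # the editing model always sees a terminated last line
    return Buffer(text, had_eol)


def save(buf: Buffer) -> bytes:
    """Encode the buffer, dropping the final newline if the file had none."""
    text = buf.text
    if not buf.eol_at_eof and text.endswith("\n"):
        text = text[:-1]
    return text.encode("utf-8")
```

With this model, opening and immediately writing a file returns the original bytes whether or not it ended with a newline.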

mawww commented 6 years ago

Kakoune is a code editor, not an arbitrary data editor. Adding an option to control that would be relatively simple, but I think it would go a bit against Kakoune's principles. Unix text files are expected to end with an end of line character, it's usually an error when they don't.

The motivating example is not strong enough to convince me we need to support that. Kakoune does not guarantee that opening a file and writing it is a no-op (should it touch the file modification date? Should it change the inode number, as suggested in another issue? I don't think Kakoune should make any promises about any of those).

That said, you could argue that BOM and end of line format support are already precedents in your requested direction, which is true, but I added them because I had to work on source files where those needed to be preserved. I am not sure we have real source code where the last line must not end with an EOL.

shachaf commented 6 years ago

There's plenty of source code that doesn't end with \n, though. I'm not sure which editors do that -- and they probably shouldn't -- but I run into it regularly. The files should probably be fixed but that's a separate issue.

More generally, even though this isn't primarily a code feature, I'd like to be confident that if I open a file and make no changes, or only make changes in a specific area, that the rest of the file doesn't get changed unexpectedly. Also note that writing empty files isn't supported right now (though looking at vim's handling for empty files, it seems like a special case anyway).

It's also a useful feature of vim that you can edit or examine binary data with :%!xxd and :%!xxd -r, though this is obviously not the primary use case.

This seems like a small feature -- and the only change necessary? -- to be able to open and modify an arbitrary file without causing unexpected changes.

tmccombs commented 4 years ago

fwiw, editorconfig does have an option, insert_final_newline, which enforces either always ending with a newline or never ending with one.
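
For reference, a minimal sketch of what honouring that setting at write time could look like. The function name is hypothetical, and two details are assumptions rather than anything the thread settles: how many trailing newlines get removed for `false`, and whether an empty file gains a newline for `true`.

```python
from typing import Optional


def apply_insert_final_newline(data: bytes, setting: Optional[bool]) -> bytes:
    """Adjust the end of the file according to insert_final_newline.

    setting is True, False, or None when the option is unset.
    """
    if setting is None:
        return data  # option unset: leave the bytes alone
    if setting:
        # Ensure the file ends with a newline (assumption: also for empty files).
        return data if data.endswith(b"\n") else data + b"\n"
    # Ensure the file does not end with a newline.
    # Assumption: strip every trailing CR/LF byte, not just one newline.
    return data.rstrip(b"\r\n")
```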

lenormf commented 4 years ago

Related #3669. Seems like once the choices have been made, we'll have to document them.

JJK96 commented 4 years ago

I often have changed files in git where the only change is a newline at the end as a result of this behaviour. I would prefer to change this behaviour as suggested by this issue.

lenormf commented 4 years ago

If the premise that Kakoune is a code editor and is used to edit text files is respected, then the behaviour is consistent with what is expected by the standard:

A text file, under unix, consists of a series of lines, each of which ends with a newline character (\n). A file that is not empty and does not end with a newline is therefore not a text file.

I don't know the reason why other editors wouldn't terminate all text files with a newline character, or why users wouldn't want that standard behaviour to be applied to source code, but if Git is showing \ No newline at end of file, then it's the contents of these files you need to fix.

shachaf commented 4 years ago

Kakoune can make whatever decision it wants on this feature, but these arguments don't seem very relevant. A text editor is a tool that I use to deal with (mostly) text files, but I need to deal with text files as they are, not as a standard wants them to be. They might not be Unix-related files (e.g. Kakoune deals with \r\n fine), so Unix standards don't get that much of a say here -- the question is how it should deal with files that occur in practice. In practice there are source code files that don't end in \n everywhere, so "if you edit these files you must modify the last byte" is a strong stance to be taking.

A quick search shows noeol source files in cpython, CoreCLR, gdb, gtk, LLVM, SDL, vscode, etc. repositories. There are also many language grammars that allow files without \n, whatever POSIX might say.

(I should also note that, if Kakoune keeps not supporting this, it just means using vim a bit more often for me. My preference is to be able to use one tool rather than two.)

lenormf commented 4 years ago

Your "the argument is not relevant" can be answered with the exact same dismissal.

If you don't want a trailing newline character in files edited with Kakoune, remove it before committing. Following whatever inconsistency other editors put on their users is not a relevant argument.

mawww commented 4 years ago

It is still unclear to me why you'd need to preserve the missing end-of-line of a file, and it looks like a can of worms to me.

Say we modify the last line while editing, is it then fair game to add the end-of-line character? Expected? How about if we add an empty line at the end of the buffer: shall we drop that last line because the preceding line did not contain an end-of-line at load time? I don't think there is any good, consistent answer to those questions, so I am inclined to let the core of Kakoune do the simple thing.

I also suspect this is not that hard to implement using hooks, something like (untested)

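hook global BufOpenFile .* %{ evaluate-commands %sh{
    # If the file on disk does not end with a newline, register a hook that
    # strips the newline Kakoune appends each time this buffer is written.
    if [ $(tail -c1 "$kak_hook_param" | wc -l) = "0" ]; then
        echo 'hook buffer BufWritePost .* %{ nop %sh{ truncate -s -1 "$kak_hook_param" } }'
    fi
} }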

To be honest, the pragmatic part of me says to just merge the PR and add that feature, but it's quite conflicted with my minimalist part...

Could you maybe expand on what use cases you have for Kakoune where preserving that missing newline is really important ?

occivink commented 4 years ago

I was also confused about what kakoune should do, but I think ultimately the "correct" behavior is to simply keep the style that the file had at opening, regardless of any changes. It's consistent and predictable. Anything more elaborate also runs the risk of being difficult to understand. It is still possible to implement fancier behavior with BufWritePre hooks which could change newline_at_eof, based on whether the last line is empty or not, ...

I personally don't really need this behavior, but the following arguments convinced me:

lenormf commented 4 years ago

What should happen if a user with newline_at_eof enabled opens a new empty file? Should the editor have a main selection set over a character that doesn't exist in the file? Should the editor now open the door to zero-length selections?

There's no practical use to spending time answering those questions, which are asked solely because “other editors implement the feature, therefore Kakoune should too”.

occivink commented 4 years ago

Should the editor have a main selection set over a character that doesn't exist in the file?

Yes? It's not like you see a 100% faithful representation of the file content anyway: the BOM byte is stripped if it exists, and \r\n is turned into \n. What exactly is the problem in doing so?
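
To make the comparison concrete, here is a small sketch of that kind of load-time normalization, with the detected format kept aside so it can be re-applied on write. It is not Kakoune's implementation; `FileFormat`, `normalize` and `denormalize` are made-up names, and files with mixed line endings would not round-trip in this simplified version.

```python
import codecs
from typing import NamedTuple, Tuple


class FileFormat(NamedTuple):
    has_bom: bool  # file started with a UTF-8 byte order mark
    crlf: bool     # file used \r\n line endings


def normalize(data: bytes) -> Tuple[bytes, FileFormat]:
    """Strip the BOM and convert \\r\\n to \\n, remembering what was found."""
    has_bom = data.startswith(codecs.BOM_UTF8)
    if has_bom:
        data = data[len(codecs.BOM_UTF8):]
    crlf = b"\r\n" in data
    if crlf:
        data = data.replace(b"\r\n", b"\n")
    return data, FileFormat(has_bom, crlf)


def denormalize(data: bytes, fmt: FileFormat) -> bytes:
    """Re-apply the remembered line endings and BOM when writing."""
    if fmt.crlf:
        data = data.replace(b"\n", b"\r\n")
    if fmt.has_bom:
        data = codecs.BOM_UTF8 + data
    return data
```

A remembered "had a final newline" flag would simply be a third field of the same kind.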

lenormf commented 4 years ago

You can't have nothing selected. That's why always adding a newline character to a file keeps the paradigm consistent, there's always something selected.

occivink commented 4 years ago

The editing paradigm is completely independent from how files are loaded and saved on disk, I don't think it's relevant to the discussion. The newline at the end will still always be there when editing, it just might not be saved in the file depending on an option. If you test out the PR you'll see that it doesn't change anything to the editing.

lenormf commented 4 years ago

The newline at the end will still always be there when editing, it just might not be saved in the file depending on an option.

conflicts with your previous statement:

the "correct" behavior is to simply keep the style that the file had at opening. It's consistent and predictable.

occivink commented 4 years ago

Ok, I thought it was obvious that this whole discussion was about how kakoune reads and writes files, but I suppose not.

tmccombs commented 4 years ago

It is still unclear to me why you'd need to preserve the missing end-of-line of a file,

shachaf commented 4 years ago

As far as empty files go, I'd suggest compatibility with vim, which saves a 0-byte file if you write an empty buffer (even though its internal representation is an array of lines, like Kakoune). This makes sense because \n is a line terminator, and an empty file has no lines to terminate. Other editors I tested (nano, gedit, vscode, vis) have the same behavior, so Kakoune is the odd one out here in being unable to create empty files. This seems like good behavior regardless of whether the noeol patch is merged.
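
The "terminator, not separator" reading can be sketched in a few lines. This is only a model of the argument, not how Vim or Kakoune store buffers, and `serialize` is a hypothetical name: every line is written followed by "\n", so a buffer with no lines produces no bytes, while a buffer holding one empty line produces a single newline.

```python
from typing import List


def serialize(lines: List[str]) -> bytes:
    """Write each line followed by its terminator; no lines means no bytes."""
    return "".join(line + "\n" for line in lines).encode("utf-8")
```

Under this model, `serialize([])` is a 0-byte file and `serialize([""])` is a 1-byte file, which is the distinction an editor has to decide how to expose.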

tmccombs commented 4 years ago

Kakoune is a code editor, not an arbitrary data editor.

Is the implication here that if you want to edit text files that aren't code you need to use a different editor?

Unix text files are expected to end with an end of line character, it's usually an error when they don't.

What do you mean by error? It might technically not meet the posix definition of a text file, but most unix tools (grep, sed, awk, sort, etc.) handle missing newlines at the end of files fine.

mawww commented 4 years ago

Is the implication here that if you want to edit text files that aren't code you need to use a different editor?

For text files, no: as written in the design document, Kakoune should be good at editing general text as a consequence of its focus on editing code. For arbitrary data files, yeah, Kakoune might work well enough for some tasks, but editing non-text files is not really its focus.

What do you mean by error? It might technically not meet the posix definition of a text file, but most unix tools (grep, sed, awk, sort, etc.) handle missing newlines at the end of files fine.

"Error" might not be the best word, but text files missing the last newline are often treated as if they ended with a final newline, and when they are not, it leads to surprising results, like wc -l returning 0 for a file containing a single line with no final newline. I don't know of any case where adding that missing newline when writing the file breaks things.
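
The wc -l result follows from the fact that it counts newline characters rather than lines. A small sketch of the difference (the function names are made up for illustration):

```python
def count_newlines(data: bytes) -> int:
    """What wc -l reports: the number of newline characters."""
    return data.count(b"\n")


def count_lines(data: bytes) -> int:
    """Lines as a reader would count them, with or without a final newline."""
    return len(data.splitlines())
```

For the bytes `test`, `count_newlines` gives 0 while `count_lines` gives 1; for `test` followed by a newline, both give 1.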

I still miss a clear motivating use case for this feature. You posted some examples of things that could be desirable, but that's not really what I am looking for. I'd like to hear about a real world case where not having this option has been making your work harder or prevented you from using Kakoune for that task.

For example, I can't quite see why Kakoune should be able to create empty files that contain no bytes; touch already does a good job at that. I wonder if adding a final newline would be problematic when editing, say, a self-extracting shell script (shell script followed by some tar.gz data); this could be a reasonably strong case for that feature.

I understand it might sound like I am giving you a hard time for a pretty small feature, but Kakoune's C++ codebase is already far too complex for my taste, and seeing how hard it is to remove features, I am trying hard to make sure new ones are properly motivated.

caksoylar commented 4 years ago

I'd like to hear about a real world case where not having this option has been making your work harder or prevented you from using Kakoune for that task.

Here is an example that might not be very convincing since it is just an extra truncate call (although that might not be available in all systems). This is a light plugin for reading and writing gzipped files, where it would be handy to have the option to not have the buffer contain the EOL: https://github.com/caksoylar/dotfiles/blob/master/kak/autoload/gzip.kak#L30 (Also would be nice to be able to manually set the modifiable flag.)

zack-sampson commented 2 years ago

I'd like to hear about a real world case where not having this option has been making your work harder or prevented you from using Kakoune for that task.

I've just lost a few hours because I wrote hex API keys to files using kakoune, and then uploaded them as Kubernetes secrets. Worked fine when we were pulling them from env vars, but when we switched to mounting and reading from disk, the spurious newlines totally threw a wrench in our auth path. Took me a long time to figure out because the possibility that kakoune would be appending secret bytes wasn't even on my radar.

I'm no expert on text editors, but I find this behavior quite surprising, and the inability to disable it even more surprising. Especially in an editor that so clearly shows trailing newlines, I expect it to write what I tell it to write, not that plus some secret junk. I suppose it doesn't prevent me from using kakoune, since I can always use truncate or dd, but it's absolutely annoying 🤷

lenormf commented 2 years ago

What secret junk did the editor insert in files? When you open any file or an empty buffer, the final line has a newline character so the contents of the file on disk match what the UI shows.

Screwtapello commented 2 years ago
printf foo > foo.txt
kak -n foo.txt

The file on disk does not have a trailing newline, but Kakoune lets you select a trailing newline, and it does not report the buffer as modified. Unless you already know that Kakoune silently adds a trailing newline, it's reasonable to assume that if you make any other edit and write the file, the modification you're writing out is the one you made, not including "adding a trailing newline".

To be fair, Vim and GEdit behave the same way as Kakoune (buffer not modified, EOL added), although Vim reports "Incomplete last line" when the file is opened and can be configured to behave differently with the 'fixendofline' option.

zack-sampson commented 2 years ago

What secret junk did the editor insert in files? When you open any file or an empty buffer, the final line has a newline character so the contents of the file on disk match what the UI shows.

Sorry, "junk" was needlessly pejorative. The newline at the end of the file ends up getting written in the Kubernetes secret, even though it is not actually part of the secret I intended to write, and when the secret is used downstream it fails.

I understand what you're saying about showing that last newline, now that I look again. But I do really wish there were a way to disable it. Otherwise I've got to trim on the consumer end, or remember to cut the spurious newline out every time I deal with secrets.

mawww commented 2 years ago

It is good to hear about actual use cases for this, although I did not fully understand what you were trying to do and why that trailing newline was an issue in the given file. It is a wider issue than just how to write files: we have similar problems when piping, where Kakoune adds an end-of-line to the selections being piped if they do not end with one (behaviour I wanted to change until I discovered that printf '1+1' | bc fails with GNU bc, and that this is what the standard specifies...).

It's unclear to me why that final end-of-line should not be in the secret file; is that a binary file then?

zack-sampson commented 2 years ago

Well imagine your service is talking to Plaid and you've got some API key, which is like 32 hex characters or whatever. If you're going to literally suck up the contents of that file verbatim and embed them in a URL, you need to remove the trailing newline, or auth will fail. I guess it's a "binary" file that happens to contain human-sensible data?
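
A sketch of that failure mode, under stated assumptions: the endpoint URL and the function names below are invented for illustration and have nothing to do with Plaid's or Kubernetes' real APIs. The point is only that a consumer which embeds the file contents verbatim carries the trailing newline into the request.

```python
from urllib.parse import quote

# Hypothetical endpoint, used only for this example.
BASE_URL = "https://api.example.com/v1/accounts"


def build_url(secret: bytes) -> str:
    """Embed the secret file's contents verbatim in a query string."""
    key = secret.decode("ascii")
    return BASE_URL + "?key=" + quote(key, safe="")


def build_url_trimmed(secret: bytes) -> str:
    """Consumer-side workaround: drop trailing CR/LF before using the key."""
    return build_url(secret.rstrip(b"\r\n"))
```

With a secret file containing `0123abcd` plus a newline, `build_url` ends in `key=0123abcd%0A`, so the server sees a different key, while `build_url_trimmed` ends in `key=0123abcd`.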

You could absolutely make the case that ignoring the final linefeed is kubectl's responsibility. If it's going to upload a secret from a "file", it should know that the trailing newline isn't considered part of the file. But it behaves the way it does, and I've run into software that acts in this way often enough that I would disable the implicit linefeed if it were an option. It's one of those things that happens often enough to be painful, but not so often that I can readily recognize the problem :(

EDIT: to be clear, the whole Kubernetes flow is

  1. Write the secret in a file, secret.txt
  2. Tell Kubernetes to create a secret named foo, with the secret being the contents of secret.txt
  3. In your k8s YAML configs, you can make the contents of foo available to the application, without having to embed the API key directly into the config

mawww commented 2 years ago

I would definitely consider this to be an issue with kubectl, or whatever uses the secret data; it makes zero sense to me that you'd want hex characters (so, human readable) instead of pure binary data, then complain about an EOL in that file.

That said, many things make no sense in the computing world, and that is not really a strong argument against supporting the use case; real life is messy.

My main issue with adding support for that is the precedent it sets: it leads to lots of additional questions I don't have an answer to (what to do when we pipe, when we delete the last line, when we reload the file from disk...). I am worried about the amount of complexity trying to accommodate degenerate text files is going to bring.

clarfonthey commented 1 year ago

So, I encountered this again and want to ask again what the status is.

I agree that in general, trailing newlines don't matter for writing code, but Kakoune is very useful for editing a lot of text files adjacent to code, and it's particularly frustrating to not see the editorconfig option honoured, especially when Kakoune treats editorconfig as the canonical way to configure various editor options.

In my case, it was specifically for text that is printed verbatim, like a MOTD or <pre>-formatted text on a web page. In these cases, it's useful to be able to output files where the final character isn't a newline, since such newlines would show up as blank lines in the resulting output. Yes, these aren't necessarily code, but kakoune's ability to edit multiple selections at once is kind of unique and it's not worth completely switching up your flow just because of this one change.

Based upon #3724, there isn't a whole lot required to support this option. It doesn't seem to point toward a fundamental code restructure or anything, and most of that change is just tests. It's true that some people will use this function incorrectly (trimming newlines from source code files that should have it), but I don't think that preventing these users should be a blocker for enabling genuine cases where it's needed.

I do see the concern in terms of adding complexity to the editor, but like I mentioned, this seems like a pretty clear-cut case (it's supported by editorconfig, it has genuine uses, and it doesn't easily generalise to a lot of cases), and I don't think we should worry too much about a slippery slope of other (potentially bad) features creeping in. But if there are some compelling examples otherwise, I would love to see them! I've mostly skimmed this issue so I may have missed something, but my cursory glance hasn't revealed anything that contradicts my points here.


It's also worth noting, I'm 100% aware that PR I linked is old and the code has changed since then. It's definitely a nontrivial amount of effort to fix it up! But I mostly want to see a decision on whether we think this is an okay feature to add, since that can help inform whether people should put the effort into writing up such a PR.

krobelus commented 1 year ago

Vim keeps track of whether a file ended with a newline on opening with an "endofline" flag.

Interesting. Vim always writes the newline here (even with HOME=(mktemp -d) vim). Anyway, given that even Vim and Emacs are inconsistent about this, I think it's good to be pragmatic here and leave the file as-is.

Find my slightly different approach - that does not add an option - at https://github.com/mawww/kakoune/pull/4791

Screwtapello commented 1 year ago

Vim always writes the newline here

As I indirectly mentioned earlier, Vim actually has two options:

Actually, looking at Vim's documentation, there's a third option (binary) but it just overrides other existing options rather than doing anything different.

alanxoc3 commented 1 year ago

This is something that has always bugged me with kakoune that I have just been working around for a few years. Like others have mentioned, it comes up for me:

If it were up to me, I might implement this like so:

Kak would never add an extra newline to any buffer by default. There would be two possible ways to display the end of the last line of a file. If the file ended in a newline, it would give you the blue highlighted cursor with a "%" sign inside, indicating that this line ending is also the eof (similar to how zsh displays echo -n a). If the file doesn't end in a newline, an extra cursor position is made available as a red highlighted cursor with the "%" inside. This cursor position wouldn't actually represent a character. I'll call the former "newline eof" and the latter "hard eof".

Some more specifics:

And finally, it should be easy enough to add a hook that imitates existing behavior if you want that. Specifically: when you write the buffer, if the last character is not a newline, just add a newline to the buffer before writing.

krobelus commented 1 year ago

In git diffs when working on a shared code base, my commits often have that extra newline.

I agree, this is a prominent motivation

If you hover over "newline eof" and type d, it turns into the "hard eof". If you hover over "hard eof" and type d, it does nothing. If you hover over "hard eof" and type y, it clears the yank buffer (copying empty string). If you hover over "hard eof" and type a, text you input stays on the current line. If you hover over "hard eof" and type , the "hard eof" becomes a "newline eof".

sounds like this would require quite a lot of special cases in the implementation. More importantly, it pushes complexity onto the user by making commands behave subtly differently.

Screwtapello commented 1 year ago

Would it be easier to just not have a hard EOF marker? I mean, at the end of most lines there's an extra space you can select representing the newline; if there's no newline on the last line of the file, that space shouldn't be there, and the cursor would refuse to go past the last visible character. Would that be too subtle?

alanxoc3 commented 1 year ago

I think not having a hard EOF marker could work, I just don't know what it would look like for an empty buffer. (The cursor shouldn't just go invisible.)

And yeah, that idea does add to the learning curve a bit.

krobelus commented 1 year ago

Yeah, the empty buffer is a problem; more concretely, the fact that the cursor is always on a physical character (which may be \n). Helix has a different approach, where you can place the cursor beyond the end of the line even if there's no \n character.

I see two ways forward: either adopt something similar to Helix's behavior, or my proposed change. Both are breaking. Is anyone using the truncate -1 workaround and would be adversely affected if that started chopping off the last byte of a file?

nonumeros commented 1 year ago

https://github.com/mawww/kakoune/pull/3724 is a no-op for the pre-processor under 11.2, throwing off an error:

regex_impl.cc:882:73: error: '::max' has not been declared; did you mean 'std::max'?
  882 |         constexpr auto max_instructions = std::numeric_limits<int16_t>::max();
      |                                                                         ^~~
/usr/include/c++/11.2.1/bits/stl_algo.h:3467:5: note: 'std::max' declared here
 3467 |     max(initializer_list<_Tp> __l, _Compare __comp)
      |     ^~~
make[1]: *** [Makefile:113: .regex_impl.opt.o] Error 1

Perhaps 10.2 or earlier, but I didn't try.

thumbs up for @krobelus with https://github.com/mawww/kakoune/pull/4791 thanks

@occivink thank you for the earlier fix though. If memory serves me well, kak stopped compiling somewhere around gcc 10.2 or earlier.

stacyharper commented 4 months ago

Any update on this?