Minor overhaul of history writing

jpco commented 9 months ago

With this PR, the way history writing works is like this:

During command input, the input line is buffered in memory
When %parse returns a full parsed command, it also returns the input that produced the command
%interactive-loop passes the input to the function %write-history
If readline support is compiled in, %write-history puts the input in the readline history log and asks the history library to write it to the $history file (or, technically, whatever was passed to $&sethistory most recently).
If readline support is NOT compiled in, then there is no $&sethistory or other built-in history behavior, and %write-history is:

fn %write-history cmd {
  if {!~ $history ()} {
    echo $cmd >> $history
  }
}

Buffering the full command input until returning produces much nicer behavior for multi-line commands than the line-by-line logging that es did previously. With readline on, it makes scrolling up into history for multi-line commands pleasant; with readline off, it still helps prevent "interleaved" commands in the off chance that the user is writing two multi-line commands in separate terminals at once.

Readline is fully used for history when support is compiled in. This is less "dodgy" than the alternative ( :) ), and means things like history_write_timestamps can be used, which then makes multi-line commands work nicely even across es invocations.

Having a hook function makes it pretty easy to do things like imitate HISTCONTROL or even get fancier (unset $history and use a networked history server? Something like https://github.com/atuinsh/atuin?) I don't really need to waste bytes trying to convince es people about how nice hook functions are, though.

Some caveats and alternatives to consider:

Bikeshedding %write-history and everything else is very welcome. I don't know if that's the best name, but I'm too deep in the code at this point, so fresh eyes would be very helpful here.
Despite making it possible to enable history_write_timestamps with this change, I didn't do so, since it would make history files both different and a bit uglier. We could turn this on now and make things extra shiny right away.
The behavior of %write-history without readline is a little less smart than before, because now it will just blindly create anything $history might be set to (and it won't unset its internal idea of $history on an error). I'd try to make this work in-shell, but there's no way to open O_APPEND without also O_CREAT. An alternative could be to keep $&sethistory and $&writehistory without readline, and have them just imitate the current behavior exactly. EDIT: I'm a dummy. I forgot you could just use access to test whether the file exists.
In the opposite direction, $&writehistory could be modified to take the history file as an argument, as in: $&writehistory [FILE] [COMMAND]. This could be used to get $&sethistory out of the shell.
%write-history is currently run before the command is executed, and there is no protection in %interactive-loop from it throwing an exception. Therefore if you do something foolish like set fn-%write-history = throw error %write-history, you can just break your shell. I didn't add any protection for this for the purposes of not growing %interactive-loop too much, but I do want to call out the omission. We could also put the %write-history call after the command is executed. I think it should go before, but it would be a simpler fix than some fancy unwind-protect thing.

Hopefully the individual commit messages make it obvious what's going on in each file.

jpco commented 9 months ago

Another potential design for readline-history primitive(s) would be to have just one $&history primitive with several "subcommands". For example:

$&history log $cmd
$&history setfile $file
$&history setpos $n
$&history list

and so on. This could be used if someone wanted to write a history command like exists in other shells, without filling the primitive-space too much. But it would make the $&history primitive a real complex beast. Just an idea.

memreflect commented 9 months ago

I like the buffered command input and the write to the history file even when an error occurs.

I also am grateful that you did not enable timestamps. There are other methods of preserving newlines without breaking existing line-based history entries, such as writing \\n and \\\\ escape sequences or splitting the command on \n and using the equivalent of a %W conversion to write control characters in place of the newlines.

I'm not so sure about trying to remove $history and $&sethistory. $history makes history file code conveniently easy to write, and most users would simply do something like the following in their .esrc if hooks are the way to go:

history-file = /path/to/history/file
fn-%write-history = $&writehistory $history-file
fn %read-history {
  $&readhistory $history-file
}

But i recognize that someone might want to do something else with history, which hook functions would facilitate. As for $&sethistory, it is necessary unless you create a $&readhistory primitive for readline's sake.

There are definitely more history operations that can be moved into es code if hooks are desirable. Readline's read_history() is just a loop over lines of a file, and append_history() can pretty much be accomplished using >> filename. That also would help transition to a format that supports multi-line commands. Primitives might be handy for encoding and decoding multi-line commands for performance and memory reasons, and another primitive is required to invoke add_history(cmd) for readline's sake. Lastly, you would need some way to get the lines (or perhaps even store them in an es list and keep things synchronized with the line editor using a settor) to be able to write them to a file in the first place.

And because es can do all of those things itself with the help of some new primitives, i'm curious what you want out of a $&history primitive; also, how would hooks work with the subcommands? $&history setfile $filename for example would be more simply written as history = $filename. The alternative to both of those would be something like $&history output $cmd >> $filename or $&history append $filename $cmd, but that would require a user to create a %history-append hook or something for you to check and invoke on each command. I do like the idea, but i wonder if you had something specific in mind when suggesting it. I already mentioned four history-related primitives in the previous paragraph, so i can understand why you might want a single "one command to do it all" primitive that sort of groups the related operations together and delegates those operations to subcommands.

jpco commented 9 months ago

As for $&sethistory, it is necessary unless you create a $&readhistory primitive for readline's sake.

Oops, I missed that. Good catch.

There are definitely more history operations that can be moved into es code if hooks are desirable.

I already mentioned four history-related primitives in the previous paragraph, so i can understand why you might want a single "one command to do it all" primitive that sort of groups the related operations together and delegates those operations to subcommands.

This is exactly what I'm picturing. If you wanted to, say, duplicate the whole behavior of a robust history command, with a good amount of https://tiswww.case.edu/php/chet/readline/history.html#History-Functions exposed as individual primitives, you'd be adding quite a lot to the output of echo <=$&primitives. A single $&history primitive with sub-commands would help keep that cleaner. Then you could wrap those calls in hook functions in a similar pattern as $&openfile has in the shell now.

However, I'm not really convinced I even want all that flexibility.

$&history setfile $filename for example would be more simply written as history = $filename.

In this scenario, I would want initial.es to contain set-history = $&history setfile so that when a user runs history = $filename it would alert the readline library by default.

But all of this is a bit hypothetical, and I only wanted this PR to be a minor overhaul of history writing :)

Do you have any comments on the PR as written? I'm not 100% sure which of my harebrained ideas

I'm not so sure about trying to remove $history and $&sethistory.

is referring to -- is that about what I've already written, or one of my proposed alternatives?

memreflect commented 9 months ago

Do you have any comments on the PR as written?

I initially lamented the change from an always-open file descriptor to repeated opening and closing of a file by name. The file descriptor was a better option if you ignore the fact that something like rm -f $history means subsequent writes to a history file in that shell would write to the open file description of a file that no longer exists, but the new design ensures that the file is always written, even after a $history file is removed, which is a more predictable behavior.

That minor detail aside, the changes to %parse, the buffering of command input, and the writing of all commands, including those that cause syntax errors, are all welcome choices, which is pretty much this entire PR, so +1 from me.

I'm not 100% sure which of my harebrained ideas

I'm not so sure about trying to remove $history and $&sethistory.

is referring to -- is that about what I've already written, or one of my proposed alternatives?

One of your alternatives suggested that $&writehistory [FILE] [CMD] could be used to get $&sethistory out of the shell. For some reason, my brain mistakenly processed $&writehistory [FILE] combined with the removal of $&sethistory as a complete replacement for $history as well. That is definitely my blunder.

wryun commented 9 months ago

This is the sort of quality of life improvement I'm pretty keen on (like the tests thing).

However, given that it affects run-time functionality, I don't want to merge it until I've had a chance to go through it fairly carefully. Hopefully in the next month :) (pre-commitment!)

jpco commented 8 months ago

(Sorry, trying to clean up some old branches, I took this one out as well. Restored.)

wryun commented 5 days ago

Sadly, I don't think you should wait on me for this. I'm happy for you to merge this if you're confident it won't break anything.

jpco commented 3 days ago

Looking back at this with fresh eyes, there are a few things I want to change:

(back-compat) no-readline %write-history should test access -w $history and handle the false case by unsetting history.
(back-compat) a line like # hello, which produces no actual code, should still get logged to history.
(bug fix) handle non-syntax error errors from $&parse correctly (e.g., incomplete here document errors).
(style) encapsulate history writing logic (including exception-handling) better, possibly within %parse. (Doing it in %parse is tricky because %parse is run by %batch-loop as well.)

In addition, it's possible that in the future with something like runflags from #79, the -v flag could be implemented in all-es script with the new return value from $&parse added here (after all, what is -v but logging history to stderr?) So I'd like it if this design were speculatively "forward-compatible" with that. In practice, I think that just means that $&parse needs to make sure to return everything it reads in in the new return value. I'll need to see how far from that we are now.

I'm also hoping that this design can be compatible with some future input system that actually allows es code to be used while reading command input, but that would require a lot of novel design so it's hard to prepare for now.

jpco commented 3 days ago

Also, I'm realizing now that the return values of $&parse are... terrible in general with the PR as it is. Sometimes $&parse turns a line (like # foo) into a null command. Sometimes $&parse returns a command but no input due to not running interactively. Sometimes it returns both command and input, and sometimes neither. Because of that, you can't actually tell what any element of its return value actually is. It works as used in initial.es today, but that's a pretty terrible experience. Instead $&parse should always return its input, even if it's just ''.

jpco commented 2 days ago

(Score one for running CircleCI tests with -DGCDEBUG=1 -- it caught a memory error that wasn't getting triggered on my machine!)

I have refactored $&parse so that now its return value(s) always start with a single element representing the input it ingested to do the parse. $&parse also includes that input when it throws an error exception -- error $&parse INPUT MESSAGE is the format in that case. %parse "consumes" that input when handling returns and exceptions, similar to how %backquote "consumes" the status element that $&backquote returns.

This is nice because it's a simple and consistent behavior from $&parse, which I prefer from primitives. This is a bummer because it means that in non-interactive behavior, $&parse still buffers and returns its input only for that input to almost always be immediately thrown away. There are ways to remove that wasted work, but they all sacrifice simplicity by making $&parse more complicated or adding other primitives.

A couple questions:

I'm not sure how I feel about the current way that %parse is implemented -- in particular that it tests %is-interactive every time it runs, despite the fact that %is-interactive changes only rarely and in controlled ways -- this way seems like more wasted work, especially during %batch-loop. Does it make more sense to lexically bind a simpler %parse to %batch-loop in initial.es?
I don't feel entirely confident whether it should be %parse's job or %write-history's job to avoid writing empty lines to history. I decided to put it in %write-history to give the user more flexibility, but ... will anybody actually make use of that flexibility?

wryun / es-shell

Minor overhaul of history writing #65