Tab complete - Githubissues

SHCL needs tab complete.

Have you considered anything regarding how you'd like to implement this? I can see three ways to do this: either writing your own bindings to the readline library to control tab completion (i threw together a trivial example of that for a separate project and it wasn't to hard), use cl-readline (which ive no experience with), or roll something custom for shcl. I'd be happy to take a whack at it in my free time, but id love your input as your the project creator/lead.

Upon closer inspection, I see that editline is being utilized. I'm not familiar with it, and will take some time trying to understand what is being done with editline.

Thanks for the interest and enthusiasm! I’d love some help getting tab complete done. If tab complete worked then I’d actually be willing to use SHCL as my daily shell! Anything you can do to help me get there would be much appreciated!

I’ve been thinking about tab complete for a very, very long time! I haven’t put much detail about it online (since I don’t seem to have much of an audience), but tab complete has been my main goal for a long time. Most of the major changes I’ve landed in the last year have been in service of tab complete.

I have a few goals for how tab complete should work. Just rattling some stuff off the top of my head...

It must leverage the parser and lexer as much as possible. One of SHCL’s design goals is to bring the spirit of Common Lisp to the shell. Common Lisp supports changing the reader, so SHCL supports changing the lexer and parser. If the user changes the grammar then tab complete should keep chugging along without missing a beat. The same applies to the lexer, but I’m not as sure how to make it work.
Tab complete must provide context-appropriate suggestions. It should offer the “then” keyword if and only if the “then” keyword would be grammatically acceptable. This should be true even if the user modifies the grammar!
Tab complete must not have a “backdoor” into the rest of the system. If tab complete needs info from the parser, that info must be productized and made available for other uses. Said another way, tab complete is not a part of SHCL, it is built on top of SHCL.
It must be able to support command-specific completion logic. For example, it should be able to provide appropriate completions when it encounters dd’s if= argument.
The tab complete engine must have the ability to provide different completions based on the cursor’s position in the token being completed. It’s okay if that information isn’t leveraged by the actual completion logic, but it must be available to the completion logic!
It should be possible to provide token-specific completion logic. If the user adds a new token type to SHCL, they should be able to provide completion logic for that token. Furthermore, they should have all the same tools at their disposal that the built in token types have. Imagine tab complete providing lisp completions while the cursor is inside a lisp splice token!

I’ve got some basic completion logic in place now. I forget what the current state is (I’ve been deep in a different rabbit hole recently), but it’s getting quite close to having acceptable completion suggestions for file paths. It already knows how to get clues from the grammar about what sorts of tokens would be valid at the completion site. So, it can suggest reserved words and operators at all the appropriate times. I’m not sure if I like the interface for adding completion suggestions, but it’s working well enough for now.

As for interfacing with the readline layer, I’ve been trying to avoid focusing too much on that. I’m currently using libedit, but I’m not super happy with it. I want to move to something else (ideally, a pure-lisp alternative), but it just isn’t a priority for me right now. If you want to try switching to readline, cl-readline, or anything else you should (hopefully!) find it relatively easy. I tried to keep the terminal interface layer as separate and highly abstracted as I could. I’ve always seen libedit as a temporary solution, so I wanted to make it easy to replace it.

I’m more than happy to answer any questions to help you figure out the Byzantine parts of SHCL. Also, if you find it too confusing please let me know! I don’t want to be the only person who can understand this project!

I too would like to use shcl as my main shell, with tab completion being a huge barrier for that. As far as how to implement it, what I was thinking is (and this is only relevant to readline, i dont know whether this will work with libedit) to have generator functions, written in CL, which provide essentially a list of items one by one which are completion options. additionally there are configurator functions which set the generator function. additionally one can access the current line being typed with the function get-line-contents. If we store these functions in a dynamic variable, we can set whatever generator/completer we want, at any point in the process of reading a line from the user. Additionally, readline has a built in file path completion function, which follows paths.

This approach, however, I think doesnt quite hit all the points you listed above.

Im unsure how well it could use the lexer and parser. If there are functions to, for example, get all tokens related to X, it could be integrated easily, we would just query for what to complete for, passing the line so far (or previous word, or command, etc) to the lexer/parser and get back a list of all related tokens that should be valid completions.
This i think ties into the above. however, how exactly does the user change the grammar? is there a set of abstractions for this, or does it involve the user changing the source, or...? If its through a set of abstractions, then appropriate changes can be made to those abstractions to modify the tab completion bits.
I'm not sure how to achieve this with what ive got above – to build this all in CL, on top of SHCL, would mean rewriting readline in CL, unless im misunderstanding you.
totally possible, we just add some logic in the completer that states if we encounter dd, set *completer-function* to dd-completer, *generator-function* dd-generator. theres probably a better/cleaner way to do that, but it can be done.
I know we can get the current line from readline, and I'd assume we can get the current cursor position as well. then its just some math and string splitting to get our token, and do some logic with it to determine what to complete to.
I think this is just a matter of writing abstractions around how tokens are defined.

Of course, this is always more difficult than it seems, but Id say its totally doable.

What would you change in the above description? I havent looked at the lexer yet, so I'm unsure how plausible bullet one/two are, and bullet three is the place I deviate the most from what is wanted.

As far as the complexity, could you point me towards/tell me how the completion is currently handled? Like the entrance function, or entry points into the completion?

As far as the complexity, could you point me towards/tell me how the completion is currently handled? Like the entrance function, or entry points into the completion?

Sure thing! Here's what currently exists today. You can play around with the results using the -shcl-complete builtin... or just hitting tab. Whatever floats your boat!

At the libedit layer, there's a single function (shcl/shell/prompt::*tab-complete-fn*) that gets called to provide completions. That function is currently bound to a thin wrapper (shcl/shell/main::main-complete) around the real completion driver: shcl/shell/complete:completion-suggestions-for-input.

completion-suggestions-for-input uses the shell readtable (as provided by main-complete) to tokenize the input string provided by libedit. Since the lexer remembers the start and end position of each token, we can easily identify which token contains the cursor's position. That token is removed from the token sequence and replaced with a sigil token that shouldn't match any terminal node in the grammar. Then, it calls shcl/core/shell-grammar:commands-for-tokens to parse the token sequence into a shell syntax tree. The parse is expected to fail! When it fails, we'll get a list of parses that were attempted and details for why they failed. If we look for parse failures involving the sigil token then we'll have information about what sort of token would have been accepted in place of the sigil token.

Now we have

the token that is being completed
a list of of token types would be accepted by the grammar

With that, we can suggest reserved words (e.g. "if" or "esac") at the appropriate times. For context, each reserved word is given its own CLOS class. So, if the parser called for an instance of the esac class we know that the "esac" token could appear in that position. Unfortunately, that's not really what people want when they hit tab. They almost always want to complete command names and arguments to the command. For that, we need a bit more information. We need the list of tokens that are involved in the current command invocation. Its not enough to know that the parser was expected a command argument. We need to know what arguments came before it!

One of our greatest challenges (the user re-definable parser) is also our secret to success, here. We can hook into the parser and get it to save the information we need... such as the list of tokens that matched the simple-command-word type while parsing a simple-command nonterminal! That information will be preserved in parse error records. Now when we hit the sigil token we know more than the expected type. We also know which tokens earlier in the stream represent arguments to the command!

Phew. That's all a bit crazy, but it gives us a lot of stuff for free. For example, we'll treat all of these partial inputs as equivalent! Assume the cursor is at the end of the string.

"foo bar"
"biz buz; foo bar"
"while foo bar"
"foo < in.txt > out.txt bar" Irrelevant grammatical constructs that precede the token being completed (e.g. case 2 or case 3) are effectively stripped away, and tokens that don't represent arguments (e.g. case 4) don't get captured in the argument list.

Sweet! We managed to do all that with a relatively small assumption about the grammar and basically no assumptions about what tokens look like! We've now got everything we need to know about the input to provide high quality completion suggestions.

I'm pretty content with everything I just described. This is where things start to get a little bit... uglier. Everything from this point on is more of a proof of concept than a fully baked system.

Now its time to actually use the information produced during that analysis to produce completion suggestions. This is handled by the generic function shcl/shell/complete::completion-suggestions. This function is called every time a parse error involving the sigil token is encountered. It receives the following arguments:

the token type that the parser expected to find when it encountered the sigil token
the token that contains the cursor
an object containing some extra information producing during tokenization (e.g. cursor position within the token and shell readtable being used)
the parser's state when the error was encountered. Remember, this contains the simple-command-words for the command being completed! The completion-suggestions function uses a method combination strategy that is just a generalization of the append method combination strategy. So, each applicable method provides a sequence of suggestions and all of the suggestions from all the methods are joined together into a single sequence.

I've got methods that suggest

reserved words and operators,
executables found in $PATH or builtin commands,
and expansions for file paths with globs in them (e.g. "foo*" but not "foo")

The completion logic for file paths is still very much a work in progress. The other two are pretty satisfying to me.

I was planning on using my glob expansion logic (I wrote my own in shcl/core/expand) to facilitate completion of file paths. I mean, asking for completions of "foo" is very similar to asking for expansion of "foo*", right? Making that work required exposing a lot more of the internals of shcl/core/expand than I wanted to. I'm not happy with how its turning out.

I haven't thought much about how I would incorporate command-specific completers. I guess I was sort of imagining one of the completion-suggestions methods would look up the command name in some table and look for other completion functions there. Gross. This system is already complex enough! Also, that logic would end up being in the same method that currently handles completions of commands and the arguments to commands. Oh the humanity! That method is on track to host almost all of the completion suggestion logic!

what I was thinking is (and this is only relevant to readline, i dont know whether this will work with libedit) to have generator functions, written in CL, which provide essentially a list of items one by one which are completion options

One thing I don't like about the current system is its reliance on generic function methods. Conceptually orthogonal completers (e.g. completions for command names and completions for arguments) sometimes need to live in the same method. A design closer to what you describe (a list of completers) would be nicer in many ways. It would allow for better factorization and free the completers from needing to rely on generic function dispatch. That was a pain point in the current design. I had to add a metaclass for literal tokens just to make it work. Gross!

Realistically, there probably wouldn't be many completion sources, and so walking a list of all the completion sources for every completion request probably wouldn't matter. In all likelihood, it will be irrelevant compared to the overhead of running the parser! Maybe I over-engineered this part of SHCL. It wouldn't be the first time!

If we look for parse failures involving the sigil token then we'll have information about what sort of token would have been accepted in place of the sigil token.

This is awesome. all that needs to be done for completions is done just by defining the syntax/grammar itself. Correct me if I'm wrong, but here is what I picture is happening:

characters are typed
when tab is pressed, the line is sent off to the parser, which gives back a list of all the things that could have worked with the current line

I'm pretty content with everything I just described. This is where things start to get a little bit... uglier. Everything from this point on is more of a proof of concept than a fully baked system.

Personally, I like the idea of using readline's built in completion functionality to display and interact with the user regarding completions, as its a mature solution. The way I was thinking of doing it, by using two sets of functions, one with logic for which completer to use based on context, and one with logic to generate a list of completions¹, might be reduced to a single function returning completions to readline by sending text off to the parser!

The completion logic for file paths is still very much a work in progress. The other two are pretty satisfying to me. … I'm not happy with how its turning out.

Additionally, if readline is used, the built in file/path completer could be used, avoiding messing about with writing one.

As far as avoiding generic functions, one thought is to use one function (generic or otherwise) for determining what function should be used to generate a list for completing. This would make the system effectively static dispatch.

Another thought - perhaps it would be worthwhile to, if it is decided to go with readline, set up a simple completion interface for it, which would allow for work on multiple completion systems to occur simultaneously, with changing out a couple functions being the only thing necessary to change the completion backend.

I'm having a hard time thinking of how to combine the method of using the parser to generate a list of possible completions and doing things a more static dispatch way. Since the parser knows every bit of grammar already, it wouldnt make sense to use a set of completion functions as they would need to be organized e.g. by commands (a dd completion function set, an ls completion function set...).

On the note of gluing together readline and shcl, would it be easy-ish to write a function returning a list of strings/things-acceptable-in-place-of-sigil-token for a certain line and cursor position?

¹ by storing these functions in a dynamic variable, we can change the completer and the completer chooser from within a call to the current completer or completer chooser.

I’ve been following this with idle interest. May be worth checking out how stumpwm does it since we tab complete a number of things. I have most direct experience with the path, we store a hash table that is cached in the background on startup. As for actual completion candidates I think we take a page from emacs, but I’m not entirely sure.

David

On Jun 10, 2019, at 11:38 PM, Shozo notifications@github.com wrote:

If we look for parse failures involving the sigil token then we'll have information about what sort of token would have been accepted in place of the sigil token.

This is awesome. all that needs to be done for completions is done just by defining the syntax/grammar itself. Correct me if I'm wrong, but here is what I picture is happening:

characters are typed when tab is pressed, the line is sent off to the parser, which gives back a list of all the things that could have worked with the current line I'm pretty content with everything I just described. This is where things start to get a little bit... uglier. Everything from this point on is more of a proof of concept than a fully baked system.

Personally, I like the idea of using readline's built in completion functionality to display and interact with the user regarding completions, as its a mature solution. The way I was thinking of doing it, by using two sets of functions, one with logic for which completer to use based on context, and one with logic to generate a list of completions¹, might be reduced to a single function returning completions to readline by sending text off to the parser!

The completion logic for file paths is still very much a work in progress. The other two are pretty satisfying to me. … I'm not happy with how its turning out.

Additionally, if readline is used, the built in file/path completer could be used, avoiding messing about with writing one.

As far as avoiding generic functions, one thought is to use one function (generic or otherwise) for determining what function should be used to generate a list for completing. This would make the system effectively static dispatch.

Another thought - perhaps it would be worthwhile to, if it is decided to go with readline, set up a simple completion interface for it, which would allow for work on multiple completion systems to occur simultaneously, with changing out a couple functions being the only thing necessary to change the completion backend.

I'm having a hard time thinking of how to combine the method of using the parser to generate a list of possible completions and doing things a more static dispatch way. Since the parser knows every bit of grammar already, it wouldnt make sense to use a set of completion functions as they would need to be organized e.g. by commands (a dd completion function set, an ls completion function set...).

On the note of gluing together readline and shcl, would it be easy-ish to write a function returning a list of strings/things-acceptable-in-place-of-sigil-token for a certain line and cursor position?

¹ by storing these functions in a dynamic variable, we can change the completer and the completer chooser from within a call to the current completer or completer chooser.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.

interest here too.

I know we can get the current line from readline, and I'd assume we can get the current cursor position as well.

see rl:register-function :complete <custom completion function> which takes the currently entered text, start and end position in the readline line buffer as arguments.

https://github.com/vindarel/cl-readline-example/blob/master/src/readline-example.lisp#L67

@szos

This is awesome. all that needs to be done for completions is done just by defining the syntax/grammar itself. Correct me if I'm wrong, but here is what I picture is happening:

Yep! That's basically the foundation of it all! There's a bit more after that last step. After all, the parser doesn't know anything about file paths.

Personally, I like the idea of using readline's built in completion functionality to display and interact with the user regarding completions, as its a mature solution. The way I was thinking of doing it, by using two sets of functions, one with logic for which completer to use based on context, and one with logic to generate a list of completions¹, might be reduced to a single function returning completions to readline by sending text off to the parser!

libedit's support for interactive completion is definitely lackluster. I don't like how it throws its hands up in the air and redraws the prompt on a fresh line when you edit the input string. I have to imagine readline would handle it better.

Please note that SHCL is currently using a very permissive license (Apache v2), and GNU Readline is GPL. I'd like to keep SHCL's current license. So, I don't want to make SHCL use readline by default... but it would probably be kosher if SHCL knew how leverage a readline-compatible library that the user provides.

Additionally, if readline is used, the built in file/path completer could be used, avoiding messing about with writing one.

My main goal when I started SHCL was to learn about reader macros. I set my sights a bit higher after I realized that SHCL could actually be useful, but SHCL has always been a playground where I can learn. Saving effort isn't necessarily one of my goals. I'm more than happy to roll up my sleeves and write my own X if

I can make a complete, correct, and maintainable version of X
Creating X will help me learn something or enable some interesting new feature in SHCL

I'm not afraid to write a file path completer. I've already got the hard part done, and letting SHCL own the logic will enable better integration with the rest of the completion system or provide more surface area for user customization. So, I'm going to keep charging ahead writing my own path completer.

Also, I could be wrong (I haven't really looked), but I bet Readline assumes that the process's working directory should be used as the basis for relative paths. That's not true for SHCL. SHCL uses threads instead of forking for subshells, so the shell's working directory isn't actually the process working directory -- its a file descriptor that SHCL tracks! Trust me, I know that's crazy and that it makes everything a lot harder. I'm sticking with it until I encounter something that simply cannot be done this way. I'm guessing it will be signal handling.

As far as avoiding generic functions, one thought is to use one function (generic or otherwise) for determining what function should be used to generate a list for completing. This would make the system effectively static dispatch.

What you just described reminds me an awful lot of compute-discriminating-function. It seems like we're going to end up building our own method dispatch system. I'm fine with that! It seems like another excellent opportunity to learn! Why is type-based dispatch failing us here? What would serve us better? Are there other places where we can apply similar reasoning? These sound like fun questions that I'm excited to find answers to.

I don't think static dispatch would serve us well, since some completion sources will only be relevant if some runtime conditions are met (e.g. use dd completer when completing an argument to dd).

Another thought - perhaps it would be worthwhile to, if it is decided to go with readline, set up a simple completion interface for it, which would allow for work on multiple completion systems to occur simultaneously, with changing out a couple functions being the only thing necessary to change the completion backend.

Absolutely!

On the note of gluing together readline and shcl, would it be easy-ish to write a function returning a list of strings/things-acceptable-in-place-of-sigil-token for a certain line and cursor position?

I mean, its hard to produce good completion suggestions... but its not hard to write your own completion suggestion source. Just replace the completion-suggestions generic function with whatever you want or add a new method to it.

For example, here's the method that handles literal tokens. Here, desired is the token class that the parser expected to find when it failed to parse. So, desired would be something like the esac class or the fi class. token is the token that contains the cursor.

(defmethod completion-suggestions concatenate-sequences
    ((desired literal-token-class) token context parser-vars)
  (declare (ignore parser-vars))
  (let ((desired-string (literal-token-string desired))
        (token-value (token-value token)))
    (if (sequence-starts-with-p desired-string token-value)
        (list (make-simple-completion-suggestion desired-string context))
        nil)))

There's another method that handles command names, command arguments, and bare words (e.g. "for var in some words").

Does that answer your question? I'm not sure if I exactly understood what you were getting at.

@dbjergaard

I’ve been following this with idle interest. May be worth checking out how stumpwm does it since we tab complete a number of things. I have most direct experience with the path, we store a hash table that is cached in the background on startup. As for actual completion candidates I think we take a page from emacs, but I’m not entirely sure.

I've done some spelunking through stumpwm, but I haven't yet looked at how it handles completion. I guess its working well since I haven't felt the need to dive in and try to change it! I'll check it out!

Hi, Sorry i dropped off there, life kicked up some stuff that needed my attention.

As far as GNU readline goes, I think just linking with it mandates that the code be GPL'd. I'm not an expert and could be misreading the GPL, but that was my understanding. I think the benefits of readline don't outweigh the cons of changing the license, but that is entirely up to you.

it would probably be kosher if SHCL knew how leverage a readline-compatible library that the user provides.

I'm not sure if thats the case - the GPL makes it sound like even linking to a GPL'd application/library mandates GPL-ing the code, so any readline compatible library would have to be GPL'd, and then were back at SHCL linking with a GPL'd library again. Perhaps someone better versed than I in GPL and legal terms will chime in.

libedit's support for interactive completion is definitely lackluster. I don't like how it throws its hands up in the air and redraws the prompt on a fresh line when you edit the input string.

Given that libedit is lackluster I started thinking about alternatives - specifically alternatives that would deal with redrawing better. First is doing this from scratch, sending terminal escape codes and moving text around and all that jazz. This would need some sort of print function for completions which tracks how many lines are printed and clears them, then scrolls up, after displaying completions. This can be done with terminal escape codes (I think the VT100 escape codes are the most ubiquitous) like \033M to scroll. After spending 1/2 an hour exploring this I think it will be very tedious to do things this way, but it could be abstracted away. The only problem I ran into was moving the text on the screen - I couldnt find a way of doing that. If there isnt a way to do it then I think this method will require lots of clearing the screen and redrawing to keep up contiguity in the text, which in turn requires keeping a record of everything printed to the screen. EDIT: just thought of this: if scrolling the text is impossible, one option would be to the cursor back to where the text to delete starts, and then clearing from there down. This would keep contiguity in the printed text, at the cost of introducing blank space in the terminal window (which would be written over by later printings to stdout). I think this is the only way to do this, which is, i think, why readline doesnt bother clearing the completions. This is because you cant scrollback text in a terminal, you can only draw, and move the cursor. thus ncurses. I spent a bunch of time working on a program that cant work the way I envision and was figured out years ago, but i got a nice look at the history of terminals, which was interesting

This led me to think of ncurses. It would allow window abstractions, so one could, upon getting a TAB character, open a new 'window' and do completions there, and upon finishing completion, delete the window and add the text at the end of the line. The drawback of using ncurses is that you HAVE to use ncurses, one cant just switch between regular printf(); and ncurses printer.¹ If using ncurses, it would still necessitate managing the line within SHCL, as with the other method. This would be the most 'pretty', but I'm unsure how well it would work as a shell - there are unanswered questions such as 'would SHCL be able to cleanly hand off control to another ncurses application if its implemented using ncurses?'. There are other problems, like scrolling, which would need to be figured out, but they seem possible as evidenced here: https://gist.github.com/alan-mushi/bdc831e0c33ad5db8025

As far as managing the line, I figured the following would be a decent way of reading character by character and controlling based on characters (its written in scheme psudocode cause I've been using guile lately):

(let loop ((curchar (read-single-char))
               (before '())
               (after '()))
  (cond ((char=? curchar #\newline)
              (list->string (append (reverse before) after)))
            ((has-dispatch-sexp curchar)
             (let ((forms (get-dispatch-sexp curchar)))
               (eval `(progn ,forms))
               (loop (read-single-char) before after)))
            (else
             (print-to-term-line curchar)
             (loop (read-single-char) (cons curchar before) after))))

This would loop through until we terminate with a newline, and would provide a facility to do different things when fed different characters via the dispatch-sexp functions. These functions could interface with a hash table or alist, containing characters as the key and a set of forms as the value. These forms would directly manipulate before/after and call the appropriate functions to manage displaying that on the screen. some examples:

(define-character-dispatch #\Left
  (move-cursor-back)
  (setf after (cons (car before) after))
  (setf before (cdr before)))

(define-character-dispatch #\Backspace
  (delete-char-from-term-line)
  (setf before (cdr before)))

This would avoid generic functions, while maintaining some form of dynamic dispatch. Using hash tables as storage would ensure O(1); I assume one/the reason you'd prefer to avoid generics is speed. One nicety is that it will only have to handle a small number of characters, which keeps our dispatch table small.

I'm currently working on a small print-and-delete program using VT100 escape codes to see if its possible, I'll let you know what I find in writing it, and if its useful it could be reimplemented in SHCL.

¹I think its technically possible, it just takes a lot of extra plumbing with fflushing and clearing the screen and all that.

CC'ing myself, as I'm curious. Currently happy with the rc shell, but grass... fence... other side...

Given that libedit is lackluster I started thinking about alternatives - specifically alternatives that would deal with redrawing better.

For a quicker approach than yours, linecook looks like a good alternative...?

Thanks for expressing interest! Coming back to this project and finishing it up has been on my todo list for, uh, let’s see… oh… shit. It’s been a long time.

I think about working on it again a few times every week. I still fully intend to finish off this project — I’d love to be able to actually use the shell I wrote. The last several years have just been a lot, and this project is big enough that I don’t think I can pick it up again unless I make it my main focus for a few days.

I am looking to take some serious time off work starting in 2024, and returning to shcl is pretty high on my wish list for my time off. So, with any luck, you might see me return to this project. And, with even more luck, you and I might even be able to actually try using it as our daily driver shell! Imagine that!

Ha! Looking forward to it. A few of my pet projects have been "almost started" for a decade now, so I understand your position. I wish I had more time to write more code for more people, but all I can do right now is sit on the sidelines and cheer you on.

SquircleSpace / shcl

Tab complete #8