charmbracelet / vhs

Your CLI home video recorder 📼
MIT License
15.13k stars 255 forks source link

Wait for command to finish before continuing #70

Closed jacobtomlinson closed 3 weeks ago

jacobtomlinson commented 2 years ago

Is it possible to wait for a command to finish before you continue typing.

I know I can use Sleep to wait but I might not know how long the command will take and it could be different every time.

demo

caarlos0 commented 2 years ago

this was discussed a bit in https://github.com/charmbracelet/vhs/issues/14, but I agree we should do something about it

maaslalani commented 2 years ago

Yep! Thanks for this @jacobtomlinson, we'll definitely introduce something for this since it's probably the most common usecase. It will probably look something like one of these but other suggestions are welcome.

Run "sleep 2 && echo 'First!'"
Exec "sleep 2 && echo 'First!'"
Execute "sleep 2 && echo 'First!'"
abhijit-hota commented 2 years ago

I'm working on a PR for this. Will this support shell scripts? E.g.,

Exec "./script.sh"

or

Exec@file "./script.sh"

cc: @caarlos0 @maaslalani

caarlos0 commented 2 years ago

IMHO, exec should:

so, IMHO there's no need to handle executing files, its just a Type ./file.sh; Enter

That said, not sure if we should or not support the Type options here too (like speed), but I guess we should

maaslalani commented 2 years ago

Another alternative is to have a Wait command that way we don't need to add all the options for typing speed and enter delays.

Type@100ms "./file" Sleep 0.5 Enter Wait
abhijit-hota commented 2 years ago

I tried implementing this but I am afraid I couldn't find a way to wait. We're using ttyd and AFAICT, there's no way of knowing the state of a process that is running inside ttyd.

Pointers are appreciated.

P.S., any reason we're using ttyd over GoTTY?

maaslalani commented 2 years ago

@abhijit-hota I had a similar concern. Is this possible with GoTTY? We can switch over to that.

I used ttyd since it looked more active and maintained.

abhijit-hota commented 2 years ago

Is this possible with GoTTY?

Doesn't seem so from the documentation.

I used ttyd since it looked more active and maintained.

Fair enough. Just checked that.


I have opened a support issue here: https://github.com/tsl0922/ttyd/issues/1009

maaslalani commented 2 years ago

My plan is to support both ttyd and GoTTY so we can use whichever is installed on the machine.

abhijit-hota commented 2 years ago

As per https://github.com/tsl0922/ttyd/issues/1009#issuecomment-1295718599, ttyd cannot tell us that.

We need to get the pid of the process started inside ttyd when pressing Enter to check the state.

maaslalani commented 2 years ago

I think we can solve this with pgrep:

If someone wants to Exec "sleep 5" we can pgrep "sleep 5" every few milliseconds (100ms?) to wait for it to finish.

image
abhijit-hota commented 2 years ago

What do you think about the method suggested here: https://github.com/tsl0922/ttyd/issues/1009#issuecomment-1295741037

maaslalani commented 2 years ago

Not sure we can use $! because we can't really type into the ttyd session since we are recording it. pgrep will let us look at the running processes outside the ttyd session which I think is desirable.

abhijit-hota commented 2 years ago

Makes sense. There is a chance of conflict between command when using pgrep.

jacobtomlinson commented 2 years ago

The other thing I was thinking is could we somehow set how long the command appears to take in the GIF, regardless of how long it actually took.

Some commands vary a lot in execution time, I just did a helm install which took ~30 seconds, but if the cache conditions are right it only takes ~5 seconds.

It would be great to tell vcr to wait until the command has finished, but also perhaps only show a fixed 2 second delay and snip out the rest.

Airblader commented 2 years ago

Another alternative would be specifying some condition that is polled until it is true, and until then further execution is blocked. But for "wait for command to finish" this is a bit of a complicated way, and might not always be possible (or at least easily).

normanr commented 1 year ago

Some ideas:

jacobtomlinson commented 1 year ago

Can't this be tracked via the process tree? When Enter is executed a new child process will be created, VCR could just poll until that process has finished.

Airblader commented 1 year ago

Would that also work for things like command &?

jacobtomlinson commented 1 year ago

Would you want it to? Adding the & usually means you want to continue without waiting.

Airblader commented 1 year ago

Exactly, but the process would still exist, so wouldn't VHS just keep blocking?

jacobtomlinson commented 1 year ago

Only if the user calls Wait.

I'm suggesting syntax like Type@100ms "./file" Sleep 0.5 Enter Wait that @maaslalani proposed. If you don't want to wait you just don't add the Wait.

Airblader commented 1 year ago

Ah, I see; you would still make it an explicit Wait action. Which of course makes sense, since there are blocking processes where you definitely do want to continue doing things (like opening vim).

dominiklohmann commented 1 year ago

I played around with VHS a bit trying to automatically generate small demo videos in CI for our documentation site that are always up to date. Without a way to wait for commands to finish that is not possible, as sleeping for a given time interval is not reliable enough when you try to create a video automatically. So :+1: from my side for this feature request.

kotborealis commented 1 year ago

I've tried to implement this feature in my implementation using PROMPT_COMMAND env variable to send signal from shell back to the controller, and it works nicely. Maybe it could be used in VHS too?

mastercactapus commented 1 year ago

I think the expect approach @normanr suggested would be the most universal.

It would solve things like:

The most significant difficulty I see is determining when to start waiting for the output. Waiting for a prompt would be intuitive, but it may not be evident that you'd need to expect all the previous "prompt> " lines to get where you want to; so maybe a ResetExpect or StartExpect command too?


I propose something like this:

Set ExpectTimeout 3s # timeout Expect after 3 seconds
Set ExpectTimeout 0  # disable Expect timeout (default)

StartExpect # Signals the start of where we should look for the output for Expect

Expect[@timeout] <string> # wait for a string in output since the most recent StartExpect or Expect (or beginning of tape)

Care should be taken so that Expect is interruptable (i.e., Ctrl+C) in case things go wrong :D.

Airblader commented 1 year ago

I would throw in that it isn't entirely universal to wait for some output. You might well want to wait for a process to exit without output or for some other state to have been reached. The latter can be considered out of scope, but the former does seem important to me.

mastercactapus commented 1 year ago

@Airblader hmm, that's a good point. Though when working with a shell, you could wait for the prompt to re-appear. Since it is likely the most common case, maybe we can handle it with a default?

Something like:

Set ExpectPrompt ">" # set's the default expect string (usually the prompt string)

Expect[@timeout] [string]

Though a naked Expect doesn't read as well.

Set ExpectPrompt "$"

StartExpect
Type "sleep 1" Enter
Expect

Maybe it's okay, but here's some other ideas:

Set WaitForDefault "$"

StartWaitFor
Type "sleep 1" Enter
WaitFor
Set WaitOutputDefault "$"

StartWaitOutput
Type "sleep 1" Enter
WaitOutput

Type "echo hi" Enter
WaitOutput "hi"
till commented 1 year ago

IMHO: Most universal would be to wait for an exit code.

mastercactapus commented 1 year ago

I can't entirely agree that an exit code is universal since commands can be interactive. For example, SSH may prompt for a password without giving an exit code before the input is required. But an exit code would work for most use cases.

That said, it's also a little challenging to go the exit code route because vhs isn't running the commands; instead, it's typing keys to a browser, which gets funneled to ttyd, and records the rendered canvas image it gets back. In other words, it's emulating a keyboard and screen rather than being a shell interpreter.

Something like what @kotborealis suggested could work but may be fragile, as it would only work if the shell is running on the same machine and only if we are not making a tape for a command line tool that's interactive.


I've been experimenting with this all day, and I'm trying to settle on an API that makes sense. I currently have it as Prompt but I'm thinking of changing it to Match and MatchAny. As I've been testing it, there are just too many edge cases so I went with a regex approach.

Match would allow you to specify a regular expression to match against the current line of the terminal (e.g., ^> or ^password:) and MatchAny would match against any string on the screen, allowing you to work with interactive TUI applications.

mastercactapus commented 1 year ago

Going to roll with this for the day and see how it feels, feedback welcome: https://github.com/mastercactapus/vhs/tree/match

Example:

Output "out.gif"
Output "out.ascii"

Set Match "^>" # aleady the default

Type "sleep 1; echo hi" Enter
Match # waits for the current line to match
Match "^>" # same thing, but overriding the default

Type "echo done" Enter

Sleep 1
mavam commented 1 year ago

This is certainly a step forward, but quite brittle for minimalistic prompts.

Is there really no way to get the shell execution status hauled all the way through? It may need some drilling to establish a side channel, but tracking PID and exit code would open the door for really robust waiting.

That said, such an operation probably breaks the keyboard/screen abstraction, so it comes at a cost.

mastercactapus commented 1 year ago

@mavam, of course, anything is possible, but it needs some extra thought/planning. For instance, I don't think we can use Type as there's no great way (that I'm aware of) to know if/when a subprocess starts as the result of a key press, if it was even because of the typed value, that it's the one we want, if it already ran and exited, what the exit code was, etc...


Context: How VHS Works - starts `ttyd` - `ttyd` spawns a shell when a browser tab is opened - `vhs` opens a tab - `vhs` sends keyboard events to the browser - browser sends events to `ttyd` - `ttyd` sends events to the shell - shell sends output to `ttyd` - `ttyd` sends output to browser - browser renders output with `xterm.js` into a canvas - `vhs` records the canvas and pulls the text buffer via javascript (for `.ascii`/`.txt`) ![image](https://user-images.githubusercontent.com/595010/204407851-da997f3b-16da-4fdf-b185-382253ad36fe.png) `ttyd` is just a proxy between the browser tab and the shell interpreter, but it doesn't interpret or process the input. It only provides a TTY and executes one subprocess (the shell). The shell is where we need to be to watch for process signals, new processes, etc...

I think there are two similar-but-separate problems people are here looking to solve:

  1. how to deal with complex interactions (e.g., prompts, dynamic delay) from a user-simulation perspective
  2. how to reliably handle and track subprocess execution (e.g., exit codes)

The first is straightforward (poll and wait for visible output to match) and is easy with the current architecture.

The second is a little more difficult since we'd need to hook into existing supported shells in a way that allows us to monitor and track subprocesses, and breaks the input/output abstraction.

To make it work reliably, a keyword like Run could be added that simulates a Type event but passes a command to be executed to completion.

It may be confusing that what is typed by Run executes in a different context than what Type does, but it could be documented as part of the verb (or maybe call it RunLocal or RunSh).

In any case, we would still need to build some out-of-band way to signal from the shell to vhs in a cross-platform way, ideally through ttyd AND xterm.js to do it solidly, either by updating each of those tools, or being creative.

IMHO, (and it is not my call, only my opinion) I feel like it may be beyond the scope of this tool, so I'm not sure if it's worth the complexity. However, I also have a biased opinion because polling the output has worked very well for my use cases.

If there is a subprocess/out-of-band-metadata feature added, though, I vote we call the feature closed-captioning. We could probably embed it in ANSI escape sequences.


I am curious, are there some examples that would be broken or impossible with only reading the output of a terminal?

mavam commented 1 year ago

Thanks for elaborating, this was (at least for me) the missing background information to have the conversation about the options.

Deeper shell integration sounds still great, but I understand it comes a significant cost of development. Having control about process spawning, returning, stdout/stderr, etc. would certainly round of VHS as a go-to solution for all declarative demo'ing, even making that part of CI and docs.

In the meantime, visual match hooks suit the bill. Fortunately it's possible to embed sequence of UTF-8 characters in prompts that are reasonable unique, and coupled with detecting them at the beginning of a line, this probably solves 90% of the issues.

lixvbnet commented 1 year ago

Going to roll with this for the day and see how it feels, feedback welcome:

https://github.com/mastercactapus/vhs/tree/match

Example:


Output "out.gif"

Output "out.ascii"

Set Match "^>" # aleady the default

Type "sleep 1; echo hi" Enter

Match # waits for the current line to match

Match "^>" # same thing, but overriding the default

Type "echo done" Enter

Sleep 1

Hi, I think this is a good approach. This is very close to SSH I/O automation, "Read Until", "Read Until Prompt", "Read Until Regexp" (link)

mastercactapus commented 1 year ago

I tried my proposed API for a while, and the Match/MatchAny combo IMO loses readability over time. Also, setting the Match expression saves on some typing but didn't make the tape file very readable (especially when coming back to a project after a while), so I removed the setting and made the command names more explicit:

MatchLine "regexp" # waits until the current line matches the regex
MatchScreen "regexp" # waits until the current screen matches the regex

I updated my fork with it and will try it with some projects and see if it feels like a good API before opening a PR.

ghost commented 1 year ago

@mastercactapus Just curious how your experience has been? I think this is the perfect solution for most of my use cases.

mastercactapus commented 1 year ago

@luke-hagar-sp It's worked really well; I just updated my branch with the latest to open a PR and will take it from there.