elves / elvish

Powerful scripting language & versatile interactive shell
https://elv.sh/
BSD 2-Clause "Simplified" License

When value-consuming functions in a pipeline terminate, value-producers do not stop producing output #951

Open zakukai opened 4 years ago

zakukai commented 4 years ago

It is somewhat common in shell programming to combine value producers that have potentially unbounded output with consumers that process only a limited amount of that input, according to some internal criteria, and then stop processing (or even reading) additional input. For instance:

find ~ | grep "$pattern" | head -n 15   # Find first 15 instances of files that meet criteria
primes 2 | head -n 20                   # A bit more of an artificial case: get the first 20 primes

Once the consumer on the right-hand side has got everything it needs, it closes its input, causing the preceding item in the pipeline to terminate with SIGPIPE next time it tries to write data, or otherwise detect that its output pipe is no longer valid and stop producing values.
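The mechanism described above can be reproduced in a POSIX-style shell with a minimal sketch. The function names `count_up` and `first_three` are hypothetical and exist only for this illustration; they are not part of Elvish or of any tool mentioned in this thread.

```shell
# Hypothetical unbounded producer: prints 0, 1, 2, ... and never stops on its own.
count_up() {
  i=0
  while true; do
    # With default signal handling, writing to a pipe whose reader has exited
    # raises SIGPIPE and kills this stage. The `|| return` guard covers
    # environments where SIGPIPE is ignored and the write fails instead.
    echo "$i" || return
    i=$((i + 1))
  done
}

# head reads three lines and exits, which closes the read end of the pipe.
# The producer's next write fails and the whole pipeline finishes.
first_three() {
  count_up | head -n 3
}

first_three
```

In bash, running `echo "${PIPESTATUS[*]}"` immediately after such a pipeline typically reports 141 (128 + 13, the number of SIGPIPE) for the producer stage, assuming SIGPIPE has its default disposition.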

This is somewhat equivalent to Haskell's lazy list evaluation:

take 10 primes

The "primes" function would produce (on demand) an infinite list, but "take 10" would force evaluation of only the first 10 values. The pipeline paradigm is similar in behavior (but more "greedy"): a producer may produce more than is needed, but ultimately it will terminate when the pipeline is broken.

Elvish does not behave this way: it seems that elvish functions do not produce a pipe break when they terminate, or respond to a pipe break on the byte stream or the value stream when the consumer in the pipeline terminates:

range 100000000000 | take 2
(Not only doesn't terminate quickly, but I can't seem to CTRL-C out of it)

e:primes 2 | take 2
(Yields the 2 values but hangs as "primes" continues running)

while $true { echo $x; x=(+ 1 $x) } | take 2
(Yields the 2 values but hangs)

while $true { echo $x; x=(+ 1 $x) } | e:head -n 2
("head" consumes 2 values and terminates, but the loop continues)

Personally, I see this as a critical part of using shell-style pipelines as a programming paradigm. Obviously the hanging isn't desirable, but beyond that, the SIGPIPE behavior provides a crude, non-optimized way to limit how much output a producer generates. As such, it's an important piece of the pipeline paradigm.

krader1961 commented 4 years ago

In the case of "take", the behavior you're seeing is because the current implementation consumes the entire input, although the documentation isn't clear on that point. Issue #923 is an open proposal to change that behavior. So that's basically "working as intended" at this time.

The other example, involving an Elvish block on the pipe LHS and an external command on the RHS, should definitely be changed so the block terminates when the pipe is no longer writable. The devil is in the details. Given the design of Elvish, it should presumably involve the echo throwing an exception.

krader1961 commented 4 years ago

@zakukai, I think this issue is redundant in light of #923 and #952.

zakukai commented 4 years ago

I don't think it is, actually:

#952 is a separate question on how the shell should handle exit codes, particularly with respect to traditional shell programs that respond to SIGPIPE in the default way. (Do value streams even use pipes? I thought they were just passing data around in the shell's process memory.)

#923 addresses the issue with "take", but this isn't limited to "take". I have an example using "head", and if you like, here is an example using "each":

e:primes 2 | each [x]{ if ( > $x 31 ) { fail narf; }; put $x }

This is not limited to interactions with external, line-oriented tools either:

> fn fib [a b]{ put $a; fib $b (+ $a $b) }
> fib 0 1 | each [x]{ if ( > $x 31 ) { fail narf; }; put $x }
▶ 0
▶ 1
▶ (float64 1)
▶ (float64 2)
▶ (float64 3)
▶ (float64 5)
▶ (float64 8)
▶ (float64 13)
▶ (float64 21)