wryun / es-shell

es: a shell with higher-order functions
http://wryun.github.io/es-shell/

Add `-l` flag to `$&pipe` to execute last command of the pipeline in the local process #109

Open jpco opened 3 weeks ago

jpco commented 3 weeks ago

This is a new version of the proposal in #55 meant to be more amenable to personal preference.

This PR adds an optional -l flag to the $&pipe primitive which causes it to run the last command in the pipeline in the local shell process rather than a subprocess.

The utility of this is to enable pipelines like:

; fn-%pipe = $&pipe -l
; grep '^MemTotal' < /proc/meminfo | sed 's/.*:\s\+\(.*\) .*/\1/' | total-mem = <=%read
; echo $total-mem
32440180

(This minimal example could easily be rewritten as total-mem = `{grep ...}, but more complex examples exist that would be much more difficult to rewrite like that.)
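For readers less familiar with why the process boundary matters here, the following is a rough Python sketch of the idea, not es code and not how $&pipe is implemented. Only the producer is forked; the consuming "last command" runs in the calling process, so the value it reads is still there after the pipeline finishes. The function name `run_with_last_stage_local` and the use of `echo` as the producer are assumptions made for the illustration.

```python
import os


def run_with_last_stage_local(producer_argv):
    """Fork only the producer; run the consuming last stage in this process.

    Returns (first line read from the pipe, producer exit status).
    Hypothetical helper for illustration; not part of es.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: make the pipe's write end stdout, then exec the producer.
        os.close(read_fd)
        os.dup2(write_fd, 1)
        os.close(write_fd)
        try:
            os.execvp(producer_argv[0], producer_argv)
        finally:
            # Only reached if exec failed.
            os._exit(127)

    # Parent: this plays the role of the local last command (like <=%read).
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe_in:
        first_line = pipe_in.readline().rstrip("\n")
    _, status = os.waitpid(pid, 0)
    exit_status = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -os.WTERMSIG(status)
    return first_line, exit_status


if __name__ == "__main__":
    total_mem, status = run_with_last_stage_local(["echo", "32440180"])
    # The assignment happened in this process, so the value is still visible.
    print(total_mem, status)
```

Had the reading stage been forked as well, the assignment would have happened in a child process and been lost when that child exited, which is the situation the -l flag is meant to avoid.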

Unlike #55, this is an opt-in change: you have to set fn-%pipe = $&pipe -l yourself. Unless enabled, it doesn't change the behavior of any already-working %pipe invocation, because the rigid structure of $&pipe arguments otherwise wouldn't accommodate the -l in that position:

# es without this PR
; $&pipe -l {ls -1} 1 0 {wc -l}
usage: pipe cmd [ outfd infd cmd ] ...

# es with this PR
; $&pipe -l {ls -1} 1 0 {wc -l}
121

I revisited this after trying to implement it in pure es. I got as far as this:

let (fn-%p = <={%whatis %pipe})
fn %pipe cmds {
    if {~ $#cmds (0 1)} {
        return <={$cmds}
    }

    let ((pipecmds out in localcmd) = ()) {
        # if I had reverse ranges from #82 then instead of this for loop I could do
        #   (localcmd in out pipecmds) = $cmds($#cmds ... 1)
        #   pipecmds = $pipecmds($#pipecmds ... 1)

        for ((cmd o i) = $cmds) {
            if {~ $i ()} {
                localcmd = $cmd
            } {
                (out in pipecmds) = ($o $i) (($pipecmds) $out $in ($cmd))
            }
        }

        # FIXME: make return values work correctly
        %readfrom _pipefd {%dup $out 1 {%p $pipecmds}} {%open $in $_pipefd {$localcmd}}
    }
}

but I'm much less sure of the fiddly details of this version in terms of signal delivery and the like. This version also returns only the return value of the final command in the pipeline, rather than all of them. I'm not sure how to fix that without resorting to a modified version of either the /tmp-file or fifo version of %readfrom within %pipe, which starts to feel like an excessive workaround for such a core feature of a shell. I suppose an alternative could be to modify %readfrom to make the return value(s) of its input command accessible somehow.

jpco commented 3 weeks ago

(My version of %pipe here actually also doesn't do the right thing with the %dup -- for example, try it with echo foo |[2=0] lines = `` \n {cat} -- but I'm assuming I just got something wrong.)

The more I puzzle on this, the more I'm led to think two things: First, doing this via (a better, working version of) my version of %pipe up above is ideal. Second, a fully-functional version of %pipe improving what I have above is also impossible with the current set of primitives in es.

The main problem is that any halfway compatible version of %pipe in es must, in my opinion, meet three requirements:

  1. All commands must run in parallel, so that for example sleep 1 | sleep 1 | sleep 1 | sleep 1 must take about one second,
  2. All commands' exit statuses must be included, so that for example result 2 | result 1 | result 0 | result 3 must return 2 1 0 3, and
  3. All commands' communication should be over pipes or fifos, so that things like SIGPIPE and blocking work correctly.

I don't think the first two requirements are actually possible to combine in es script, given the limited parallelism of shells. You basically have to use a fork(2) to get parallel execution, which means you're going to lose the list of exit statuses, as the subshell will collapse them into a single exit status for the whole pipeline (and, in some cases like my %pipe above, the pipeline's aggregate exit status will even be thrown out!)
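As a rough illustration of that tension, here is a Python sketch that uses sh as a stand-in; it is an analogy, not es code. When one parent process starts every stage itself, it can wait on each child and keep every exit status while the stages run in parallel. When the pipeline is handed to a subshell, only a single status comes back. The helper names `pipeline_statuses` and `subshell_status` are hypothetical.

```python
import subprocess
import time


def pipeline_statuses(stages):
    """Start every stage at once, chained by pipes; return each exit status."""
    procs = []
    prev_stdout = None
    for i, argv in enumerate(stages):
        last = i == len(stages) - 1
        proc = subprocess.Popen(
            argv,
            stdin=prev_stdout,
            stdout=None if last else subprocess.PIPE,
        )
        if prev_stdout is not None:
            # Drop the parent's copy so only the reading stage holds the read end.
            prev_stdout.close()
        prev_stdout = proc.stdout
        procs.append(proc)
    return [p.wait() for p in procs]


def subshell_status(script):
    """Run a whole pipeline inside one sh subshell; only one status survives."""
    return subprocess.run(["sh", "-c", script]).returncode


if __name__ == "__main__":
    stages = [["sh", "-c", "exit %d" % n] for n in (2, 1, 0, 3)]
    print(pipeline_statuses(stages))  # one status per stage

    print(subshell_status("(exit 2) | (exit 1) | (exit 0) | (exit 3)"))  # a single status

    start = time.monotonic()
    pipeline_statuses([["sleep", "1"]] * 4)
    print("elapsed: %.1fs" % (time.monotonic() - start))
```

The first helper only works because the process that wants the statuses is also the direct parent of every stage, which is the position a primitive like $&pipe is in and a forked subshell is not.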

So, given all that, I see three ways forward:

  1. Some general mechanism to fix the problem I describe here, which requires more cleverness than I have at the moment -- something in the direction of coprocs?
  2. Primitive support like $&pipe -l.
  3. Don't do this at all and leave me in the dust :( I do think lastpipe is useful and common for a lot of users, though, so it seems worth dedicating time to making it possible.