dspinellis / dgsh

Shell supporting pipelines to and from multiple processes
http://www.spinellis.gr/sw/dgsh/

Timeout for negotiation when run interactively from dgsh #103

Closed (trantor closed 5 years ago)

trantor commented 6 years ago

Yes, not a great description, I realize.

Example below.

bash $ cat match1.dgsh
MITTENTE=$1
shift

tee|
 {{
         grep -F 'info: header Subject: ' |
         grep -P ' from=<\Q'"${MITTENTE}"'\E>' |
         grep -oP '[[:xdigit:]]+(?=: info: header Subject:)' |
         sed 's/^/^/;'
}} |
cat
bash $ cat /tmp/testfile
aaaa
bash $ cat /tmp/testfile | dgsh -x match1.dgsh  brrr@gmail.com
+ MITTENTE=brrr@gmail.com
+ shift
+ tee
+ cat
+ wait
+ grep -oP '[[:xdigit:]]+(?=: info: header Subject:)'
+ dgsh-conc -o 1
+ dgsh-wrap sed 's/^/^/;'
+ dgsh-conc -i 1
+ grep -P ' from=<\Qbrrr@gmail.com\E>'
+ grep -F 'info: header Subject: '
bash $ dgsh
dgsh $ cat /tmp/testfile | dgsh -x match1.dgsh  brrr@gmail.com
+ MITTENTE=brrr@gmail.com
+ shift
+ tee
+ dgsh-conc -o 1
+ cat
+ dgsh-conc -i 1
+ wait
+ grep -F 'info: header Subject: '
+ grep -P ' from=<\Q\E>'
+ grep -oP '[[:xdigit:]]+(?=: info: header Subject:)'
+ dgsh-wrap sed 's/^/^/;'
12416 dgsh: timeout for negotiation. Exit.
^CTerminated
dgsh $ dgsh -x match1.dgsh brrr@gmail.com < /tmp/testfile             
+ MITTENTE=brrr@gmail.com
+ shift
+ tee
+ dgsh-conc -o 1
+ grep -F 'info: header Subject: '
+ wait
+ cat
+ dgsh-conc -i 1
+ grep -P ' from=<\Q\E>'
+ dgsh-wrap sed 's/^/^/;'
+ grep -oP '[[:xdigit:]]+(?=: info: header Subject:)'
dgsh $

The code fragment is nonsensical as it stands. It's just a portion of a more complex script, which works fine when invoked from bash and hangs when run from inside an interactive dgsh shell. As always, I am available for further details. It looks like a problem with the pipeline. This time there was no residue of previous dgsh installations.

Run with a freshly built dgsh on Ubuntu 14.04.

dspinellis commented 6 years ago

Ouch! @mfragkoulis I think I know what's going on. When running within dgsh, wrapped commands are in the path and hell breaks loose. Correct?

mfragkoulis commented 6 years ago

@dspinellis This doesn't seem to be the problem this time, but we'd better check that there is no issue with wrapped commands when working inside dgsh. I'll come up with a proper example for this.

Regarding the issue at hand, dgsh forwards a pipe that connects the standard input of a dgsh script into the script's contents in order to form one negotiation graph. In the example that @trantor provided, the graph of commands that dgsh should form is the following.

cat /tmp/testfile |
tee |
{{
         grep -F 'info: header Subject: ' |
         grep -P ' from=<\Q'"${MITTENTE}"'\E>' |
         grep -oP '[[:xdigit:]]+(?=: info: header Subject:)' |
         sed 's/^/^/;'
}} |
cat

Without this arrangement, programs that combine inline commands with scripts wouldn't work. There would be multiple negotiation processes nested within one another. The nested ones, such as the negotiation among the commands in match1.dgsh, would have to wait for the outer ones to complete.

In the example at hand, the negotiation between cat /tmp/testfile | dgsh -x match1.dgsh brrr@gmail.com would happen first. At completion, cat /tmp/testfile would start its normal execution along with the unpacking of the match1.dgsh script. The negotiation among the commands of the script would happen then.

Leaving aside the dependencies that emerge between multiple negotiation processes, which run contrary to dgsh's rationale, this arrangement wouldn't work because cat /tmp/testfile would transmit data that is only garbage to the negotiation process among the commands of the match1.dgsh script and would break it.

The exact problem seems to be that dgsh naively connects the inherited pipe to the first command in the script, which is a mere assignment. I'll look into the specifics and hopefully post a fix.

@trantor If you can work around the assignment and shift, you can probably continue your work.

trantor commented 6 years ago

@mfragkoulis It's not really a problem for me, since I am not planning to run the pipeline from an interactive session of dgsh. To be honest I noticed by chance, but I thought it would be something useful to report. I didn't quite understand your explanation about the multiple negotiation processes, I'll have to admit.

A question for both of you. Here's a larger sample of my programme

#!/usr/local/bin/dgsh

MITTENTE=$1
shift

tee|
{{
    grep -F 'info: header Subject: ' |
    grep -P ' from=<\Q'"${MITTENTE}"'\E>' |
    grep -oP '[[:xdigit:]]+(?=: info: header Subject:)' |
    sed 's/^/^/;' &

    {{
        {{
            /opt/zimbra/bin/zmprov -l getAllDomains |
            sed 's/^/@/; s/$/#/;' &

            grep -oP ': [[:xdigit:]]+: to=<[^>]+' |
            sed -r 's/^: //; s/: to=</ /; s/$/#/;' |
            LC_ALL=C sort &
        }} |
        grep -F --matching-lines --file=- |
        sed "s/#$//;" &

        grep -oP ': [[:xdigit:]]+: message-id=<[^>]+' |
        sed -r 's/^: //; s/: message-id=</ /;' |
        LC_ALL=C sort &

    }} |
    LC_ALL=C join -j 1

}} |
grep -E --matching-lines --file=- |
sort

It goes on, but what comes later is beside the point and would only complicate things: it transforms the line data into a JSON structure, which is then transformed again, and so on.

The /opt/zimbra/bin/zmprov command is not meant to handle standard input here, just to output something, which makes it different from the other pipeline at the same level. The command itself could receive commands on its standard input, but that's not the intended objective here. It works as expected (perhaps because I provided the actual command to /opt/zimbra/bin/zmprov as an argument, so it didn't read anything from stdin, but I am uncertain about that), and I need it there to provide the pattern file input for the grep instance downstream.

What I don't know is how to have the command read its stdin from somewhere else (I tried something like < /dev/null but got an error), or how to tell dgsh that it shouldn't connect the command's standard input to the upstream flow. I'm not sure about my terminology, but I hope I got the meaning across.

dspinellis commented 6 years ago

To specify that a program takes no input you explicitly use dgsh-wrap -i 0 in front of it.

dspinellis commented 5 years ago

Consider trying MITTENTE=$1 </dev/null as a workaround. If this works, document it as a bug in the dgsh man page and close the issue.

mfragkoulis commented 5 years ago

MITTENTE=$1 </dev/null did not work, but I instructed bash to pipe the input of a dgsh script only to the first pipeline within the script, not to standalone commands like MITTENTE=$1. The same applies to output. This arrangement is satisfactory for most cases.
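
For comparison, here is a minimal sketch of how plain sh handles the same situation: an assignment and a shift do not read standard input, so the piped data reaches the first pipeline in the script. It runs under sh, not dgsh. The script body and the two log lines are hypothetical, simplified stand-ins for match1.dgsh and its input.

```shell
#!/bin/sh
# Plain-sh sketch (not dgsh). The script body is a hypothetical,
# simplified stand-in for match1.dgsh.
script='
MITTENTE=$1
shift
tee | grep -F "from=<$MITTENTE>" | cat
'

# Pipe two made-up log lines into the script, passing the sender as $1.
out=$(printf '%s\n' 'A1: from=<brrr@gmail.com>' 'B2: from=<other@example.com>' |
      sh -c "$script" match1 brrr@gmail.com)

echo "$out"
```

Only the line whose sender matches $1 is printed, which shows that the standalone commands left the inherited input untouched for the pipeline that follows them.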

dspinellis commented 5 years ago

Hmmm. In /bin/sh and bash you can run tar cf - . | ( cd /tmp; tar xvf -)

Why was dgsh working differently? Is it now working like this? Do we need to document something?

mfragkoulis commented 5 years ago

tar cf - . | ( cd /tmp; tar xvf -) has been running fine in dgsh since we wrapped various types of commands, e.g. subshells, that dgsh views as black boxes.
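
A self-contained variant of that command, as a sketch for plain sh: the subshell on the right runs cd first and its tar then reads the archive from the inherited pipe. The temporary directories and the file name are assumptions that stand in for . and /tmp so that the example cleans up after itself. It does not exercise dgsh.

```shell
#!/bin/sh
# Plain-sh sketch: copy a directory tree through a pipe into a subshell.
# The temporary directories are hypothetical stand-ins for . and /tmp.
src=$(mktemp -d)
dst=$(mktemp -d)
echo hello > "$src/file.txt"

# The cd inside the right-hand subshell does not consume the piped data;
# tar xf - is the first command there that reads standard input.
( cd "$src" && tar cf - . ) | ( cd "$dst" && tar xf - )

copied=$(cat "$dst/file.txt")
echo "$copied"

rm -rf "$src" "$dst"
```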

A more specific explanation of the input and output connections set up by dgsh: for the negotiation process to work, dgsh needs to set up input and output connections in a pipeline of commands, starting with the first command on the left. It may in fact turn out that some commands in the graph require no input or produce no output, but during the negotiation process those commands have both their input and output channels connected (unless they are located at the edges of the graph) in order to exchange messages.

I think that the present discussion is adequate documentation.

dspinellis commented 5 years ago

I'm worried about making dgsh more difficult to understand through special cases. If this behavior is special, shouldn't we document it in the man page? Could we perhaps connect even non-pipelined commands during the negotiation phase?

mfragkoulis commented 5 years ago

It was a bug, not a special case.