GrammaticalFramework / gf-core

Grammatical Framework core: compiler, shell & runtimes
https://www.grammaticalframework.org
Other
131 stars 35 forks source link

Add `-lines` option to `system_pipe`. #97

Closed WolframKahl closed 3 years ago

WolframKahl commented 3 years ago

Without that, the example in gf-shell-reference.html

   gt | l | ? wc

always returns a zero (!) line count:

G1> gt | l | ? wc
      0    1157   10861

63 msec
G1> gt | l | sp -command="wc"
      0    1157   10861

70 msec
G1> gt | l | sp -lines -command="wc"
    294    1157   10862
WolframKahl commented 3 years ago

My motivation for this change is a long gf-shell pipeline where system_pipe is used to deal with minor lexing, unlexing, and treebank formatting issues via sed (for now), and all the surrounding commands operate line-wise:

rf -lines -file=...  | ps -lines -lexcode | sp -lines -command="sed '...'"  | p -lang=... | l -unlexer="...=unlexcode" -tabtreebank | sp -lines -command="sed '...'" | wf -file=...

Since UN*X commands like sed also operate line-wise, I was quite surprised that this did not work with the original version of system_pipe...

inariksit commented 3 years ago

It seems to be the l step that messes up the line breaks. This is what I get:

Languages: LangEng
Lang> gr -number=10 | ? wc 
       9     107     999

Lang> gr -number=10 | l | ? wc 
       0      81     413

Any idea @aarneranta @krangelov @johnjcamilleri what this is about?

johnjcamilleri commented 3 years ago
Lang> gr -number=10 | ? wc 
       9     107     999

Hmm, but that 9 should really be a 10, shouldn't it? Another bad example:

Foods> gr | ? wc -l
       0

@WolframKahl's change doesn't completely fix this either:

Foods> gr -number=1 | sp -lines -command="wc -l"
       1

Foods> gr -number=2 | sp -lines -command="wc -l"
       3
johnjcamilleri commented 3 years ago
  1. I imagine this is just something to do with trailing newlines
  2. Should @WolframKahl's addition be the default behaviour? Having to type sp -lines -command=... is a lot more tedious than just ?
WolframKahl commented 3 years ago
2. Should @WolframKahl's addition be the default behaviour?

I'd be in favour of that (it's what I would have expected), but I don't know where the current behaviour may actually be relied upon. And an -unlines option would be “the odd one out”...

inariksit commented 3 years ago

I vote for default behaviour too. I don't think this is anything people rely on -- whenever I pipe stuff in GF shell, I've just included tr ' ' '\n' there for as long as I remember.

johnjcamilleri commented 3 years ago

Agreed. In particular even the help message for sp contains an example that doesn't work properly:

> h sp
sp, system_pipe
send argument to a system command

syntax:
  sp -command="SYSTEMCOMMAND", alt. ? SYSTEMCOMMAND

flags:
 -command   the system command applied to the argument

examples:
  gt | l | ? wc --generate trees, linearize, and count words

So we can make this behaviour the default. But I think this line count issue (mentioned above) should also be fixed before we merge.

inariksit commented 3 years ago

FYI: @1Regina and @Meowyam are working on making this feature default + fixing the remaining issues with line breaks.

inariksit commented 3 years ago

Fixed via #121 — no flag needed, line separation is now the default option.