Open bar-g opened 9 months ago
Maybe the stand-alone semicolon (;) could even be made unnecessary, by parsing/filtering out all named args first, no matter where they are positioned?
The remaining args are then words, typed, expressions or blocks, all in their own order.
That could even allow for a more common command argument ordering and flexibility, for example:
myproc [count > threshold] word1 (->y) word2 (->z)
Spreads could be passed like this:
myproc word (positional_list;) # as positional spreads are also
# defined before the semicolon
myproc word (;named_dict) # named spread, as defined (last) after semicolon
myproc word (;named_dict) (positional_list;) # also both in arbitrary order
So all arg types can be used together, ordered as it is most practical, with named args addable anywhere:
myproc (debug=true) word (positional_list;) (->out) (;config_dict_from_hay)
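The "filter out named args first" idea can be approximated in plain POSIX shell, as a rough sketch only. It assumes named args are written as bare name=value words, since the parenthesized (name=value) form proposed here is not valid shell; the function name partition_args is hypothetical.

```shell
# Hypothetical sketch: separate name=value arguments from positional words,
# independent of where the named arguments appear on the command line.
partition_args() {
  # First pass: pull out every name=value argument, wherever it is.
  for arg in "$@"; do
    case $arg in
      [A-Za-z_]*=*) printf 'named %s\n' "$arg" ;;
    esac
  done
  # Second pass: what remains are positional words, in their original order.
  for arg in "$@"; do
    case $arg in
      [A-Za-z_]*=*) ;;
      *) printf 'word %s\n' "$arg" ;;
    esac
  done
}

partition_args debug=true word1 out=y word2
```

The named arguments are reported first and the remaining words keep their relative order, which is the property the proposal relies on.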
Hm, also just like with the named-args (...=...), multiple block-literals {...} could also be identified and filtered out in a first step.
Which could allow passing multiple block-literals {...} on the command line, allow keyword/command binding to blocks from either the left or the right side, and also make use of -- to reliably distinguish options.
And finally, some small further additions to this could even turn the default option parsing into a trivial hit of a jackpot:
EDIT:
myproc (name1=expr) (name2='string') (flag=true) -- # named-args
myproc --name1=(expr) --name2='string' --flag -- # simple synonyms
I don't see what the motivation for this is -- why does it compose better than the existing syntax?
Also (;) is a bit ugly
What? Maybe think a bit about myproc "$@".
The (;) was dealt with in the first comments topic.
Just noticed the general case for myproc --name1=(expr) would need to be myproc --name1="( expr )" (double quoted), to allow spaces in the expression within the word.
For example, arguments can't be passed on. That seems impossible with the single typed vector, which neither supports nor extends the standard command-line format. (But word-based typed arguments, as proposed in this issue, could be passed on and even parsed directly like flags and options, mapping to typed named options in ysh.)
ARGV compatibility and composability:
p() { echo p: \$@="$@"; q "$@"; }
proc q { echo q: \@ARGV= @ARGV; = ARGV; test -z ${has+run} || { r @ARGV; return 0 }; setglobal has = 'run'; p @ARGV }
proc r { echo r: \@ARGV= @ARGV; = ARGV }
echo
echo q 1 2 3
q 1 2 3
unset has
echo
echo "q ('1', '2', '3')"
q ('1', '2', '3')
echo
q 1 2 3
q: @ARGV= 1 2 3
(List 0x7efe1d446208) ["1","2","3"]
p: $@=1 2 3
q: @ARGV= 1 2 3
(List 0x7efe1d446d88) ["1","2","3"]
r: @ARGV= 1 2 3
(List 0x7efe1d457508) ["1","2","3"]
q ('1', '2', '3')
q: @ARGV=
(List 0x7efe1d4c3ac8) []
p: $@=
q: @ARGV=
(List 0x7efe1d457848) []
r: @ARGV=
(List 0x7efe1d457848) []
ysh ysh-0.20.0$
For the second run, it could be something like:
q ('1', '2', '3') # also q ('1') ('2') ('3')
q: @ARGV=('1') ('2') ('3') # or collapsed to '1' '2' '3' for strings?
(List 0x7efe1d4c3ac8) [('1'), ('2'), ('3')] # or collapsed to ['1', '2', '3'] for strings?
p: $@=('1') ('2') ('3') # or collapsed to 1 2 3 for strings?
q: @ARGV=('1') ('2') ('3') # or collapsed to '1' '2' '3' for strings?
(List 0x7efe1d457848) [('1'), ('2'), ('3')] # or collapsed to ['1', '2', '3'] for strings?
r: @ARGV=('1') ('2') ('3') # or collapsed to '1' '2' '3' for strings?
(List 0x7efe1d457848) [('1'), ('2'), ('3')] # or collapsed to ['1', '2', '3'] for strings?
Updated the example, showing @ARGV vs. = ARGV behavior.
The intention is that you should be able to splat like
p @words (...typed; ...named; block)
However I found a bug related to that while testing just now
I think the problem there is that I wanted cd (myblock) and not cd (; ; myblock), which looks kinda ugly
But then I introduced a rule that doesn't compose
So there is something to fix here
Note that "$@" is deprecated in YSH
The replacement is @ARGV (which might change slightly)
However @strs means "stringify and splice" right now. It is part of the language of strings
The syntax myproc (...typed) is for typed args. It is the typed equivalent of @strs, and must appear within parens
In other words, @strs is part of the shell command/word language, while (...typed) is part of the Python-like expression language
p @words (...typed; ...named; block)
But these only iterate over specific argument types.
Whereas in the command interface language "$@" iterates over all flags, options, and arguments, i.e. over all words, which together may also be called the argument vector.
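To make the point about "$@" concrete, here is a small POSIX shell sketch. The function names are made up for illustration. It shows that quoted "$@" forwards every word (flags, options, and arguments alike) unchanged, while an unquoted $* re-splits them.

```shell
# Hypothetical helpers: count_words reports how many words it received.
count_words() { echo "$#"; }

# Forwarding with "$@" keeps each word intact, whatever kind of word it is.
forward_quoted() { count_words "$@"; }

# Forwarding with unquoted $* re-splits words on whitespace.
forward_unquoted() { count_words $*; }

forward_quoted 'a b' -x --opt=1    # prints 3
forward_unquoted 'a b' -x --opt=1  # prints 4
```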
I notice that you didn't use @ARGV in the example above, and I think that is correct and important.
I'd say "$@" and @ARGV are on the shell/command language side of things.
And two important properties of the shell command language are:
It's nice if "programming-style" syntax can also be used in ysh. However, as ysh still is a shell on the top-level, I think it should also keep the typical command language properties and extend that to allow typed args. So, not make the shell prompt feel (too, much, 'like', working, 'with a', "programming $language").
A short example might be:
proc p ( input1, input2 ; ->out1, ->out2)
p $a $b ( ->lane1, ->lane2 ) # "programming-style" well, quite sub-optimal for the general case
p $a (->lane1) $b (->lane2) # "shell-command-style" (synonym)
# * an `@ARGV` with arbitrary type-intermingled permutation is possible
# * while parsed args are mapped into corresponding per-type "positionals" (typed arg lists).
# * named-args may be added in arbitrary order and any position, e.g. (debug=true)
# * all leading named-args up to '--' may even be defined using known flag and opt syntax
Possible wording?
# positional arg lists
@ARGV # naming (strings) of all args, incl. the below (also -x --flags and --options already parsed into NAMEDV )
WORDV
TYPEDV # individually passed plus splat (but not those going into the type-specific positionals)
NAMEDV # individually passed plus splat
BLOCKV # passed as typed plus literals, or separate those?
From the examples in this issue it seems it's possible to have a really good mapping between "$@" / @ARGV and the type specific proc/func positional arg lists. (And even to nicely collapse string-typed args back into shell strings, e.g. to call external commands.)
And I think this solution would also work naturally for
The idea in short:
"$@".)
@ARGV, which may be parsed from and @WORDV, TYPEDV, NAMEDV, and BLOCKV.)
As mentioned on the PR, I understand why this is an appealing idea, and other people have tried it before
But I don't think it's a good idea in general. You always need a little bit of code to bridge the gap.
In particular, this style will lead to bad error messages. Users should be writing their own error messages for CLIs, not relying on YSH/Oils
Designing a CLI takes a little bit of effort -- it's not something you can do just by writing a signature in YSH
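As an illustration of "a little bit of code to bridge the gap", here is a minimal hand-written flag parser in POSIX shell with its own error messages. The command greet and its options --name and --shout are invented for this sketch and are not part of YSH or Oils.

```shell
# Hypothetical CLI: greet [--shout] --name=NAME
greet() {
  name='' shout=false
  while [ "$#" -gt 0 ]; do
    case $1 in
      --name=*) name=${1#--name=} ;;
      --shout)  shout=true ;;
      --)       shift; break ;;          # explicit end of options
      -*)       echo "greet: unknown option '$1'" >&2; return 2 ;;
      *)        break ;;                 # first non-option word
    esac
    shift
  done
  # The CLI author decides what counts as an error and how to phrase it.
  if [ -z "$name" ]; then
    echo "greet: missing required option --name=NAME" >&2
    return 2
  fi
  msg="hello, $name"
  if [ "$shout" = true ]; then
    msg=$(printf '%s' "$msg" | tr '[:lower:]' '[:upper:]')
  fi
  echo "$msg"
}

greet --shout --name=ysh
```

The error wording and the exit status are chosen by the script author here, which is the part that a signature-driven parser could not easily supply.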
I agree, about designing real CLIs needing refinements etc.
The idea in this direction so far was to only do the default flag and opt parsing for internal proc calls and when using the runproc style for quick scripting.
But now that you mention possible drawbacks, maybe it's a good idea to create a separate dict for the auto-parsed flags and opts, so they can never get in the way in the named-args dict.
Note, I think that auto-parsed dicts are only one half of flag/opts handling, the other half is the custom way of iterating over them, and that is also the code which I would expect would be able to generate the best custom error messages if flags/opts are missing/wrong or inconsistent.
I tested how ysh composes by implementing the code from the blog post https://www.oilshell.org/blog/2017/01/13.html and this style still appears to go "against the grain" in ysh.
Try it out from: https://gist.github.com/bar-g/e9e8e19f9368bf02a0f92cc5752be435
What do you think about @ARGV needing to contain the entire command line and allowing multiple typed args in separate parens?
@PossiblyAShrub See, if capable flag/opt parsing code is included, I'd hope it could also be put to good use for simplified scripting as an out-of-the-box ysh feature.
Thanks for writing the Forth compose tests
But I think this is a fundamental interior-exterior problem, and it's not possible to solve automatically or in general
If you want exterior composition, then you use flags and args. And manually write any conversion to typed data
If you want interior typed data, then you can just pass it around to funcs AND procs, without parsing
I made a distinction between procs as exterior and funcs as interior here
https://www.oilshell.org/release/0.19.0/doc/proc-func.html#at-a-glance
Perhaps we should also elaborate that procs can also take typed args, but those typed args are interior
i.e. there is no "auto-parsing"
If you want to use the Forth-like style, then you use strings / words only. Because we can't change the kernel interface -- we can't change char *argv[] and sys.argv and so forth.
Python and C will never accept typed args -- you always have to deserialize from strings.
I think that is the killer argument -- anything we do in YSH is not going to affect Python or C.
It's not actually the number 1 goal to make doing everything in YSH as convenient as possible. (Though there is some of that, I just got some feedback on using Hay from YSH in the interior style on Zulip)
The shell still remains for polyglot composition. And other languages only have string argv.
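The string-only boundary can be seen with a child process. In this sketch, which is an assumption-laden illustration rather than anything from Oils, the variable child_script and the function increment_in_child are hypothetical names. The child receives its argument as a string and has to validate it before doing arithmetic.

```shell
# Script run by the child process. "$1" arrives as a plain string,
# so the child must check it before treating it as a number.
child_script='
  case $1 in
    ""|*[!0-9]*) echo "expected an integer, got: $1" >&2; exit 2 ;;
  esac
  echo $(( $1 + 1 ))
'

# Hypothetical wrapper: cross the process boundary with one argument.
increment_in_child() {
  sh -c "$child_script" child "$1"
}

increment_in_child 15   # prints 16
```

Passing something like '{"a": 5}' instead reaches the child as the same kind of string, and it is rejected by the child's own check rather than by any type system.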
To be honest, it might even make some sense to have three keywords, like this
func f (typed1, typed) {
}
exterior-proc myproc (string1, string2) {
}
interior-proc myproc (string1, string2; typed1, typed2) {
}
So that would emphasize that when you use typed args, you're limiting yourself to the interior. That may not be obvious to users.
So it could be
Naming idea - is proc vs. typed proc
proc p (str1, str2, ...rest) {
}
proc p (str1; typed1) { # ILLEGAL because it's not a typed proc
}
typed proc p (str1; typed1) { # now it's OK
}
This is basically for learning/teaching, so we can say:
Then we don't need any caveats
It's nice to have a separation of interior and exterior. Right now proc is a mix of both
I made a note about typed proc vs. proc here - https://oilshell.zulipchat.com/#narrow/stream/384942-language-design/topic/typed.20proc.20vs.2E.20proc.3F
Not sure if we will do that, but it is one simple way to make things more explainable, make the interior/exterior distinction clear
That is probably one of the most important concepts in the language, and in shell programming
I think this can also clarify our advice
Right now our advice is - https://www.oilshell.org/release/0.19.0/doc/proc-func.html#tip-start-simple
You can start with just a list of plain commands:
Then copy those into procs as the script gets bigger:
Then add funcs if you need pure computation:
I think our advice can be
Typed procs are actually not really for users!!! You can do everything you want with JSON/J8 and plain procs.
JSON is Dicts and Lists that you copy -- you very rarely need mutable dicts in shell-style programming.
The reason we want typed procs is to implement the 16 use cases
However that is more of an Oils dev thing than an end user thing.
Once we have settled on the metaprogramming techniques that can implement those use cases, then users can also use them
The simplest thing is that we wanted cd /tmp { echo $PWD }, and users can now implement that too
Also
Hmm I think this is pretty good ...
Ok, now I found your typed proc proposal.
I'll need some time to read up deeper.
But just from glancing, I think I get interior vs. exterior, but not why extra syntax is needed here. I don't see a problem it solves, because there are clear errors when trying to call external commands with typed args. And I don't see why the exterior should limit interior composing, or only allow a worse form of it.
Piping and JSON passing is an important type of shell-style programming, but just one. Good uses for Forth-like composing are, I think, things like the repeat, timeout, or debug functions (the latter is an internally broken shell function in the tests), and uses in modules for trivially passing (internal) commands on to sub-modules.
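For reference, the Forth-like composing mentioned here looks roughly like this with string args. This is a sketch with made-up names (repeat_n, debug_run), not the functions from the Oils tests, and it uses local, which is widely supported but not strictly POSIX.

```shell
# repeat_n N CMD [ARGS...]: run CMD N times, passing ARGS through unchanged.
repeat_n() {
  local n i
  n=$1
  shift
  i=0
  while [ "$i" -lt "$n" ]; do
    "$@"
    i=$((i + 1))
  done
}

# debug_run CMD [ARGS...]: log the command line to stderr, then run it.
debug_run() {
  echo "+ $*" >&2
  "$@"
}

# Wrappers stack freely because each one only forwards "$@".
repeat_n 2 debug_run echo 'a  b'
```

Each wrapper consumes its own leading words and hands the rest on, so any command can be prefixed by any number of them. The open question in this thread is how typed args could be forwarded in the same way.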
Let's see if I can find a sensible way to reduce the case to this internal composing of things rather than parsing.
Let me get this a bit straight. It feels like various topics all at once. What we're talking about:
myproc (opt=15) vs. nontyped myproc 15 -> relevant in that nontyped procs already have ARGV. Is the proposal only about typed args?
myproc --opt=one --flag and myproc (opt="one", flag=false), which inherently requires typed arguments as well!
myproc (opt={a: 5}) and myproc --opt="{a: 5 }" :D
proc foo (;bar: dict = {})
Yeah there are a bunch of topics in here, it might be better to start a new thread
The only thing I'm proposing is that there be 2 different "worlds"
proc, with string args. Goes with flag parsing.
typed proc, with typed args. No flag parsing!
There is no automatic conversion or serialization. CLIs take some effort to design; it can't be done automatically by YSH
You have to write help, and write good error messages yourself
So then the plain procs still compose in the Forth-like style.
The typed procs are more like Python, with mytyped @strs (...typed; ...named)
While I didn't understand all of this proposal, it feels like Perl-ish "magic". It doesn't feel like YSH
There are going to be a lot of corner cases; it would be a huge rewrite; and I don't like the names :-/
typed proc is a very tiny tweak to what we have
Though note there is still a bug with blocks
The idea in short:
(The command is provided as, and processed from: "$@".)
@ARGV, which contains (names) differently typed args.
(Internally, the calling command is also provided as already interpreted and type-sorted arg lists: @WORDV, TYPEDV, NAMEDV, and BLOCKV.) [EDIT: maybe better to have a separate dict for flags/opts, instead of adding to NAMEDV]
This allows for immediate action, e.g.
Explanation:
If the command mode (string based) could accept referencing multiple typed args in separate parens and brackets, there could simply be
Together with the CLI flags and options these arguments could be mapped automatically into positional arg lists.
I think there may be three different general ways to organize positional arg lists, and all have their merits and valid use cases:
So args are readily available within the called proc, to be consumed (operated on), adjusted and possibly passed on to other procs.
With this, procs can access, iterate over, and pass-on, typed args with the same ease as string args. (Actually, much better because NAMEDV [EDIT: or better a separate dict?] already contains the parsed CLI flags and options.)
Extra credit: CLI flags and options map very well to typed named-args (NAMEDV [EDIT: separate OPTSV?]):
All this automatically, by default (based on the ysh proc definitions, without requiring any customized argparse specification).
How it evolved:
In next comments: Get rid of the unnecessary semicolon arg (;) etc.
PS: I was originally wondering about what you meant when you wrote about having "Lazy Arg Lists", and found in the doc that it's actually (much better) named (lazy-)expr-arg (https://www.oilshell.org/release/0.20.0/doc/ref/chap-cmd-lang.html#YSH-Simple)
After reading it, though, I felt there may even be much more potential to simplify ysh-simple, and to improve on the ability to compose.