Open Mike111177 opened 4 years ago
These are just a couple of thoughts. Other potential things that could come from this is some of the functionality of yargs being possible in yargs parser, maybe in the future the idea of commands.
@Mike111177 I appreciate you taking the time to write this up đ
The goal of the yargs-parser library is to be a (relatively) lightweight library, that given process.argv
provides are parsed version, which a higher level library like yargs
or meow
(with both use the library) can build on top of.
I think if we were to extend too much on the API surface like this, we start to blur the line too much between what the goals are of yargs, vs., the goals of yargs-parser.
I think a good way to think of it might be the responsibilities of Node.js
, vs., v8
.
Hmm, I kinda regret not including #300 in this issue, because this is kind of a response to both. I'll be quoting stuff from there too, I don't know if it is possible to merge issues.
@bcoe I totally agree with keeping the functionality of yargs, and yargs-parser separate and we'll defined. My concern is that the current responsibilities of yargs-parser is already pretty muddy. It serves as two roles currently:
Because most shells differ in behavior from yargs-parser tokenizer this can cause unexpected results from the user.
In an ideal world yargs parser would be split into two packages, something like yargs-tokenizer, and yargs-interpreter. This would make it so the yargs package doesn't need to include the yargs-tokenizer package, making yargs more lightweight. It would only need to include yargs-interpreter, which would be what yargs-parser is sans tokenizer. Obviously this would break backwards compatibility. But you could also for legacy support have yargs parser just include both the interpreter and the tokenizer. You would have a dependency tree that looks like this (including meow):
I'm a proponent of decomposing libraries, but have definitely gotten some flack for yargs' footprint.
Honestly, in my opinion, it requires no extra effort to install a package with dependencies, especially if you are also the maintainer of that dependency. The only real argument I can think of against it is that maybe if you are using some preemptable cloud VM thing where small code footprint is critical for speed of download during initialization. And if that is the problem then why wouldn't you be webpacking your app anyway to minimize the size?
I think that with this method, you can both reduce the logic complexity and code footprint of yargs, as well as expand the functionality of yargs-parser, giving it a stricter conformance and improve user expectations.
Ultimately, in order to maximize the flexibility and user control while minimizing code footprint (and maintaining backwards compat) I think you need a 3 stage pipeline in which any component can be swapped out.
@Mike111177 I appreciate these detailed posts đ
To manage expectations, a complete overhaul of the yargs-parser
library is not work that this project has the cycles to take on right now (or to help someone take on):
Where I would rather start would to be to see if we could make the existing tokenizer a bit smarter, to address some of the bugs you bring up in #300:
Perhaps it will become clear that we can move some logic from the parser into the tokenizer, through looking at patterns in the bugs.
I would also be interested in seeing if we could simplify the logic in yargs-parser.ts
, e.g., could we make the handling of arrays, dot notation, etc., handled in a more abstract way.
If I see a few incremental wins, that make the library more malleable, I'm certainly going to be more supportive of larger refactors in the future.
@Mike111177 I'm not trying to discourage you, and would appreciate and support your contributions đ
I would just love for us to start with some initial wins, perhaps we can get a test suite in place for some other shells (using GitHub actions), this would be a great starting point.
Ok, I think what I will do then is more precisely document each tokenizer (That being yargs and the various shells). This way at a minimum we can try to enforce the yargs tokenizer can emulate all of the common behavior of the four most common shells (bash, cmd, powers hell, and sh). This should also allow us to determine better what is the responsibility of the tokenizer vs the parser. Currently I am not totally certain on how to do the tests against shells it in a CI friendly manner (See #300), but with more precise documentation on the differences between existing tokenizers, we can make more precise tests.
Once that is done, we can determine the token style varients and generalize syntax for the parser.
@Mike111177 that sounds like a great start to me.
@bcoe should I copy #300 into here and close #300? These two issues are a lot more coupled than I originally realized.
@bcoe should I copy #300 into here and close #300? These two issues are a lot more coupled than I originally realized.
Yes, let's combine the conversations.
Currently the only way to configure yargs-parser is by the following. (Excuse my mixed syntax)
I propose creating a builder pipeline that allows you to preconfigure your opts before running the command by adding an opts function to the Parser interface
This has a bunch of advantages:
Allows programmer to not keep track of opts variable when used multiple times.
Allows parser to check options ahead of time. This means that if after you create your parser with opts, yargs-parser doesn't have to do any of the setup of validation twice. So if you do
myParser("mystring")
there is no need to any of the options pre-processing twice. The first time the args are even referenced on the parser function are like 150 lines of code in: https://github.com/yargs/yargs-parser/blob/5f987aae9c7e7781e54c008b8cff01b210a7a19a/lib/yargs-parser.ts#L203 Since this is not a breaking change you can also now throw an exception on checkConfiguration() errors, since this is a new opt-in feature anyway.Really good interop with ideas from #300 if considered:
//To make a preconfigured bashlike parser const parse = require('yargs-parser').opts({...}).bash //Returns (arg: string, opts?: Partial) =>Arguments
//Can immediately throw if your options are invalid, faster debugging
parse("myweird option with a literal \\" in it") //Is able to skip option validation since it was already done.
parse("myweird option with a literal \\" in it", {additional_opts}) //Can't skip validation, but original options will still have thrown error if malformed
parse(["my", "bunch", "of", "tokens"]) //Compiler error: bash tokenizer expects string, tokenizing tokens makes no sense