pkgjs / parseargs

Polyfill of `util.parseArgs()`
Apache License 2.0
121 stars 9 forks source link

fix(parser): handle negative numeric tokens #62

Closed bcoe closed 2 years ago

bcoe commented 2 years ago

Handle a bunch of finicky edge-cases around negative positionals, and negative numbers as arguments.

From yargs' test suite:

  it.only('should handle parsing negative #s', function () {
    const argv = parser([
      '-33', '-177', '33',
      '--n1', '-33',
      '-n', '-44',
      '--n2=-55',
      '--foo.bar', '-33',
      '-o=-55',
      // Note: one major change in this library from yargs-parser
      // is the dropping of greedy arrays and nargs.*/
      '--bounds', '-180', '--bounds=99', '--bounds', '-180', '--bounds', '90',
      '--other', '-99', '--other', '-220'
    ], {
      multiples: ['bounds', 'other'],
      withValue: ['n', 'n1', 'n2', 'foo.bar']
    })
    console.info(argv);
    argv._.should.deep.equal([-33, -177, 33])
    argv.n1.should.equal(-33)
    argv.n.should.equal(-44)
    argv.n2.should.equal(-55)
    // TODO implement dot properties:
    // argv.foo.bar.should.equal(-33)
    argv.o.should.equal(-55)
    argv.bounds.should.deep.equal([-180, 99, -180, 90])
    argv.other.should.deep.equal([-99, -220])
  })

Also address issue with parsing of -o=foo, for short opts.

TODO: add tests.

bcoe commented 2 years ago

Related: https://github.com/pkgjs/parseargs/issues/60

shadowspawn commented 2 years ago

I don't agree in principle with treating negative numbers as special tokens. In brief it clashes with using digits as short options, and adds an opinionated behaviour. Slightly longer version: https://github.com/pkgjs/parseargs/issues/60#issuecomment-1030929080

I'm personally 👎

But I expect casual users are more likely to encounter negative numbers than single digit options or other situations where it misfires. So we might decide to implement it anyway...

bcoe commented 2 years ago

I don't agree in principle with treating negative numbers as special tokens. In brief it clashes with using digits as short options, and adds an opinionated behaviour. Slightly longer version: https://github.com/pkgjs/parseargs/issues/60#issuecomment-1030929080

@shadowspawn without special casing negative numbers, it's hard to implement an example like a basic calculator application:

add -300 200

Becomes:

short options: -3 -0 -0, positional 200, which is pretty gross.

If you do want to implement the behavior -3, -0, -0, it seems easier to have it stored as-300` in positionals, and for the implementer to clean it up in post-processing (expand it as if it was three short options).


With regards to -p80, or -p1, I agree with @ljharb, that the 80 only be consumed if p is configured as accepting a value, that seems reasonable -- otherwise again you end up with the parse -p, -8, -0, and it's hard to know that the -8 and -0 bind to the p when working backwards.

shadowspawn commented 2 years ago

Short version: Yes, there is a trade-off.

Long version

without special casing negative numbers, it's hard to implement an example like a basic calculator application

Yes, that would be a use case that would not be supported out of the box.

Four approaches from a quick think:

If you do want to implement the behavior -3, -0, -0, it seems easier to have it stored as -300` in positionals, and for the implementer to clean it up in post-processing (expand it as if it was three short options).

(Tongue in cheek.) Or to rephrase, if you want to use digits as short options, do it yourself. Wait, why are we offering a parser if this is the easy approach?


As an aside. I think this is a subtle example of not always being feasible to build on top of a parser which has applied a number of decisions. Imagine processing this with negative numbers treated as positionals, and then try and post-process digits as options:

parseArgs(['-a -1 -b foo`], { withValues: '1' });

It isn't as simple as looking for the digits in the positionals and taking the next positional, the parser ate some options out of the middle.

shadowspawn commented 2 years ago

I'll add that there is somewhat less namespace pressure on short options these days, because we can use long-only options for the rare cases. For a user of our library, would still be a loss if a digit was the nicest semantic choice, but there would at least be alternatives available.

Note ls uses all lowercase letters and some uppercase letters and symbols and digits.

SYNOPSIS
     ls [-@ABCFGHILOPRSTUWabcdefghiklmnopqrstuvwxy1%,] [--color=when] [-D format] [file ...]
bcoe commented 2 years ago

what if we treat arguments wrapped in quotes as positionals, or values, and drop the quotes? Then an implementer could pre-process negative numbers by wrapping them in quotes.

shadowspawn commented 2 years ago

what if we treat arguments wrapped in quotes as positionals, or values, and drop the quotes? Then an implementer could pre-process negative numbers by wrapping them in quotes.

Hmmm, I don't like that as a work-around. Then quotes are special and get eaten by the parser. Note though, if the implementer wrapped the negative numbers in quotes the current parser would not treat them as options. It is "just" the implementer has to remove the quotes themselves after parsing.

shadowspawn commented 2 years ago

Treating negative numbers as positionals complicates the short option group processing, which could previously assume a blind expansion. Behaviour now broken:

% node zero.js -a1
{ flags: { a: true }, values: { a: undefined }, positionals: [ '-1' ] }

(For interest, Commander suffered from similar weird outcomes from blind expansion. To get it right I switched to processing just one short option at a time and no fall-through, go around big loop again to process anything left. Commander has flags and required values and optional values, but not negative numbers. https://github.com/tj/commander.js/blob/02a124c7d58dbae2ef11f9284b2c68ad94f6dc8b/lib/command.js#L1378)

shadowspawn commented 2 years ago

(Sorry @bcoe , lots of comments after you were running hot trying out the yarns-parser tests!)

bcoe commented 2 years ago

I would be curious to hear other people's thoughts, re: negative numbers, I can attempt the pre-processing approach if we would prefer not to support them.

Numerics in most cli apps I think are positive integers, delays, ports, etc. Are the common use cases for negative numerics I'm missing?

ljharb commented 2 years ago

Quoting props so the shell doesn’t interpret them - like globs - is standard. It seems perfectly fine to me that if i want to have a negative number as a value, my choices are to either use a long form with an = (--a=-1) or to quote the value (-a '-1').

shadowspawn commented 2 years ago

Quoting works for passing values that would otherwise be interpreted by shell, but does not work directly for protecting things that look like options. The shell removes the quotes before they are passed through.

% node -e 'console.log(process.argv)' hello '-a' '*'
[ '/usr/local/bin/node', 'hello', '-a', '*' ]
shadowspawn commented 2 years ago

For comparison, in Commander options that expect an argument take the next argument whether or not it starts with a dash. So --foo -123 works fine if option foo is expecting a value. i.e. get { foo: '-123' } in parsed options

There a trade-off, Commander gets an occasional question about why --foo --bar is not an error, when foo expects a value! i.e. get { foo: '--bar' } in parsed options

shadowspawn commented 2 years ago

Another implementation for interest. Python's arparse goes the extra mile, and allows negative numbers as positional and option-arguments iff there are no digit options defined. This is quite sophisticated. (argparse does not have greedy options, arguments with a starting dash are otherwise options and not option-arguments.)

https://docs.python.org/3/library/argparse.html#arguments-containing

If we went this approach we would have to decide which mode zero-config uses.

aaronccasanova commented 2 years ago

The more I look at the above examples and the referenced argparse Python module the more I lean towards an explicit configuration based approach for this behavior. The proposed options API in PR #63 (single configuration object) may also be a helpful design pattern for capturing negative numbers. e.g.

// node calc.js --add -300 200
const { values } = parseArgs({
  options: {
    add: {
      type: 'number',
      nargs: 2, // defaults to 1
    }
  }
})
// values: { add: [-300, 200] }
bcoe commented 2 years ago

I added nargs and greedy array support to yargs-parser, and regret it. I feel it creates too many opportunities for alternate interpretations of what a parse should like like when you provide additional options after the array argument.

I far prefer the explicit --arr 1 --arr 2 --arr 3.


I'd rather hold off on adding a number type for MVP, and try some sort of pre-processing in the yargs shim I've been working on.

shadowspawn commented 2 years ago

I said:

I don't agree in principle with treating negative numbers as special tokens. In brief it clashes with using digits as short options, and adds an opinionated behaviour.

I want to expand on that after thinking about it again in greedy/choosy context. I still don't like the idea of detecting negative numbers as such...

However, a different way of reasoning about this is that digits as options are rare, so we might choose not to recognise them. Then --foo -2 can be an option and its value because -2 is not an option (rather than because -2 is a number). And -789 can be a positional because -7 is not an option (rather than because -789 is a number).

This allows negative numbers to appear as option values or as positionals, which is what Python's argparse does when there are no digit options declared, and would allow a calculator per https://github.com/pkgjs/parseargs/pull/62#issuecomment-1030934736