Open cdhowie opened 5 years ago
@cdhowie your reasoning seems correct regarding some of these edge-cases.
But I think some of these issues would exist in yargs-parser
but not yargs
. yargs is passed arguments after they've already been processed by the operating system's shell.
As an example:
yargs-app -c "foo and bar"
☝️ by the time yargs gets process.argv
for processing, a shell like bash
will have stripped the quotes. The same goes for if we were to try to process \
.
I wouldn't be at all shocked if there are still some edge-cases around quotes, but I'd test the behavior with an integration test through bash
, not with strings passed into yargs-parser
(since some of these edge-cases would be in practice impossible with bash
).
If we do find some edge-cases around quote handling still, would very much appreciate help addressing them.
What is the recommended way to deal with non-shell use-cases then? For example, I am using this as the basis for a chat bot with multiple commands, sub-commands, etc. Yargs is a great tool for this use case.
However, I can't just toss a string at yargs and have it do the right thing because it doesn't strip quotes. I had to write my own JavaScript library that applies bash-style quoting rules to the string and returns an array. Yargs can correctly process the array, because I've already taken care of consuming quotes.
This feels like something that yargs-parser should be able to do when given a string (but not an array, of course).
FWIW, here are my library's test cases. I would expect yargs-parser to apply the same rules to a single string argument.
function test(line, args) {
it(`parses: [${line}]`, function () {
expect(parse(line)).toEqual(args);
});
}
function testFail(line) {
it(`throws while parsing: ${line}`, function () {
expect(() => parse(line)).toThrow();
});
}
test('ab cd ef', ['ab', 'cd', 'ef']);
test('ab cd ef ', ['ab', 'cd', 'ef']);
test(' ab cd ef ', ['ab', 'cd', 'ef']);
test('ab "cd ef"', ['ab', 'cd ef']);
test('ab cd\\ ef', ['ab', 'cd ef']);
test('ab\\ncd', ['abncd']);
test('a "doin\' a test" b', ['a', "doin' a test", 'b']);
test('a "" b ""', ['a', '', 'b', '']);
test("'a'\\''b'", ["a'b"]);
test("'foo '\" bar\"", ['foo bar']);
test('', []);
testFail('ab\\');
testFail('a "bc');
testFail("a 'bc");
@cdhowie I wonder if it would be worth adding an additional function to yargs-parser
, that performs the same pre-processing that bash
would apply, then folks using it for a chat bot could just pass the string through this first, but we wouldn't have a push and pull between yargs features and yargs-parser features.
That seems a reasonable compromise. If you would like, I can supply the code that I've written to accomplish this.
@cdhowie this sounds reasonable to me, perhaps we could call the method something like, toArgvFormat
, which would take a string --foo "bar hello" "testing \" this"
, and return an array of the format:
["--foo", "bar hello", "testing \" this"]
👆 this could then be passed to yargs-parser.
I have published the module that I wrote for this kind of parsing: https://www.npmjs.com/package/shell-parser
Feel free to pull this as a dependency or just take the code from index.js
if you don't want another dep.
I have the case with a command line like
./iobroker state set javascript.0.testString "Hello World"
Where the "hello world" is cutted into two params with 14.0.0 ... Why you not incorporate that shellparser-approach?
DO we have any chance to get that fixed because it is really anoying :-(
@Apollon77 could you provide a reproduction in yargs? most shells strip quotes before populating them in process.argv
, see:
example.js
console.info(process.argv);
Will output:
bencoe$ node example.js "hello world"
[
'/Users/bencoe/.nvm/versions/node/v12.15.0/bin/node',
'/Users/bencoe/oss/test-release-please-standalone/example.js',
'hello world'
]
Yargs has the same behavior:
example2.js
:
const yargs = require("yargs")
console.info(yargs.parse());
Will output the following in bash:
bencoe$ node example.js "hello world"
{ _: [ 'hello world' ], '$0': 'example.js' }
Edit: I missed the context of who you were replying to. Nonetheless, I feel like this is a better summary of the issue than what I originally wrote so I will leave it here as better documentation of my rationale for requesting this change.
@bcoe Indeed, I would not expect yargs to strip anything out when processing an array. However, this issue is about processing a string, not an array:
> const yargs = require('yargs');
undefined
> yargs.parse('a "b c"');
{ _: [ 'a', '"b c"' ], '$0': '' }
Here, I would expect the return value to be { _: [ 'a', 'b c' ], '$0': '' }
.
The parser grouped b c
into a single argument, but did not strip the quotes. This is very unusual, at least to me -- clearly it understood that they were quoted and therefore meant to represent a single argument, but the quotes were left there.
When invoking this API, I expected one of two results:
['a', 'b c']
: The parser saw the quotes, understood that the intended meaning was to provide a second argument with a space in it, and consumed the quotes during parsing.['a', '"b', 'c"']
: The parser does not understand quotes and just split on whitespace.Either of these approaches are fine, IMO...
But the actual result ['a', '"b c"']
makes absolutely no sense to me and I can't imagine a scenario where it would be expected. It seems to me the quotes should be understood and removed appropriately, or they shouldn't even be recognized.
Instead, we have this case where they are kind of half-processed. When I saw this result, I had to double-check that the input I was feeding yargs was in fact a "b c"
and not something like a '"b c"'
. My POLA-violation detector was alarming loudly.
Now my code that actually uses the arguments has to account for multiple scenarios, and I can't be sure which actually happened:
And this all assumes that I tell downstream code where I got the arguments from, which shouldn't matter in the first place.
My workaround is to never give yargs a string, because I can't trust that the output makes sense, and it severely complicates the rest of my code. Instead, I wrote my own routine to do POSIX-shell-like parsing of a string into an array, which is then safe to give to yargs.
@cdhowie I just hit this issue using yargs for a discord bot actually. I would love to get baseline support for this into yargs. That being said I want to leverage your library to get things working now.
After I execute a shellParser(myArgsParam) I get an array. The yargs.parse method wants a string. Is there something I am misunderstanding on how to use your library with Yargs?
const yargs = require('yargs');
const shellParser = require('shell-parser');
const parser = yargs
.scriptName(config.prefix)
.usage('$0 [command] [options]')
.commandDir('commands')
.demandCommand(1)
.help(false)
.showHelpOnFail(false)
.showHelpOnFail(true)
.version(false);
parser.parse(shellParser(originalArgs), async (err, argv, output) => {
if (argv.handled) {
const result = await argv.handled;
await message.channel.send(result).catch(console.error);
} else {
await message.channel.send(output);
}
});
I apologize if I made any mistakes in terminology Javascript is not my primary language.
As a follow up I was able to get the desired behavior with the "greedy-array" configuration set and changing my variable to an array type and then joining by space.
.option('g', {
alias: 'guild',
demandOption: true,
describe: 'Specify the guild your character is currently in.',
type: 'array'
})
@Mandalorian007 My use case is also a Discord bot. :)
After I execute a shellParser(myArgsParam) I get an array. The yargs.parse method wants a string. Is there something I am misunderstanding on how to use your library with Yargs?
yargs.parse()
wants either a string or an array, so the code in your first question should work just fine -- if it doesn't, what problem are you having?
See the docs (emphasis mine):
Parse
args
instead ofprocess.argv
. Returns theargv
object.args
may either be a pre-processed argv array, or a raw argument string.
Version: 13.1.0
See also #93 and #138 which both apparently conflict in their expectations of what should happen to quotes.
When I run
parse('a "b c"')
I expect the quotes to be consumed, as happens with most proper shells, but this is not what happens:This expectation appears to be different for different people and I'm not sure if this actually is a bug, but it's not at all what I expected it happen. Is it possible that we could get an option for how quotes are to be treated?
Here's what I expect to happen based on the behavior of sane shells:
a "b c"
becomes['a', 'b c']
a b\ c
becomes['a', 'b c']
a '"b c"'
becomes['a', '"b c"']
a \"b c\"
becomes['a', '"b', 'c"']
yargs-parser does something different in all cases. The last one in particular is most surprising.
There doesn't seem to be any way to get a literal space into an argument without quotes. If yargs leaves the quotes in, then suddenly it's the responsibility of my application to understand that quotes might be present and strip them. This means my application is doing the work of the parser, and that feels very, very wrong -- otherwise, why am I using yargs?