sindresorhus / execa

Process execution for humans
MIT License
6.79k stars 214 forks source link

`parseCommandString()` splits at whitespace inside quotes #1135

Closed iiroj closed 3 months ago

iiroj commented 3 months ago

I tried to replace string-argv with Execa's builtin parseCommandString to reduce dependencies in lint-staged, but noticed that the method incorrectly splits arguments from all spaces, even those inside quotes. This is probably not the correct behavior, since quoted content should be passed as-is.

For example, adding this unit test to test/methods/command.js makes it visible:

test("parseCommandString() allows using quotes", (t) => {
    t.deepEqual(
        parseCommandString('node -e "setTimeout(() => void 0, 10000)"'),
        ['node', '-e', 'setTimeout(() => void 0, 10000)'],
    );
});

The test failure will be:

  parseCommandString() allows using quotes

  test/methods/command.js:351

   350: test("parseCommandString() allows using quotes", (t) => {                       
   351:   t.deepEqual(parseCommandString('node -e "setTimeout(() => void 0, 10000)"'), [
   352:     "node",                                                                     

  Difference (- actual, + expected):

    [
      'node',
      '-e',
  -   '"setTimeout(()',
  +   'setTimeout(() => void 0, 10000)',
  -   '=>',
  -   'void',
  -   '0,',
  -   '10000)"',
    ]
ehmicky commented 3 months ago

Hi @iiroj,

parseCommandString() escaping uses backslashes, not quotes.

From the documentation here and there:

Spaces are used as delimiters. They can be escaped with a backslash. await execa${parseCommandString('npm run task\\ with\\ space')};

Shell-specific syntax has no special meaning and does not need to be escaped:

  • Quotes: "value", 'value', $'value'

You have a few options to solve your case.

The recommended way is to use the template string syntax.

await execa`node -e ${'setTimeout(() => void 0, 10000)'}`;

If the subcommand is user-defined, then you can do:

const command = getCommand();
await execa`node -e ${command}`;

If the whole command string is user-defined (including node -e), then backslash escaping + parseCommandString() can be used.

execa`${parseCommandString('node -e setTimeout(()\\ =>\\ void\\ 0,\\ 100)')}`;

In other cases, you can use the shell: true option. However, we discourage this option.

await execa('node -e "setTimeout(() => void 0, 100)"', {shell: true});
iiroj commented 3 months ago

Yes, I've been thinking of deprecating shell support from lint-staged so that's not an option.

If this is the intended behavior, feel free to close this issue since I can just keep using string-argv to parse; the full command is indeed user-defined.

ehmicky commented 3 months ago

Yes, the core thing is: quotes are shell-specific syntax. The underlying system calls are done in C, so they do not use any quoting/escaping system, they just pass the command and its arguments as is. On the other hand, shells were designed for interactive terminals, where the input is a whole string, so they act as a middle layer that escapes/quotes through shell-specific characters like double quotes.

There are quite a few problems with shells, from lower performance (due to spawning an additional process) to security risks (command injection) and cross-platform support (not every Windows user has Bash), so Execa discourages it. We try to find ways out of using shells: the template string syntax, and parseCommandString().

parseCommandString() uses only backslashes for escaping instead of quotes. Quotes have some pros: when many characters must be escaped, only 2 quotes are needed for all of them, instead of one backslash per character. Also, users are more familiar with them.

But quotes have some cons as well:

So when it comes to lint-staged, IMHO I would recommend the following: