DaanDeMeyer / reproc

A cross-platform (C99/C++11) process library
MIT License
552 stars 65 forks source link

Under Windows: Like conhost.exe, argv is not properly escaped when building commandline #18

Closed fcharlie closed 5 years ago

fcharlie commented 5 years ago

Reproc is a great project, Looking at the recent reproc source code, I found that there seems to be a problem in the source code. When converting argv to commandline, there is no escaping operation. The relevant code is as follows: https://github.com/DaanDeMeyer/reproc/blob/aba3ce528ff48c4ce84365d2c395b9f0a7c1824d/reproc/src/windows/reproc.c#L60-L62

Looking through strings_join I didn't find any escape code: https://github.com/DaanDeMeyer/reproc/blob/aba3ce528ff48c4ce84365d2c395b9f0a7c1824d/reproc/src/windows/string_utils.c#L8-L54

If any of the strings in argv have spaces, quotation marks, and single quotation marks, the process that is started may not be the one you want (there are spaces in the program path), or the process argv that was started does not match the expected one.

This problem is the same as the WindowsTermainl: #1090 I reported to Windows Terminal.

The fix can refer to PR: WindowsTerminal: Fix The conhost command line is not properly escaped

DaanDeMeyer commented 5 years ago

Thanks for the detailed issue report!

I'll be sure to take a look when I have some time. Do you by any chance have a command with args that reproduces the problem for reproc (so I can check before/after after applying the fix)? The reproc examples include a forward binary that can be used to run any program with args through reproc which can be used to run any command line with reproc.

fcharlie commented 5 years ago

@DaanDeMeyer

You can start C:\Program Files\Powershell\6\pwsh.exe to try it out, or you can use reproc to start the source code for the following program to see if the expectations are consistent.

#include <cstdio>

int main(int argc, char **argv){
 for(int i=0;i<argc;i++){
  fprintf(stderr,"arg%d: %s\n",i,argv[i]);
 }
return 0;
}
fcharlie commented 5 years ago

@DaanDeMeyer Here is a C implementation that has not been tested enough:

////
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <stdbool.h>
#include <string.h>

size_t escape_length(char *s, size_t *rlen, bool *hp) {
  size_t len = 0;
  size_t i = 0;
  bool hasspace = false;
  len = strlen(s);
  if (len == 0) {
    // ""
    return 2;
  }
  size_t n = len;
  for (; i < len; i++) {
    switch (s[i]) {
    case '"':
    case '\\':
      n++;
      break;
    case ' ':
    case '\t':
      hasspace = true;
      break;
    default:
      break;
    }
  }
  *rlen = len;
  if (hasspace) {
    n += 2;
  }
  *hp = hasspace;
  return n;
}

bool argv_escape_join(const char *const *string_array, int array_length,
                      char **result) {
  assert(string_array);
  assert(array_length >= 0);
  assert(result);
  // Determine the length of the concatenated string first.
  size_t string_length = 1; // Count the null terminator.

  for (int i = 0; i < array_length; i++) {
    size_t rlen = 0; // NOT use
    bool hp = false; // NOT use
    string_length += escape_length(string_array[i], &rlen, &hp);
    if (i < array_length - 1) {
      string_length++; // Count whitespace.
    }
  }

  char *string_out = malloc(sizeof(char) * string_length);
  if (string_out == NULL) {
    return false;
  }

  char *current = string_out;
  for (int i = 0; i < array_length; i++) {
    size_t rlen = 0;
    bool hp = false;
    size_t part_length = escape_length(string_array[i], &rlen, &hp);
    if (rlen == 0) {
      strcpy(current, "\"\"");
      current += 2;
      if (i < array_length - 1) {
        *current = ' ';
        current += 1;
      }
      continue;
    }

    if (rlen == part_length) {
      // NOLINTNEXTLINE(clang-analyzer-security.insecureAPI.strcpy)
      strcpy(current, string_array[i]);
      current += part_length;
      // We add a space after every part of the string except for the last one.
      if (i < array_length - 1) {
        *current = ' ';
        current += 1;
      }
      continue;
    }

    int slashes = 0;
    size_t j = 0;
    if (hp) {
      current[j] = '"';
      j++;
    }
    char *s = string_array[i];
    for (size_t k = 0; k < rlen; k++) {
      switch (s[k]) {
      case '\\':
        slashes++;
        current[j] = s[k];
        break;
      case '"': {
        for (; slashes > 0; slashes--) {
          current[j] = '\\';
          j++;
        }
        current[j] = '\\';
        j++;
        current[j] = s[k];
      } break;
      default:
        slashes = 0;
        current[j] = s[k];
        break;
      }
      j++;
    }
    if (hp) {
      for (; slashes > 0; slashes--) {
        current[j] = '\\';
        j++;
      }
      current[j] = '"';
      j++;
    }
    current += part_length;
    // We add a space after every part of the string except for the last one.
    if (i < array_length - 1) {
      *current = ' ';
      current += 1;
    }
  }
  *current = '\0';
  *result = string_out;
  return true;
}

int main(int argc, char **argv) {

  char *out = NULL;
  if (argv_escape_join(argv, argc, &out)) {
    fprintf(stderr, "%s\n", out);
    free(out);
  }
  return 0;
}
DaanDeMeyer commented 5 years ago

I've added argument escaping based on https://blogs.msdn.microsoft.com/twistylittlepassagesallalike/2011/04/23/everyone-quotes-command-line-arguments-the-wrong-way/ and added a simple test to verify it works.