dotnet / command-line-api

Command line parsing, invocation, and rendering of terminal output.
https://github.com/dotnet/command-line-api/wiki
MIT License

Use 'CommandLineStringSplitter.Instance.Split' parse bug. #1740

Open treenewlyn opened 2 years ago

treenewlyn commented 2 years ago
var raw = "\"dotnet publish \\\"xxx.csproj\\\" -c Release -o \\\"./bin/latest/\\\" -r linux-x64 --self-contained false\"";

var array = System.CommandLine.Parsing.CommandLineStringSplitter.Instance.Split(raw).ToArray();

Console.WriteLine(array.Length);

The parsed array:

  1. "dotnet publish \"
  2. "xxx.csproj\ -c Release -o \./bin/latest/\ -r linux-x64 --self-contained false"

But the expected array is:

  1. "dotnet publish \"xxx.csproj\" -c Release -o \"./bin/latest/\" -r linux-x64 --self-contained false"

jonsequitur commented 2 years ago

Can you explain a bit more about what you're trying to do and why this output is your expectation? Also, examples using precise strings without the C# escaping would be a little clearer, I think.

For context, the CommandLineStringSplitter is intended to reproduce the way command line input to a .NET console app is split into the args array that gets passed to Main.

Let's use a Program.cs containing this to verify the behavior this is designed to reproduce:

foreach(var arg in args)
{
    Console.WriteLine(arg);
}

Your raw variable contains the following actual, unescaped string:

"dotnet publish \"xxx.csproj\" -c Release -o \"./bin/latest/\" -r linux-x64 --self-contained false"
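One way to check what a C#-escaped literal actually contains is to compare it with a verbatim literal, where backslashes are literal and a doubled quote stands for one quote. The sketch below does that for the string from the original report; the class name EscapingDemo is hypothetical and exists only for illustration.

```csharp
using System;

public static class EscapingDemo
{
    // The literal from the original report, written with regular C# escapes.
    public const string Escaped =
        "\"dotnet publish \\\"xxx.csproj\\\" -c Release -o \\\"./bin/latest/\\\" -r linux-x64 --self-contained false\"";

    // The same characters as a verbatim literal: "" is one quote, \ is a literal backslash.
    public const string Verbatim =
        @"""dotnet publish \""xxx.csproj\"" -c Release -o \""./bin/latest/\"" -r linux-x64 --self-contained false""";

    // True when both literals describe the same runtime string.
    public static bool AreSame() => Escaped == Verbatim;
}
```

Printing EscapingDemo.Escaped with Console.WriteLine shows the unescaped form directly, which is the form that matters when reasoning about how the input will be split.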

Running the above program from the command line in PowerShell (keeping in mind that these examples will differ in other shells) with that string produces this output:

dotnet publish \
xxx.csproj\ -c Release -o \./bin/latest/\ -r linux-x64 --self-contained false

So that looks like it's working as designed, but at least in this example, it's probably not what you're really looking for.

treenewlyn commented 2 years ago

using System.CommandLine;
using System.CommandLine.NamingConventionBinder;

var rootCommand = new RootCommand("A set command.");
rootCommand.Name = "SET";
rootCommand.AddArgument(new Argument()
{
    Name = "key",
    ValueType = typeof(string),
    Description = "A string"
});

rootCommand.AddArgument(new Argument()
{
    Name = "value",
    ValueType = typeof(string),
    Description = "A string"
});

rootCommand.Handler = CommandHandler.Create<SetCommand>(cmd =>
{
    return cmd.InvokeAsync();
});

while (true)
{
    Console.Write("> ");
    var line = Console.ReadLine();
    if (line == null || (line = line.Trim()).Length == 0) continue;
    if (line == "exit") break;

    try
    {
        await rootCommand.InvokeAsync(line);
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex);
    }
}

class SetCommand
{
    public string Key { get; set; } = null!;

    public string? Value { get; set; } = null!;

    public Task<int> InvokeAsync()
    {
        Console.WriteLine("Key: {0}", this.Key);
        Console.WriteLine("Value: {0}", this.Value);
        return Task.FromResult(0);
    }
}

So when I call SET text abc, it works. But when I want to set a JSON value, it does not work.

Input

text abc
say Hello\"
json {\"a\":1}
json "{\"a\":1}"
json {"a":1}

Output

> text abc
Key: text
Value: abc
> say Hello\"
Key: say
Value: Hello\
> json {\"a\":1}
Key: json
Value: {\a\:1}
> json "{\"a\":1}"
Unrecognized command or argument 'a\:1}'.

Description:
  A set command.

Usage:
  SET <key> <value> [options]

Arguments:
  <key>    A string
  <value>  A string

Options:
  --version       Show version information
  -?, -h, --help  Show help and usage information

> json {"a":1}
Key: json
Value: {a:1}

How do I include a " character in the Value argument?

jonsequitur commented 2 years ago

This is complicated and hard to talk clearly about. 😅

Going back to the previous example and taking System.CommandLine out of the picture for a moment, the following would work for PowerShell when starting your app (i.e. for the args values passed to Main):

> json '"{\"a\":1}"'

Here's what's happening:

If you were to now pass that args array to e.g. rootCommand.InvokeAsync(args), the CommandLineStringSplitter never even gets called, because the split has already happened before Main. (CommandLineStringSplitter is typically only used in testing or when calculating completions).

But, since you're building more of a REPL-style interaction, this shell escaping won't affect the value you get from Console.ReadLine. You'll get the exact string back, including the double and single quotes. When it's passed to CommandLineStringSplitter.Split, that method assumes this is command line input and tries to treat the quotes as delimiters for the command line, but that's not what they represent inside this JSON block. Since you know it's JSON, you might consider an alternative way to parse it, because otherwise your users will have to escape the quotes inside the JSON, which is not intuitive.
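As a rough sketch of the "alternative way to parse it" idea, assuming the REPL input always has the shape "key value" and that the value should be taken verbatim, the line could be split at the first run of whitespace and the remainder checked with System.Text.Json. The ReplLine class and its method names are hypothetical, not part of System.CommandLine.

```csharp
using System;
using System.Text.Json;

public static class ReplLine
{
    // Splits "key rest-of-line" at the first whitespace run.
    // The value is returned as typed: quotes and backslashes are not interpreted.
    public static (string Key, string Value) SplitKeyAndRawValue(string line)
    {
        var trimmed = line.Trim();
        var i = 0;
        while (i < trimmed.Length && !char.IsWhiteSpace(trimmed[i])) i++;

        var key = trimmed.Substring(0, i);
        var value = trimmed.Substring(i).TrimStart();
        return (key, value);
    }

    // Returns true when the text parses as a JSON document.
    public static bool IsValidJson(string text)
    {
        try
        {
            using var doc = JsonDocument.Parse(text);
            return true;
        }
        catch (JsonException)
        {
            return false;
        }
    }
}
```

With the line split this way, the two parts could be handed to rootCommand.InvokeAsync as a string array (for example new[] { key, value }), which, per the explanation above, should avoid CommandLineStringSplitter altogether. Users could then type json {"a":1} without escaping anything.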

treenewlyn commented 2 years ago

OK, I see. Here is my code:

    /// <summary>
    /// Represents a command-line parser.
    /// </summary>
    public static class CommandLineParser
    {
        private static bool TryReadFirstChar(this StringReader reader, out char c)
        {
            var i = reader.Read();
            if (i == -1)
            {
                c = char.MinValue;
                return false;
            }
            else
            {
                c = (char)i;
                if (char.IsWhiteSpace(c)) return reader.TryReadFirstChar(out c);
                return true;
            }
        }

        private static IEnumerable<char> ParseToken(StringReader reader)
        {
            if (!reader.TryReadFirstChar(out var c)) yield break;

            var isQueteString = false;
            var qc = char.MinValue;
            if(c is '=' or ':')
            {
                yield break;
            }
            else if (c is '\"' or '\'')
            {
                isQueteString = true;
                qc = c;
                if (reader.Peek() == -1)
                {
                    throw new InvalidDataException("Invalid quote in the string.");
                }
            }
            else
            {
                yield return c;
            }
            int i;
            while (true)
            {
                i = reader.Read();
                if (i == -1) break;
                c = (char)i;
                if (isQueteString)
                {
                    var pi = reader.Peek();
                    if (pi == -1) throw new InvalidDataException("Invalid quote in the string.");

                    var peek = (char)pi;
                    if (peek == qc)
                    {
                        reader.Read();
                        if (c == '\\')
                        {
                            yield return peek;
                        }
                        else
                        {
                            yield return c;
                            yield break;
                        }
                    }
                    else if (c == '\\' && peek == '\\')
                    {
                        reader.Read();
                        yield return peek;
                    }
                    else
                    {
                        yield return c;
                    }
                }
                else
                {
                    if (char.IsWhiteSpace(c) || (c is ':' or '=')) yield break;
                    yield return c;
                }
            }
        }

        /// <summary>
        /// Parses the specified command line.
        /// </summary>
        /// <param name="commandLine">The command line.</param>
        /// <returns>A list of command-line arguments.</returns>
        public static IEnumerable<string> Parse(string commandLine)
        {
            if (string.IsNullOrWhiteSpace(commandLine)) yield break;
            commandLine = commandLine.Trim();
            using var reader = new StringReader(commandLine);
            do
            {
                var chars = ParseToken(reader).ToArray();
                if (chars.Length == 0) continue;
                if (chars.Length == 1 && (chars[0] is ':' or '=')) continue;
                var arg = new string(chars);
                yield return arg;
            } while (reader.Peek() != -1);
        }
    }

xUnit tests:

        [Fact]
        public void AllTest()
        {
            Assert.Equal(new string[] { "text", "abc" }
            , CommandLineParser.Parse("text abc"));

            Assert.Equal(new string[] { "text", "Hello\"" }
            , CommandLineParser.Parse("text Hello\""));

            Assert.Equal(new string[] { "text", "Hello\"" }
            , CommandLineParser.Parse(" \t text   Hello\" \t  "));
        }

        [Fact]
        public void QueteTest()
        {
            var args1 = "\"{\\\"a\\t\\\":1}\"";
            Assert.Equal(new string[] { "text", "{\"a\\t\":1}" }
            , CommandLineParser.Parse("text " + args1).ToArray());
        }

        [Fact]
        public void Quete2Test()
        {
            var args1 = "'{\"a\\t\":1}'";
            Assert.Equal(new string[] { "text", "{\"a\\t\":1}" }
            , CommandLineParser.Parse("text " + args1).ToArray());
        }

        [Fact]
        public void SetTest()
        {
            Assert.Equal(new string[] { "a", "b", "c", "d", "e", "f", "g", "h" }, CommandLineParser.Parse("a=b c =d e = f g= h").ToArray());
            Assert.Equal(new string[] { "a", "b", "c", "d", "e", "f", "g", "h" }, CommandLineParser.Parse("a:b c :d e : f g: h").ToArray());
            Assert.Equal(new string[] { "a", ":b" }, CommandLineParser.Parse(" a= ':b'").ToArray());
        }

jonorossi commented 1 year ago

If you were to now pass that args array to e.g. rootCommand.InvokeAsync(args), the CommandLineStringSplitter never even gets called, because the split has already happened before Main. (CommandLineStringSplitter is typically only used in testing or when calculating completions).

Thanks @jonsequitur for the insight.

I'm not using CommandLineStringSplitter, but I am passing JSON into an application and was running into the same issues. I determined that passing a single-quoted string that contains double quotes results in PowerShell stripping the double quotes before the C# application even gets its args, while cmd does not. I'll just have to use cmd to run this command in my application.
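
When comparing shells like this, it can help to print each argument with its index and visible delimiters, so that stripped quotes or stray whitespace are easy to spot. This is only a diagnostic sketch; ArgDump is a hypothetical helper name, and it makes no claim about what any particular shell will pass.

```csharp
using System;
using System.Linq;

public static class ArgDump
{
    // Formats each argument on its own line as: args[i] = <value>
    // The angle brackets make leading/trailing whitespace and missing quotes visible.
    public static string Describe(string[] args) =>
        string.Join(Environment.NewLine, args.Select((a, i) => $"args[{i}] = <{a}>"));
}
```

Calling Console.WriteLine(ArgDump.Describe(args)) at the top of Main and running the same command line from PowerShell and from cmd shows exactly where the two differ.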