MichalStrehovsky / iltrim

MIT License
9 stars 1 forks source link

Use System.CommandLine for cmdline parsing #99

Closed vitek-karas closed 2 years ago

vitek-karas commented 2 years ago

The most important part - this automatically supports response files (@file.rsp). Makes it easier to add additional command line options. Also gives help and such...

vitek-karas commented 2 years ago

Going to merge this to avoid juggling branches.

MichalStrehovsky commented 2 years ago

We have a Command line parser in src\coreclr\tools\Internal.CommandLine. It's what crossgen2 and NativeAOT use. It's the "old" System.CommandLine from corefxlab.

We attempted to use the "new" System.CommandLine in crossgen2. It was a big mistake because for short runs of crossgen2, we spent 20% of wallclock time parsing 5-10 command line arguments (no, seriously).

The people in charge of the "new" System.CommandLine are not interested in making the perf not suck: https://github.com/dotnet/command-line-api/issues/929

stephentoub commented 2 years ago

The people in charge of the "new" System.CommandLine are not interested in making the perf not suck

I read the responses there as being interested, e.g. https://github.com/dotnet/command-line-api/issues/929#issuecomment-644491294, and it just hasn't been the top priority. It sounds like there's known low-hanging fruit that can be addressed? Have you looked at contributing to fix some of those issues?

MichalStrehovsky commented 2 years ago

I read the responses there as being interested, e.g. dotnet/command-line-api#929 (comment), and it just hasn't been the top priority. It sounds like there's known low-hanging fruit that can be addressed? Have you looked at contributing to fix some of those issues?

A closed issue means "not interested" in my book. There are many older issues in the command-line-api repo still open that are obviously not a top priority. This one was closed.

The incentives are set up in odd ways in that repo because what appears to be getting benchmarked there is steady state throughput of command argument parsing and yes, expression trees are very fast once they're compiled. The target customers for System.CommandLine are people who need to parse millions or command line arguments and as such it's not a good fit for tools like crossgen2 or ILTrim because we have at most a couple dozen arguments. Removing expression trees would regress the benchmarks.

stephentoub commented 2 years ago

I read their response as ones of interest. It being optimized for something other than you desire it to be optimized for is very different from the owners not caring or "the perf sucks". If it's demonstrably optimized for the wrong things, let's work with the owners to fix that. We should be able to use our own System.* command line parser from our own tools. Let's help make that possible.

jonsequitur commented 2 years ago

@MichalStrehovsky Please don't interpret a closed issue as meaning there's no interest in performance improvements. I closed it because perf work is always ongoing. We're never going to say it's "fast enough" and so in the absence of a clear done definition, I chose to close the issue to reduce noise. I apologize for not making my reasoning clearer.

Since that time, a good deal of performance work has been done. It's mostly been low-hanging fruit but it's cut wall-clock time for parsing and binding to 50% of what it was and reduced the memory footprint by around 25%. Some APIs that lend themselves to worse performance (e.g. relying on reflection) are now deprecated. We'd be more than happy for feedback including on designing more meaningful benchmarks.

MichalStrehovsky commented 2 years ago

We're never going to say it's "fast enough" and so in the absence of a clear done definition, I chose to close the issue to reduce noise.

The benchmark is the command line parser from CoreFXLab that runtime tools use right now. If the new parser is not at least as fast as this parser, switching to it would be a regression. The goal of what is "fast enough" was captured in the issue and it could have been measured before closing it.

I took Jan's sample from dotnet/command-line-api#929 that establishes the goals, merged @jonsequitur's pull request into that, upgraded everything to Net6.0 and latest System.CommandLine (2.0.0-beta1.21308.1), added /p:PublishReadyToRun=true to publish to make it more realistic (anyone who cares about startup perf is going to do that) and copied Internal.CommandLine from the runtime repo because I had trouble restoring the nugets (I think we stopped paying for myget where this was hosted). Here's how it stands right now:

Total ms
empty 58 ms
corefxlab 71 ms
command-line-api 110 ms
command-line-api-2 91 ms
command-line-api-3 84 ms

So it's still at least 2x slower than the corefxlab parser.

Honestly, the corefxlab parser taking 13 ms to parse two arguments is a low bar but the ergonomics beats parsing things manually. It would be interesting to compare this with Mono.Options at some point which is another mature command line parser in the ecosystem.

jonsequitur commented 2 years ago

For what it's worth, "command-line-api-3" is the appropriate benchmark to compare. "command-line-api" and "command-line-api-2" include a number of quality-of-life features that are lacking from the corefxlab parser entirely (e.g. complex object binding, completion support, typo suggestions). The wall clock time for "command-line-api-3" is 1.2X that of the corefxlab parser, additional improvements have been made since that version, and I have every expectation that System.CommandLine will outperform the corefxlab parser before long. I'll provide some better benchmarks.