gsscoder / commandline

Terse syntax C# command line parser for .NET with F# support

Time to parse N args scales quadratically with N, can result in very long parse times under certain conditions #457

Open Inirit opened 7 years ago

Inirit commented 7 years ago

Hello, I've been using this parsing library for a project I'm working on, and I've come across a condition where the parsing function takes a very long time to complete. With a large number of elements in the input arguments array (a few hundred at least), parsing takes multiple minutes. A few hundred argument array elements could probably be considered a weird edge case, and the way I'm using the library isn't exactly its intended purpose of parsing command line app arguments, but I figured I'd bring it up anyway in case it's considered worth further investigation.

I narrowed the delay down to this library's code rather than my own, but I also wanted to find out exactly how parsing performance scales. I wrote a simple C# project that demonstrates how an increasing number of args array elements affects the time it takes to parse the array. The source code and an example of the results are below:

Repro source (.NET Core 1.1 with CommandLineParser 2.1.1-beta): CommandLineRepro.zip

Example image

In writing this repro project, I also learned that parsing time scales linearly with the number of public fields in the target parse object (which sounds reasonable). The above example was performed with an arguments object that had only one field. To estimate the results for an object with 10 fields, you could multiply those times by 10; in that scenario the time to parse would reach multiple seconds shortly after the args array exceeds 32 elements and would quickly ramp up from there.

deng0 commented 7 years ago

I've noticed the same problem.

The main problem is that the library creates enumerables and then enumerates them multiple times. I've created a fork where I fixed the problem.

https://github.com/deng0/commandline/commit/95ff7f6ca235bf0fb3778b6bb77b9e6f1193f058
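
For readers unfamiliar with the pattern, here is a minimal sketch of how re-enumerating a lazy sequence can turn linear work into quadratic work. It is a Python analogue written for illustration only, not the library's C# code or the fix in the fork; `CountingTokens`, `parse_lazy`, and `parse_materialized` are hypothetical names, and the `work` counter stands in for whatever per-argument processing a real deferred query would redo on each pass.

```python
class CountingTokens:
    """Lazy, re-iterable sequence: every pass redoes the per-item work."""

    def __init__(self, args):
        self.args = args
        self.work = 0  # total per-item steps performed so far

    def __iter__(self):
        for arg in self.args:
            self.work += 1  # stand-in for processing one argument
            yield arg


def parse_lazy(args):
    """Restart the lazy sequence once per argument: about N*N/2 steps."""
    tokens = CountingTokens(args)
    matched = 0
    for i in range(len(args)):
        # Each lookup walks the sequence from the start again.
        for j, _tok in enumerate(tokens):
            if j == i:
                matched += 1
                break
    return matched, tokens.work


def parse_materialized(args):
    """Enumerate once into a list, then index into it: N steps."""
    tokens = CountingTokens(args)
    cached = list(tokens)  # the only pass over the lazy sequence
    matched = 0
    for i in range(len(args)):
        _tok = cached[i]  # plain list indexing, no re-enumeration
        matched += 1
    return matched, tokens.work


if __name__ == "__main__":
    for n in (100, 200, 400):
        args = ["--flag"] * n
        print(n, parse_lazy(args)[1], parse_materialized(args)[1])
```

In this sketch, doubling the number of arguments roughly quadruples the step count of `parse_lazy`, while `parse_materialized` grows in step with N. In C#, the analogous change would be materializing a deferred query once (for example with `ToList()` or `ToArray()`) before it is consumed repeatedly.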