Lazy initialization of subcommands

KalleOlaviNiemitalo commented 1 year ago

Is there a way to save startup time by not initializing subcommands other than the one chosen by the user? The parser surely needs the name and aliases of every subcommand, but I hope the following could be skipped:

description of the subcommand, unless it is needed for help or completions
validators
handler
arguments and options
nested subcommands

OTOH, I have not measured how much time these actually cost.

Might help with https://github.com/dotnet/command-line-api/issues/1008 as well.

jonsequitur commented 1 year ago

@adamsitnik could say more about whether this was a performance issue. I suspect it's not a bottleneck at this point.

That said, lazy initialization is used in dotnet new because the leaf nodes of the parser are added based on user-installed templates. The overall experience needed to be dynamic. @vlada-shubina could say more about this.

adamsitnik commented 1 year ago

Our startup scenario (https://github.com/adamsitnik/commandline-perf/tree/update) is mostly dominated by JITing and type loading (I've verified that with a profiler).

If we take a look at the following benchmark results we can see that:

the difference between an Empty app and simple S.CL app with default settings is 22.5 ms (87.5-65)
when the app is precompiled with NativeAOT, the difference for Empty app vs simple is 2 ms (14.88-13)
when an app is precompiled with R2R, we save 8.5 ms (87.5-79)

Method	Args	Mean	Error	StdDev	Median	Ratio	RatioSD
Empty	--bool -s test	65.06 ms	0.409 ms	1.205 ms	64.84 ms	0.74	0.02
EmptyAOT	--bool -s test	13.01 ms	0.139 ms	0.411 ms	12.94 ms	0.15	0.00
Corefxlab	--bool -s test	84.97 ms	0.791 ms	2.333 ms	84.66 ms	0.97	0.03
SystemCommandLine2021	--bool -s test	117.71 ms	0.624 ms	1.841 ms	117.39 ms	1.34	0.02
SystemCommandLine2022	--bool -s test	99.46 ms	0.375 ms	1.107 ms	99.15 ms	1.14	0.02
SystemCommandLineNow	--bool -s test	87.58 ms	0.332 ms	0.979 ms	87.46 ms	1.00	0.00
SystemCommandLineNowR2R	--bool -s test	79.16 ms	0.341 ms	1.006 ms	79.17 ms	0.90	0.02
SystemCommandLineNowAOT	--bool -s test	14.88 ms	0.170 ms	0.503 ms	14.85 ms	0.17	0.01

That is why I've been pushing so hard on removing all redundant types.

I don't expect us to gain a lot by making more things lazy. Our next perf goal is to include a precompiled version of S.CL in .NET SDK, so everybody would use a pre-compiled version out of the box.

Having said that, I am going to close the issue. Thank you for your feedback @KalleOlaviNiemitalo !

KalleOlaviNiemitalo commented 1 year ago

The benchmark apps don't have any subcommands, do they? So they don't show how much time would be spent constructing the subcommands and their options and arguments. The descriptions of options and arguments are not localised either so the benchmark does not include a resource lookup cost that could be skipped when a different subcommand is used, or when help is not requested.

adamsitnik commented 1 year ago

The benchmark apps don't have any subcommands, do they? So they don't show how much time would be spent constructing the subcommands and their options and arguments.

All the major allocations were already removed and those which remained are lazy (collections are allocated when needed, not up-front).

The benchmark apps don't have any subcommands, do they? So they don't show how much time would be spent constructing the subcommands and their options and arguments.

The resource lookup should be cheap.

@KalleOlaviNiemitalo I encourage you to take one of your real life apps and profile it and share the findings. I always happy to improve perf, I just don't believe that the parts that you have mentioned are major perf bottlenecks.

dotnet / command-line-api

Lazy initialization of subcommands #1956