Running annotation-based picocli applications without runtime reflection

remkop commented 5 years ago

The annotation processor implemented in #500 can create a CommandSpec model at compile time. One use case for an annotation processor like this is to generate source code at compile time that would allow picocli-based applications to run without runtime reflection.

This ticket is to explore some ideas for achieving that.

One idea is to generate code (actually modify an existing annotated class) to implement an interface, where a method in this interface generates a CommandSpec model. Picocli needs to be modified to not perform reflection when the user object implements this interface, and instead call the interface method to obtain the CommandSpec.

Example input:

@picocli.codegen.GenerateModel
class App {
    @Option(names = "-x")
    private int x;

    public static void main(String[] args) {
        CommandLine cmd = new CommandLine(new App());
        cmd.parseArgs(args);
    }
}

Example output of annotation processor:

// after annotation processing:
class App implements picocli.CommandLine.Model.ICommandSpecFactory {
    @Option(names = "-x")
    private int x;

    public static void main(String[] args) {
        CommandLine cmd = new CommandLine(new App());
        cmd.parseArgs(args);
    }

    // ICommandSpecFactory implementation (generated)
    public CommandSpec getCommandSpec() {
        CommandSpec result = CommandSpec.wrapWithoutInspection(this);
        result.addOption(OptionSpec.builder("-x").type(int.class)
            .getter(new IGetter() {
                public Object get() {
                    return App.this.x;
                }
            })
            .setter(new ISetter() {
                public Object set(Object newValue) {
                    Object old = App.this.x;
                    App.this.x = (Integer) newValue;
                    return old;
                }
            })
            .build());

        // other options
        return result;
    }
}

Lombok does something similar: it inserts accessor code into an existing class for fields annotated with @Getter and @Setter. Lombok does this by using internal APIs of javac and the Eclipse compiler.

remkop commented 5 years ago

See https://github.com/rzwitserloot/lombok/blob/9198551defb7dd71d872c7b86af0a3f0bf0ec545/src/core/lombok/javac/handlers/HandleSetter.java and https://github.com/rzwitserloot/lombok/blob/9198551defb7dd71d872c7b86af0a3f0bf0ec545/src/core/lombok/eclipse/handlers/HandleSetter.java

remkop commented 5 years ago

Current thinking is to generate a subclass that can get/set protected or package-private fields and methods in the annotated superclass. This is a reasonable restriction that is more maintainable than using internal compiler APIs.
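A minimal sketch of the subclass idea, under stated assumptions: `Binding` is a simplified stand-in for picocli's IGetter/ISetter pair, and `DemoApp`, `DemoApp_Generated`, and `bindingForX` are hypothetical names chosen for illustration. None of this is actual generated output. It only shows that a subclass in the same package can read and write a package-private field with plain field access:

```java
// Simplified stand-in for picocli's IGetter/ISetter pair (hypothetical).
interface Binding {
    Object get();
    Object set(Object newValue);
}

// The annotated user class. The field is package-private rather than private,
// so a subclass in the same package can access it without reflection.
class DemoApp {
    // would carry @Option(names = "-x") in a real application
    int x;
}

// What an annotation processor might generate (hypothetical name).
class DemoApp_Generated extends DemoApp {
    // Returns a binding for option -x that uses plain field access
    // on the inherited field instead of java.lang.reflect.Field.
    Binding bindingForX() {
        return new Binding() {
            public Object get() {
                return DemoApp_Generated.this.x;
            }

            public Object set(Object newValue) {
                Object old = DemoApp_Generated.this.x;
                DemoApp_Generated.this.x = (Integer) newValue;
                return old;
            }
        };
    }
}
```

The cost of this approach is the restriction mentioned above: a private field would not be visible to the generated subclass, so annotated members would need at least package-private access.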

Related: #750, #1003

kristofdho commented 3 years ago

Hi @remkop, is there any progress on this? Do you maybe have an estimate of when you would get around to implementing this? We are using picocli in combination with GraalVM native-image. We have multiple entrypoints with different sets of supported commands pulled from a shared pool. However, with the current config generation, all commands are seen as reachable by the native-image compiler. With the suggested codegen, automatic reachability analysis would correctly remove all unused commands, reducing our final executable size.

remkop commented 3 years ago

Hi @kristofdho, I am not currently working on this and I don't see myself working on it in the near future, so I cannot give any estimate.

However, this message that was posted on the picocli mailing list may be relevant, so let me copy it here:

On Tuesday, July 6, 2021 at 9:14:03 PM UTC+9 quin...@gmail.com wrote:

Hi,

Just in case this might interest anyone, I created an annotation processor that takes an existing project that uses PicoCli annotations and generates code that uses the PicoCli API to recreate the same model:

https://github.com/quintesse/jbang/tree/picocli_annoproc

The code isn't finished because it only implements the PicoCli features we were using in our own project (but we're using quite a number). But it might be useful as a starting point for someone who'd like to do something similar.

Also the code quality is PoC-level because I was only trying to see if model creation time would improve using the API (in our case PicoCli setup is 3/4 of our app startup time so finding a way to improve this is pretty important to us). Unfortunately the gains were pretty minimal so I'll not be working on this any further.

The code for the annotation processor can be found in ... drum roll... /annotation.

Cheers! -Tako (Jbang contributor)

I currently have very little time to spend on picocli, but if this is something you want to work on, we can look at bringing this to production quality and integrating this into the picocli project somehow.

kristofdho commented 3 years ago

@remkop Thank you for the quick reply.

For now, our best option would probably be writing a feature that programmatically registers the required reflection configuration based on reachability hooks.

However, could you shed some light on when the reflective lookup calls happen? The CommandLine class is, to put it lightly, quite complex. From what I could follow, it looks like Field references get stored in the CommandLine instance while parsing the annotations. For simply running CommandLine#parseArgs and CommandLine#usage calls after the setup, are there any reflective lookups required? If not, we could simply cache the CommandLine instance at build time and we wouldn't need any configuration at all, removing all setup overhead as well.

remkop commented 3 years ago

@kristofdho Reflection happens at two times:

- at initialization time, to create the model (the hierarchy of CommandSpec, OptionSpec, etc. instances)
- during parsing (parseArgs), after options and positional parameters were matched: the IGetter and ISetter bindings that picocli generates use reflection to set the value of annotated fields or invoke the annotated methods

An annotation processor could avoid both usages: the processor could generate code that creates the model. The IGetter and ISetter for each option and positional parameter could simply set the annotated field value (or invoke the annotated method) programmatically without reflection.
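To make the difference in the second point concrete, here is a rough sketch that contrasts a reflection-based setter with a direct one. `ValueSetter` is a simplified stand-in for picocli's ISetter, and `Settings` and `SetterBindings` are hypothetical names. This is not picocli's actual binding code:

```java
// Simplified stand-in for picocli's ISetter (hypothetical signature).
interface ValueSetter {
    void set(Object value) throws Exception;
}

class Settings {
    // would carry @Option(names = "-n") in a real application
    int count;
}

class SetterBindings {
    // Reflection-based binding: the field is looked up by name at run time,
    // so a native image needs reflection configuration for Settings.count.
    static ValueSetter reflective(Settings target, String fieldName) throws Exception {
        java.lang.reflect.Field field = Settings.class.getDeclaredField(fieldName);
        field.setAccessible(true);
        return value -> field.set(target, value);
    }

    // Binding that a processor could generate: a plain assignment that the
    // compiler checks and that static reachability analysis can follow.
    static ValueSetter direct(Settings target) {
        return value -> target.count = (Integer) value;
    }
}
```

Both bindings have the same effect on the target object. Only the reflective one depends on metadata that must be retained at run time.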

The usage methods do not need to use reflection if the default value for an option can be obtained without calling the IGetter of the option. So, @Option(names = "-a", defaultValue = "abc") would be fine.

It should be possible to reuse a single CommandLine instance, but there are some edge cases (like #1010) where there may be issues.

mikehearn commented 2 years ago

May I ask a possibly stupid question - has anyone profiled PicoCLI startup time to determine that reflection actually is the problem? I'm a bit disturbed by the comment above saying that even with compile-time generated specs, startup time didn't improve much. Why would that be? Where is the time actually going?

remkop commented 2 years ago

@mikehearn That is a good question. I personally have not spent time profiling. I believe some others have, notably here: https://github.com/remkop/picocli/issues/1377

> (...) comment above saying that even with compile-time generated specs, startup time didn't improve much.

Can you link to that comment? I cannot find it.

mikehearn commented 2 years ago

> Also the code quality is PoC-level because I was only trying to see if model creation time would improve using the API (in our case PicoCli setup is 3/4 of our app startup time so finding a way to improve this is pretty important to us). Unfortunately the gains were pretty minimal so I'll not be working on this any further.

remkop / picocli

Running annotation-based picocli applications without runtime reflection #539