omni-us / jsonargparse

Implement minimal boilerplate CLIs derived from type hints and parse from command line, config files and environment variables
https://jsonargparse.readthedocs.io
MIT License
322 stars 47 forks source link

Expose instantiator API #547

Closed wrongu closed 1 week ago

wrongu commented 3 months ago

🚀 Feature request

Expose instantiator logic outside of jsonargparse:

from jsonargparse import instantiate

spec = {
    "class_path": "foo.Bar",
    "init_args": {"a": 1, "b": 2, "c": 3}
}

my_bar = instantiate(spec)

Motivation

My workflow is, I think, a common one: I use the CLI to run a job and save metadata. I later want to programatically restore objects from that run. A very natural way to do this is to load the saved config and create new object instances from that config.

Alternatives

  1. I've hacked my own for now. It's ugly and duplicates existing function hidden inside jsonargparse:

    def instantiate(model_class: str, init_args: dict) -> object:
        """Take a string representation of a class and a dictionary of arguments and instantiate the
        class with those arguments.
        """
        model_package, model_class = model_class.rsplit(".", 1)
        cls = getattr(importlib.import_module(model_package), model_class)
        # inspect the signature of the class constructor and check types
        sig = inspect.signature(cls.__init__)
        for param_name, param_value in init_args.items():
            param_type = sig.parameters[param_name].annotation
            if param_type == inspect.Parameter.empty:
                raise ValueError(f"Parameter {param_name} is missing annotation in {cls}")
            if not isinstance(param_value, param_type):
                try:
                    # Attempt to cast the parameter to the correct type
                    init_args[param_name] = param_type(param_value)
                except Exception as e:
                    raise ValueError(
                        f"Parameter {param_name} should be of type {param_type} but got {param_value}"
                    ) from e
        return cls(**init_args)
  2. An alternative is to switch to Hydra, which supports this with hydra.utils.instantiate.
  3. There are tests which go dict --> CLI arguments --> parser --> object. This is not a great solution either.
mauvilsa commented 3 months ago

Thanks for the proposal!

My first question would be if you truly need a function that instantiates whatever it gets. I am not sure this is a good idea. This kind of opens the door to a config injection vulnerability.

If you don't require a function that instantiates anything, then the normal jsonargparse flow works:

parser (from function/class signature) -> parse -> instantiate

That is, you can implement your own instantiator function for example like:

def instantiate(spec: dict) -> BaseClass:
    parser = ArgumentParser(exit_on_error=False)
    parser.add_argument("--cls", type=BaseClass)
    cfg = parser.parse_object({"cls": spec})
    init = parser.instantiate_classes(cfg)
    return init.cls

This would instantiate any subclass of BaseClass.

wrongu commented 3 months ago

That is much much better than my solution. Thanks!

Still, I'd propose including your instantiate method in the package, or including that snippet in the docs for those like me googling "can jsonargparse do the equivalent of hydra.utils.instantiate?"

wrongu commented 1 week ago

Closing this. Here's what I ended up with:

import importlib
import jsonargparse

def instantiate(model_class: str, init_args: dict) -> object:
    """Take a string representation of a class and a dictionary of arguments and instantiate the
    class with those arguments.
    """
    model_package, model_class = model_class.rsplit(".", 1)
    cls = getattr(importlib.import_module(model_package), model_class)

    parser = jsonargparse.ArgumentParser(exit_on_error=False)
    parser.add_class_arguments(cls, nested_key="obj", instantiate=True)
    parsed = parser.parse_object({"obj": init_args})
    return parser.instantiate_classes(parsed).obj