omni-us / jsonargparse

Implement minimal boilerplate CLIs derived from type hints and parse from command line, config files and environment variables
https://jsonargparse.readthedocs.io
MIT License
319 stars 47 forks source link

Nested class instantiation fails for classes with typehints unsupported by `jsonargparse` #337

Closed speediedan closed 1 year ago

speediedan commented 1 year ago

πŸ› Bug report

When attempting to instantiate a valid class configuration with a class that has typehints unsupported by jsonargparse (in my application's case, ForwardRefs), the valid configuration is rejected because the unsupported typehints cannot be validated.

Though the user could technically subclass the problematic class to remove the typehints or attempt to workaround the issue in a variety of ways, I think the default behavior (or at least optionally allowable behavior) should be to relax unsupported typehints rather than reject their use entirely.

In the context of attempting to instantiate an arbitrary class, I think the typical user would prefer flexibility over guaranteed typehint validation. If not the default behavior, I think toggling this behavior with either an environment variable or command-line option should be allowed since working around such issues can otherwise be unnecessarily unwieldy. This is arguably more of a feature request than a bug report feel free to change the title/classification as you deem appropriate. Thanks again for all you work on this valuable package!

To reproduce

I've written a test that can be used both to reproduce the issue and validate one approach to solving it (described below).

  1. Add the relevant test to desired test module:
    
    class NestedWithParamsUnsuppTypehint:
    def __init__(self, p1: int, inner_mod: Optional["InnerClassFwdRef"]):  # noqa: F821
        self.p1 = p1
        self.inner_mod = inner_mod

class InnerClassFwdRef: def init(self, p2: int): self.p2 = p2

def test_add_class_nested_with_unsupported_typehints(parser, logger): parser.logger = logger parser.add_argument("--cfg", action=ActionConfigFile) config = f"""outer: class_path: {name}.NestedWithParamsUnsuppTypehint init_args: p1: 2 inner_mod: class_path: {name}.InnerClassFwdRef init_args: p2: 3 """ parser.add_argument("--outer", type=Any) config_path = Path("config.yaml") config_path.write_text(config) with capture_logs(logger) as logs: cfg = parser.parse_args([f"--cfg={config_path}"]) assert not 'No action for destination key "inner_mod" to check its value' in logs.getvalue() assert "Relaxing unsupported type hint: Unsupported type hint typing.Optional[ForwardRef('InnerClassFwdRef')]" in logs.getvalue() init = parser.instantiate_classes(cfg) assert isinstance(init.outer, NestedWithParamsUnsuppTypehint) assert init.outer.p1 == 2 assert isinstance(init.outer.inner_mod, InnerClassFwdRef) assert init.outer.inner_mod.p2 == 3

2. The test should initially fail with the `No action for destination key` error
3. After applying the patch suggested below, the test should succeed. I'm not submitting a PR with this test/POC resolution as I recognize there are many approaches possible to addressing this concern and I suspect you will have a preferred implementation. 
https://github.com/omni-us/jsonargparse/blob/6232d8e7e231b7cf1d7ecfe1d0446a2442ee7967/jsonargparse/_typehints.py#L218 :
```python
        # kwargs["action"] = ActionTypeHint(typehint=typehint, enable_path=enable_path, logger=logger)
        try:
            kwargs["action"] = ActionTypeHint(typehint=typehint, enable_path=enable_path, logger=logger)
        except ValueError as ex:
            logger.debug("Relaxing unsupported type hint from args: " + str(ex))
            kwargs["action"] = ActionTypeHint(typehint=Optional[Any], enable_path=enable_path, logger=logger)
            return args
        return args

Expected behavior

Rather than throwing a No action for destination key "inner_mod" to check its value error, inner_mod should be properly instantiated and the test should pass with the desired user feedback provided indicating an unsupported typehint was relaxed (or removed if that approach is preferred etc.).

Environment

speediedan commented 1 year ago

@mauvilsa Any feedback regarding this? I'm sure you've got a lot on your plate. Thanks again for maintaining this incredibly useful package!

mauvilsa commented 1 year ago

Indeed I have been busy. I do have some comments.

speediedan commented 1 year ago

Indeed I have been busy. I do have some comments.

No worries, thanks for the thoughtful reply!

  • An unwritten principle of jsonargparse has been: if a typehint is defined and is valid, then use it. The code says that the inner_mod parameter accepts InnerClassFwdRef instances. So jsonargparse should use that instead of arbitrarily assigning Any.

After your feedback, returning to the original issue I was debugging (not the simplified pytest repro I wrote but using jsonargparse with the LightningCLI) I noticed the root of the issue was Lightning attempting to reduce overhead at runtime by conditioning some imports with typing.TYPE_CHECKING. I'm going to open an issue on the Lightning side to remove that condition on the import unless you have a better suggestion that could allow them to continue to condition the type imports that way (seems like that modest overhead will be necessary to me)

  • What would be missing in jsonargparse is resolving of forward refs in the types. Actually, this should already work in older python versions (<=3.9) because types are backported and this should take care of the forward ref.
  • jsonargparse now has support for from __future__ import annotations. Maybe in this case it could be recommended to start using future annotations, and avoid ugly source code with types defined as strings.

Nice! I'll definitely keep that in mind, I think Lightning should do that to avoid the ugly strings still being used in many places, not sure if you've discussed that with them but given they aren't supporting Python < 3.7, I don't see a reason they couldn't use that more extensively (once the current version of jsonargparse that supports this can be reasonably set as a minimum dependency).

  • Even if a type is defined as a string, and future annotations is used, the types should be resolved. If this is not working, then that should be fixed.
  • If unsupported types are always assigned Any, then there is no push for higher code quality. If there is a bug in a type, developers should notice and fix the type. If by default all types are accepted, then people will not report unsupported types to improve jsonargparse.

Makes sense, I was thinking an optional trade-off (possibly enabled with an env variable) of providing a user-facing warning to highlight the issue but then relaxing the type constraint and allowing execution was a compromise that could continue offering much of the rigor/value jsonargparse does but with more flexibility in some situations when explicitly requested. I recognize now more how antithetical that approach may be to the core value proposition of jsonargparse though.

Anyway, please let me know if you have any thoughts on a better workaround other than suggesting Lightning remove the typing.TYPE_CHECKING condition I referenced above. If not, I'll go ahead and close this issue and open the relevant one with Lightning. Thanks again!

mauvilsa commented 1 year ago

After your feedback, returning to the original issue I was debugging (not the simplified pytest repro I wrote but using jsonargparse with the LightningCLI) I noticed the root of the issue was Lightning attempting to reduce overhead at runtime by conditioning some imports with typing.TYPE_CHECKING.

Importing types inside a typing.TYPE_CHECKING block is a valid way of doing typing annotations. This is a not-yet-supported case in which jsonargparse should resolve the actual types. Technically it is possible to add support for it.

I'm going to open an issue on the Lightning side to remove that condition on the import unless you have a better suggestion that could allow them to continue to condition the type imports that way (seems like that modest overhead will be necessary to me)

Nice! I'll definitely keep that in mind, I think Lightning should do that to avoid the ugly strings still being used in many places, not sure if you've discussed that with them but given they aren't supporting Python < 3.7, I don't see a reason they couldn't use that more extensively.

I tried to add future annotations to lightning but it is not so easy. First I tried for the entire lightning package and there were many issues. For example one problem was that pydantic does not backport PEP 585 and 604 types in python 3.7-3.9, so things fail. I also tried only for lightning.pytorch which doesn't use pydantic, but this makes torch.jit.script fail, see https://github.com/Lightning-AI/lightning/pull/17779. I gave up after that.

It might be possible to add future annotations to lightning.pytorch more gradually. Like one file at a time, making sure that things don't break.

speediedan commented 1 year ago

Importing types inside a typing.TYPE_CHECKING block is a valid way of doing typing annotations. This is a not-yet-supported case in which jsonargparse should resolve the actual types. Technically it is possible to add support for it.

While waiting for this use case to be supported, do you have any suggested workaround that would allow this fine-tuning profiling example for (finetuning-scheduler), to continue using the LightningCLI? The example was working prior to the introduction of a couple ForwardRef typehints a couple weeks ago so it seems a shame for that minor change to preclude use of the CLI. To workaround the issue currently, I'm just subclassing the relevant class (FSDPStrategy) to adjust the unsupported typehints but would love to know from the CLI guru if there's a better way! :tada:

error: Parser key "trainer.strategy":
  Does not validate against any of the Union subtypes
  Subtypes: (<class 'str'>, <class 'lightning.pytorch.strategies.strategy.Strategy'>)
  Errors:
    - Expected a <class 'str'>
    - Problem with given class_path 'lightning.pytorch.strategies.FSDPStrategy':
        'Configuration check failed :: No action for destination key "activation_checkpointing_policy" to check its value.'

Regarding the future support for this use case, feel free to rename this issue so it better captures the issue from your perspective or let me know if I need to close this issue and open a new one.

Thanks so much again for work/thoughts on this!

I tried to add future annotations to lightning but it is not so easy. First I tried for the entire lightning package and there were many issues. For example one problem was that pydantic does not backport PEP 585 and 604 types in python 3.7-3.9, so things fail. I also tried only for lightning.pytorch which doesn't use pydantic, but this makes torch.jit.script fail, see Lightning-AI/lightning#17779. I gave up after that.

It might be possible to add future annotations to lightning.pytorch more gradually. Like one file at a time, making sure that things don't break.

Makes sense, I see the challenge!

mauvilsa commented 1 year ago

While waiting for this use case to be supported, do you have any suggested workaround that would allow this fine-tuning profiling example for (finetuning-scheduler), to continue using the LightningCLI?

Subclassing is the only workaround that comes to mind. Anyway, adding support for typing.TYPE_CHECKING is not too complex. I will take a look at it today.

speediedan commented 1 year ago

Anyway, adding support for typing.TYPE_CHECKING is not too complex. I will take a look at it today.

That's awesome, should be really useful. Thanks!

mauvilsa commented 1 year ago

Still work in progress, but you can try it out #344.

speediedan commented 1 year ago

Works beautifully, thanks so much for the edifying design discussion and fix! :rocket: