NVIDIA / earth2mip

Earth-2 Model Intercomparison Project (MIP) is a Python framework that enables climate researchers and scientists to inter-compare AI models for weather and climate.
https://nvidia.github.io/earth2mip/
Apache License 2.0
187 stars 41 forks

🐛[BUG]: AFNO "fcn" broken in more recent versions of modulus #75

Closed nbren12 closed 10 months ago

nbren12 commented 11 months ago

Version

main

On which installation method(s) does this occur?

No response

Describe the issue

https://github.com/NVIDIA/modulus/blame/ef54c7a4c3b241c48f48695362c2688d150c6ce5/modulus/models/afno/afno.py#L414 changed the kwargs of AFNO, so loading the fcn package now fails with this error:

__________________________________________________________________________________________________________________________________________________________________________________________________________________ test_run_basic_inference ____________________________________________________________________________________________________________________________________________________________________________________________________________________

    @pytest.mark.slow
    def test_run_basic_inference():
>       time_loop = get_model("e2mip://fcn", device="cuda:0")

tests/test_end_to_end.py:134: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
earth2mip/networks/__init__.py:335: in get_model
    return _load_package_builtin(package, device, name=url.netloc)
earth2mip/networks/__init__.py:282: in _load_package_builtin
    return inference_loader(package, device=device)
earth2mip/networks/fcn.py:64: in load
    core_model = modulus.Module.from_checkpoint(package.get("fcn.mdlus"))
/usr/local/lib/python3.10/dist-packages/modulus/models/module.py:327: in from_checkpoint
    model = cls.instantiate(args)
/usr/local/lib/python3.10/dist-packages/modulus/models/module.py:150: in instantiate
    return _cls(**arg_dict["__args__"])
/usr/local/lib/python3.10/dist-packages/modulus/models/module.py:59: in __new__
    bound_args = sig.bind_partial(
/usr/lib/python3.10/inspect.py:3193: in bind_partial
    return self._bind(args, kwargs, partial=True)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <Signature (self, inp_shape: List[int], in_channels: int, out_channels: int, patch_size: List[int] = [16, 16], embed_d... float = 0.0, num_blocks: int = 16, sparsity_threshold: float = 0.01, hard_thresholding_fraction: float = 1.0) -> None>, args = (None,), kwargs = {'img_size': [720, 1440]}

    def _bind(self, args, kwargs, *, partial=False):
        """Private method. Don't use directly."""

        arguments = {}

        parameters = iter(self.parameters.values())
        parameters_ex = ()
        arg_vals = iter(args)

        while True:
            # Let's iterate through the positional arguments and corresponding
            # parameters
            try:
                arg_val = next(arg_vals)
            except StopIteration:
                # No more positional arguments
                try:
                    param = next(parameters)
                except StopIteration:
                    # No more parameters. That's it. Just need to check that
                    # we have no `kwargs` after this while loop
                    break
                else:
                    if param.kind == _VAR_POSITIONAL:
                        # That's OK, just empty *args.  Let's start parsing
                        # kwargs
                        break
                    elif param.name in kwargs:
                        if param.kind == _POSITIONAL_ONLY:
                            msg = '{arg!r} parameter is positional only, ' \
                                  'but was passed as a keyword'
                            msg = msg.format(arg=param.name)
                            raise TypeError(msg) from None
                        parameters_ex = (param,)
                        break
                    elif (param.kind == _VAR_KEYWORD or
                                                param.default is not _empty):
                        # That's fine too - we have a default value for this
                        # parameter.  So, lets start parsing `kwargs`, starting
                        # with the current parameter
                        parameters_ex = (param,)
                        break
                    else:
                        # No default, not VAR_KEYWORD, not VAR_POSITIONAL,
                        # not in `kwargs`
                        if partial:
                            parameters_ex = (param,)
                            break
                        else:
                            msg = 'missing a required argument: {arg!r}'
                            msg = msg.format(arg=param.name)
                            raise TypeError(msg) from None
            else:
                # We have a positional argument to process
                try:
                    param = next(parameters)
                except StopIteration:
                    raise TypeError('too many positional arguments') from None
                else:
                    if param.kind in (_VAR_KEYWORD, _KEYWORD_ONLY):
                        # Looks like we have no parameter for this positional
                        # argument
                        raise TypeError(
                            'too many positional arguments') from None

                    if param.kind == _VAR_POSITIONAL:
                        # We have an '*args'-like argument, let's fill it with
                        # all positional arguments we have left and move on to
                        # the next phase
                        values = [arg_val]
                        values.extend(arg_vals)
                        arguments[param.name] = tuple(values)
                        break

                    if param.name in kwargs and param.kind != _POSITIONAL_ONLY:
                        raise TypeError(
                            'multiple values for argument {arg!r}'.format(
                                arg=param.name)) from None

                    arguments[param.name] = arg_val

        # Now, we iterate through the remaining parameters to process
        # keyword arguments
        kwargs_param = None
        for param in itertools.chain(parameters_ex, parameters):
            if param.kind == _VAR_KEYWORD:
                # Memorize that we have a '**kwargs'-like parameter
                kwargs_param = param
                continue

            if param.kind == _VAR_POSITIONAL:
                # Named arguments don't refer to '*args'-like parameters.
                # We only arrive here if the positional arguments ended
                # before reaching the last parameter before *args.
                continue

            param_name = param.name
            try:
                arg_val = kwargs.pop(param_name)
            except KeyError:
                # We have no value for this parameter.  It's fine though,
                # if it has a default value, or it is an '*args'-like
                # parameter, left alone by the processing of positional
                # arguments.
                if (not partial and param.kind != _VAR_POSITIONAL and
                                                    param.default is _empty):
                    raise TypeError('missing a required argument: {arg!r}'. \
                                    format(arg=param_name)) from None

            else:
                if param.kind == _POSITIONAL_ONLY:
                    # This should never happen in case of a properly built
                    # Signature object (but let's have this check here
                    # to ensure correct behaviour just in case)
                    raise TypeError('{arg!r} parameter is positional only, '
                                    'but was passed as a keyword'. \
                                    format(arg=param.name))

                arguments[param_name] = arg_val

        if kwargs:
            if kwargs_param is not None:
                # Process our '**kwargs'-like parameter
                arguments[kwargs_param.name] = kwargs
            else:
>               raise TypeError(
                    'got an unexpected keyword argument {arg!r}'.format(
                        arg=next(iter(kwargs))))
E               TypeError: got an unexpected keyword argument 'img_size'
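
For anyone who wants to see the failure without downloading the checkpoint, here is a minimal sketch of the same mechanism. `StandInAFNO` and `try_bind` are hypothetical names made up for illustration, not modulus code; the stand-in only copies the first three parameters from the signature printed above. The sketch binds the old `img_size` keyword against the new signature with `inspect.Signature.bind_partial`, which is the call that raises in the traceback.

```python
import inspect
from typing import List, Optional


class StandInAFNO:
    """Hypothetical stand-in for the newer AFNO constructor (not the real class)."""

    def __init__(self, inp_shape: List[int], in_channels: int, out_channels: int) -> None:
        self.inp_shape = inp_shape
        self.in_channels = in_channels
        self.out_channels = out_channels


def try_bind(cls: type, **kwargs) -> Optional[str]:
    """Bind kwargs against cls.__init__ and return the error text, or None on success."""
    sig = inspect.signature(cls.__init__)
    try:
        # None stands in for `self`, matching `args = (None,)` in the traceback.
        sig.bind_partial(None, **kwargs)
    except TypeError as exc:
        return str(exc)
    return None


# Keyword stored in the old checkpoint: rejected by the new signature.
print(try_bind(StandInAFNO, img_size=[720, 1440]))
# Keyword expected by the new signature: binds without error.
print(try_bind(StandInAFNO, inp_shape=[720, 1440]))
```

The first call reports the unexpected `img_size` keyword and the second returns `None`, which suggests the checkpoint's stored arguments, rather than the weights, are what no longer match.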

Environment details

nvidia-modulus @ https://github.com/nbren12/modulus/archive/refs/heads/tisr.tar.gz#sha256=84f7044c4cb030d30aa1b4a55ca25af90fe7bdb45252c3e7738896274dcea609

NickGeneva commented 11 months ago

@nbren12 I can take this since this was on our radar. We'll need to get a revised checkpoint on NGC, was just expecting to have a few more weeks to do it. Should not happen again.

https://github.com/NVIDIA/modulus/pull/104
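
Until a revised checkpoint is published, one possible stopgap is to rename the legacy key in the stored constructor arguments before the model is instantiated. The following is only a sketch under assumptions: the `img_size` to `inp_shape` mapping is inferred from the traceback in this issue and is not confirmed as the only change, and `LEGACY_KWARG_NAMES` and `remap_legacy_kwargs` are hypothetical helpers, not part of modulus or earth2mip.

```python
from typing import Any, Dict

# Assumed old -> new keyword names; only this pair is suggested by the traceback.
LEGACY_KWARG_NAMES = {"img_size": "inp_shape"}


def remap_legacy_kwargs(args: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of args with legacy keys renamed to their assumed new names."""
    remapped: Dict[str, Any] = {}
    for key, value in args.items():
        new_key = LEGACY_KWARG_NAMES.get(key, key)
        # Refuse to silently overwrite a value already given under the new name.
        if new_key != key and new_key in args:
            raise ValueError(f"both {key!r} and {new_key!r} were provided")
        remapped[new_key] = value
    return remapped


old_args = {"img_size": [720, 1440], "in_channels": 26}  # illustrative values only
print(remap_legacy_kwargs(old_args))
```

Where such a helper would hook into the checkpoint loading path is left open here; regenerating the checkpoint, as proposed above, avoids the need for it.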

nbren12 commented 11 months ago

No rush. We can address stuff like this before the release. We should re-run all the notebooks/examples. There may be other small bugs.

(Reply sent by email, quoting Nicholas Geneva's comment of Thursday, October 19, 2023 at 3:15 PM.)