rstudio / vetiver-python

Version, share, deploy, and monitor models.
https://rstudio.github.io/vetiver-python/stable/
MIT License
60 stars 18 forks source link

Error importing vetiver.handlers when torch not installed #67

Closed machow closed 2 years ago

machow commented 2 years ago

Describe the bug

Because vetiver's TorchHandler class tries to set base_class = torch.nn.Module when it is defined, it cannot be imported without torch. Since it is imported in vetiver.handlers, not having torch causes an error on import.

To Reproduce

From vetiver repo

python -m venv env
source env/bin/activate

pip install .

from python

import vetiver.handlers

output:

>>> import vetiver.handlers
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/machow/repos/vetiver-python/vetiver/__init__.py", line 9, in <module>
    from .vetiver_model import VetiverModel
  File "/Users/machow/repos/vetiver-python/vetiver/vetiver_model.py", line 3, in <module>
    from vetiver.handlers._interface import create_handler
  File "/Users/machow/repos/vetiver-python/vetiver/handlers/_interface.py", line 1, in <module>
    from vetiver.handlers import torch, sklearn, base
  File "/Users/machow/repos/vetiver-python/vetiver/handlers/torch.py", line 13, in <module>
    class TorchHandler(VetiverHandler):
  File "/Users/machow/repos/vetiver-python/vetiver/handlers/torch.py", line 22, in TorchHandler
    base_class = torch.nn.Module
NameError: name 'torch' is not defined
machow commented 2 years ago

This is my bad, for suggesting the base_class = approach 😬. If you could set base_class = <actual model OR a dummy one> that should do it. Or maybe some other test than using isinstance in the handler?

machow commented 2 years ago

There are probably better ways to do this--but in case it's useful, here are a couple possible workarounds I mentioned while pairing....!

The crux of the challenge is that to register with a singledispatch generic, we need to reference the class being dispatched on (e.g. torch.nn.Module).

from functools import singledispatch
from vetiver.handlers.torch import TorchHandler

@singledispatch
def create_handler(model, ptype_data):
    raise NotImplementedError()

# base_class is a torch.nn.Module
create_handler.register(TorchHandler.base_class, TorchHandler)

However, if we don't have torch installed, then we'll fail to import TorchHandler, because it will try to set base_class = torch.nn.Module when it's defined.

It seems like there are 3 potential fixes:

  1. (easiest?): use a try/except for the importing of the handler, rather than inside the handler file
  2. define a register(cls, dispatcher) classmethod on handlers, so they can register themselves
  3. define each handler as a namespace package, with an entrypoint

Approach 1: move try/except outward

handlers/torch.py:

# note: no try except in this file anymore
import torch

class TorchHandler:
    base_class = torch.nn.Module

handlers/_interface.py:

try:
    from vetiver.handlers.torch import TorchHandler
    create_handler.register(TorchHandler.base_class, TorchHandler)
except ImportError:
    pass

# additional try/except for sklearn

# additional try/except for xgboost

Alternatively, you could write a function to do the try/except

import importlib

def try_register(dispatcher, handler_module_name:str):
    try:
        mod = importlib.import_module(handler_module_name)

        # note: each handler module would need to define a variable named e.g. handler,
        # that is its handler class, so this function doesn't need to know the class's name
        dispatcher.register(mod.handler.base_class, mod.handler)
    except ImportError:
        pass

try_register(create_handler, "vetiver.handlers.torch")

Approach 2: define a register method on handler

For this approach, each handler's module makes sure it can be imported, even if the dep doesn't exist. The added piece is the handler is responsible for registering itself on create_handler (double dispatch).

handlers/torch.py

torch_exists=True
try:
    import torch
except:
    torch_exists=False

class TorchHandler:
    # no base_class, so won't error if torch isn't installed

    @classmethod
    def register(cls, dispatcher):
        if torch_exists:
            dispatcher.register(torch.nn.Module, cls)

handlers/_interface.py

from torch import TorchHandler

TorchHandler.register(create_handler)
has2k1 commented 2 years ago

This issue is the result of juggling 2 things.

  1. Class definition that should be aware of a potentially unavailable type
  2. Object instantiation that depends on the type of the argument

No. 2, hints at using singledispatch, though this use of singledispatch is interesting because the class already knows about the type on which it specialises. So when every class is created, it has to be registered/associated with the type that it already knows about. This extra bookkeeping can be avoided, though the solution is metaprogramming!

For this to be worth it the payoff in usability should outweigh the complication of introducing metaprogramming. It should be a few lines of code, but maybe too much magic!

has2k1 commented 2 years ago

@isabelizimm, here is the development of the metaprogramming solution.

machow commented 2 years ago

I think your piece may be tackling a second problem: automating registering with create_handler on class creation.

So if I understand, it looks like the metaprogramming proposal:

In this sense, I think it's close to the second option I mentioned (define a register method on handler), with the added piece of handling the registration automatically!

One quick thought is that I think you could get away with using __init_subclass__() on the base VetiverHandler, to dodge some of the intensity that metaclasses bring.

has2k1 commented 2 years ago

One quick thought is that I think you could get away with using __init_subclass__() on the base VetiverHandler, to dodge some of the intensity that metaclasses bring.

Perfect, on the money. I had never used it. I have updated the gist.

isabelizimm commented 2 years ago

Oh wow, thank you both SO MUCH for this very thorough explanation of how to fix this! 😮 This gist is so handy on understanding how everything works (and when it doesn't). I don't mind taking this one up after the monitoring is finished, unless someone is feeling very passionate about it!