koaning / uvtrick

A fun party trick to run Python code from another venv into this one.
MIT License
148 stars 5 forks source link

Imports outside of function body seem to be ignored, e.g. decorators and type hints #21

Closed mikeweltevrede closed 1 month ago

mikeweltevrede commented 1 month ago

Minimum reproducible example:

from typing import Optional, override

from uvtrick import Env

def no_externals(hello: str):
    return None

def with_typehint(hello: Optional[str]):
    return None

@override
def typing_extensions_decorator(hello: str):
    return None

When we use no externals, like decorators or type hints, the code works fine:

Env("pandas==1.4.2", python="3.9").run(no_externals, "world")
"""
Successful
"""

When we use a type hint (first example) or a decorator (second example), it fails on a NameError because those are not being imported in the temporary scripts:

Env("pandas==1.4.2", python="3.9").run(with_typehint, "world")
"""
Traceback (most recent call last):
  File "/var/folders/p2/cql_03s12nj3t3d56xb48vv80000gn/T/tmpb8ydeziz/pytemp.py", line 1, in <module>
    def dummy1(hello: Optional[str]):
NameError: name 'Optional' is not defined
Traceback (most recent call last):
  File "/Users/mikeweltevrede/Documents/Development/caap8-dspe-model-generator/temp.py", line 19, in <module>
    Env("pandas==1.4.2", python="3.9").run(dummy1, "world")
  File "/Users/mikeweltevrede/Documents/Development/caap8-dspe-model-generator/.venv/lib/python3.9/site-packages/uvtrick/__init__.py", line 125, in run
    with open(temp_pickle_path, 'rb') as file:
FileNotFoundError: [Errno 2] No such file or directory: '/var/folders/p2/cql_03s12nj3t3d56xb48vv80000gn/T/tmpb8ydeziz/tmp.pickle'
"""
Env("pandas==1.4.2", python="3.9").run(typing_extensions_decorator, "world")
"""
Traceback (most recent call last):
  File "/var/folders/p2/cql_03s12nj3t3d56xb48vv80000gn/T/tmpw3tgig7v/pytemp.py", line 1, in <module>
    @override
NameError: name 'override' is not defined
Traceback (most recent call last):
  File "/Users/mikeweltevrede/Documents/Development/caap8-dspe-model-generator/temp2.py", line 12, in <module>
    Env("pandas==1.4.2", "typing_extensions==4.0.0", python="3.9").run(dummy_typing_extensions_decorator, "world")
  File "/Users/mikeweltevrede/Documents/Development/caap8-dspe-model-generator/.venv/lib/python3.9/site-packages/uvtrick/__init__.py", line 125, in run
    with open(temp_pickle_path, 'rb') as file:
FileNotFoundError: [Errno 2] No such file or directory: '/var/folders/p2/cql_03s12nj3t3d56xb48vv80000gn/T/tmpw3tgig7v/tmp.pickle'
"""
koaning commented 1 month ago

Interesting! I could argue that this is somewhat out of scope for this project, it is meant as a hack/party trick more than anything else. But I am wondering what might be done to facilitate this. After all ... if a hack is worth doing ... it might be worth overdoing!

Perhaps to better understand the use-case, what is your goal with these decorators/imports? The reason it is not supported at the moment is because this project uses the inspect technique to fetch the body of the function as a string.

mikeweltevrede commented 1 month ago

Definitely agree :)

I have two use cases that your "party trick" (😉) will be applicable to:

  1. I have a class that checks whether the installed libraries match the ones that were installed when a certain ML model that we load in was trained. If not, throw an error. Especially here for unit testing, we don't want to rely on the libraries that are installed in the repo, because then the unit test either needs to be updated every time a dependency is updated or it needs to dynamically access the version (e.g. pd.__version__) which seems less practical than a self-contained unit test that defines the environment.
  2. In a similar vein as the above, I am running a script in which we need to compare an ML model in its training workflow to a previous or baseline model, which will have separate dependencies. I have tried several alternatives that do not work (most notably mlflow.evaluate) and this seemed like a nice option.

As per the question on "what is your goal with these decorators/imports", I am not sure if there is a reason not to have them. Both add functionality to a function with their own benefits. Type hinting with external libraries is useful in Python versions less than 3.10 (most common type hints are covered in 3.10+ as built-in, such as str | None instead of Optional[str], although e.g. Literal is not). Decorators I think speak for themselves. This would most commonly apply to lru_cache, for instance.

PS: Happy to contribute. I checked out your library after your video and quickly tried it out towards the end of the working day. Perhaps there are options within the inspect module but I'd have to deepdive :)

koaning commented 1 month ago

So about that lru_cache ...

def func(X, y):
    from sklearn.linear_model import LinearRegression
    return LinearRegression().fit(X, y)

Suppose that I cache this function and want to run it across environments ... shouldn't the dependencies and Python version be part of the input? Otherwise the function would always return the same thing, even if the Env object wants to call it from another uv environment.

The issue with having a library that is totally doing a hacky thing, is that there are also limits that one will hit very quickly. If you want to use this trick, I fear you would also need to write Python differently.

mikeweltevrede commented 1 month ago

Good point, I admit it is interesting to import within a more local scope. I have always been taught to (in general) import in the global scope. Of course, this is necessary for decorators and type hints so in that case, you cannot put them as part of the input.

It might be possible to dynamically get where the method/class comes from. If I have the following:

import pandas as pd
from sklearn.linear_model import LinearRegression

def func(X: pd.DataFrame, y: pd.Series) -> LinearRegression:
    return LinearRegression().fit(X, y)

then I could first check the local scope of the function for the import of LinearRegression (because it is called in the function). Secondly, from the type hints, I already know from the above that I have to search for pd and LinearRegression in a more global scope.

Conclusively: I agree that this trick shouldn't need to apply to every use case, at least not in the current stage. But if more people are using this, it might be an interesting new project that expands below its original "hacky" intent. Let me know in case I can assist when you do want to expand :)

koaning commented 1 month ago

I have always been taught to (in general) import in the global scope.

The art of proficiency is to learn the rules and then to learn when to ignore them ;).

If you come up with a relatively maintainable way to expand on this trick I am all ears. But I would also remember that this is a package that receives 10 downloads a day, so it may also be fine to invest in other ideas first.