ContextLab / davos

Import packages in Python, even if they aren't installed!
MIT License

Per notebook requirements.txt #96

Open yarikoptic opened 1 year ago

yarikoptic commented 1 year ago

Would davos be something that could help us set up a per-notebook venv? E.g., in https://github.com/dandi/example-notebooks/ we have some notebooks with their own requirements.txt files:

❯ find -iname requirements.txt
./000055/BruntonLab/peterson21/requirements.txt
./000006/DataJoint/DJ-NWB-Economo-2018/requirements.txt
./000009/DataJoint/DJ-NWB-Guo-Inagaki-2017/requirements.txt
./000010/DataJoint/DJ-NWB-Li-Daie-2015-2016/requirements.txt
./000013/DataJoint/DJ-NWB-Hires-Gutnisky-2015/requirements.txt

and it would be great if we could just use something like

import davos
davos.configure(requirements='requirements.txt')

and then davos would do the magic to ensure that all subsequent imports happen from an environment configured according to that requirements.txt. Is that possible?

backref on issue which inspired the question: https://github.com/dandi/handbook/pull/95

paxtonfitzpatrick commented 1 year ago

@yarikoptic yes this is possible! But it works slightly differently from how you described -- the idea is that davos replaces your requirements.txt and manages the virtual environment itself, so the notebook can be run and/or shared with someone else with no setup or configuration required before running it.

E.g., if the requirements.txt for the notebook's environment contains:

numpy==1.22.0
pandas==2.1.0
requests==2.27.1

you could get rid of that file entirely and replace your import statements in the notebook with:

%pip install davos
import davos
smuggle numpy as np     # pip: numpy==1.22.0
smuggle pandas as pd    # pip: pandas==2.1.0
smuggle requests        # pip: requests==2.27.1

When the notebook is run, davos will automatically install the required packages in a notebook-specific environment (if the user doesn't have them already) and load them from there. These notebook-specific environments that davos creates and manages are called "projects", and live in $HOME/.davos/projects/. By default, each environment will be named after the absolute path to the notebook that uses it, in order to make them notebook-specific. However, if you have multiple notebooks that use the same set of requirements, you can have them share an environment by setting:

davos.project = "name-of-shared-project"

in each notebook, before your smuggle statements.
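
For a repository that already has pinned requirements.txt files, the smuggle lines could be drafted mechanically. The following is a rough sketch only: `requirements_to_smuggles` and its `import_names` parameter are hypothetical helpers, not part of davos, and the sketch assumes the importable module name matches the distribution name unless a mapping is supplied (which is not true for packages such as scikit-learn). It also does not generate aliases like `as np`.

```python
import re


def requirements_to_smuggles(requirements_text, import_names=None):
    """Hypothetical helper (not part of davos): draft smuggle lines
    from the text of a requirements.txt file."""
    import_names = import_names or {}
    lines = []
    for raw in requirements_text.splitlines():
        spec = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not spec or spec.startswith("-"):  # skip blanks and pip options (-r, -e, ...)
            continue
        # Distribution name: everything before the first version operator,
        # extras bracket, environment marker, or whitespace.
        dist = re.split(r"[<>=!~;\[\s]", spec, maxsplit=1)[0]
        # Assumption: module name == distribution name unless overridden.
        module = import_names.get(dist, dist.replace("-", "_"))
        lines.append(f"smuggle {module}    # pip: {spec}")
    return "\n".join(lines)


example = """numpy==1.22.0
pandas==2.1.0
requests==2.27.1
"""
print(requirements_to_smuggles(example))
```

The printed lines would still need to be pasted into the notebook after `import davos` and reviewed by hand, since the name mapping is a guess.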

yarikoptic commented 1 year ago

that is what I was hoping to avoid -- needing to modify import statements ;-)

jeremymanning commented 1 year ago

i bet "someone" could write a script to:

jeremymanning commented 1 year ago

I did something like this for the neuromatch materials-- e.g. here's a script for going through a bunch of folders and adding an "install davos" cell (plus some other stuff...but you could lightly tweak it to do what you want): https://github.com/ContextLab/course-content/blob/main/chatify/process_notebooks.py
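
A cell-by-cell rewrite of the kind discussed in this thread might look roughly like the sketch below. It is not the linked process_notebooks.py script: `smuggle_line` and `smuggle_notebook` are hypothetical names, the notebook is treated as plain JSON, and the matching is line-based, so a line inside a multi-line string that happens to start with `import` would also be rewritten. Relative imports and `import davos` itself are left alone.

```python
import json
import re

# "import x ..." at the start of a line, except "import davos" itself.
_PLAIN = re.compile(r"^(\s*)import(\s+)(?!\s*davos\b)")
# "from pkg.mod import ..." (absolute imports only).
_FROM = re.compile(r"^(\s*from\s+\w[\w.]*\s+)import(\s+)")


def smuggle_line(line):
    """Rewrite one source line, swapping the import keyword for smuggle."""
    line = _PLAIN.sub(r"\1smuggle\2", line, count=1)
    return _FROM.sub(r"\1smuggle\2", line, count=1)


def smuggle_notebook(nb):
    """Rewrite every code cell of a notebook dict in place and return it."""
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # leave markdown and raw cells untouched
        src = cell.get("source", [])
        if isinstance(src, str):  # source may be one string or a list of lines
            cell["source"] = "".join(
                smuggle_line(l) for l in src.splitlines(keepends=True)
            )
        else:
            cell["source"] = [smuggle_line(l) for l in src]
    return nb


demo = {"cells": [
    {"cell_type": "markdown", "source": ["import this stays as prose\n"]},
    {"cell_type": "code", "source": [
        "import davos\n",
        "import numpy as np\n",
        "from pandas import DataFrame\n",
    ]},
]}
print(json.dumps(smuggle_notebook(demo), indent=1))
```

On a real .ipynb file, the dict would come from `json.load` and be written back with `json.dump`. Version pins would still have to be added as `# pip: ...` comments separately.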

yarikoptic commented 1 year ago

> loop through each cell and replace all instances of ' import ' with ' smuggle ' and i suspect it'd work pretty well!

but that is what I was hoping to avoid, with just some top-level magic that would e.g. shim the import functionality so that it smuggles the corresponding version. This way the rest of the script would remain as is: pure regular Python. FWIW, here we did some similarly evil shimming of import for the purpose of collating what to cite: https://github.com/duecredit/duecredit/blob/8df619316dc0281e1257ac1628190ff9b41c27b1/duecredit/injections/injector.py#L64
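
To make the idea concrete, here is a minimal sketch of one way such a shim could be built with Python's standard import machinery. It is not davos's API and not the duecredit implementation: `PinnedImportHook`, `pins`, and the `ensure` callback are hypothetical, and the callback here only records what it was asked for instead of installing anything. A stdlib module stands in for a third-party package so the example runs anywhere.

```python
import importlib.abc
import sys


class PinnedImportHook(importlib.abc.MetaPathFinder):
    """Hypothetical shim: calls `ensure` before a pinned package is first imported."""

    def __init__(self, pins, ensure):
        self.pins = pins      # top-level module name -> requirement string
        self.ensure = ensure  # callback that would install/activate the pin
        self._seen = set()

    def find_spec(self, fullname, path=None, target=None):
        top = fullname.partition(".")[0]
        if top in self.pins and top not in self._seen:
            self._seen.add(top)
            self.ensure(top, self.pins[top])
        return None  # never load anything ourselves; defer to the normal finders


requested = []
hook = PinnedImportHook(
    {"colorsys": "demo-pin"},  # stand-in pin for illustration
    lambda name, spec: requested.append((name, spec)),
)
sys.meta_path.insert(0, hook)
try:
    sys.modules.pop("colorsys", None)  # force a fresh lookup for the demo
    import colorsys                    # an ordinary import statement triggers the hook
finally:
    sys.meta_path.remove(hook)

print(requested)
```

Finders on `sys.meta_path` are only consulted for modules not already in `sys.modules`, so a shim like this would have to be installed before the notebook's first imports. A real `ensure` would also need to install into the right environment and make it importable, which is the part davos handles through `smuggle`.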

jeremymanning commented 1 year ago

we were generally uncomfortable with the idea of "replacing" or "co-opting" an existing keyword, hence the current design. but in principle i agree that it'd be neat to retain the same code in some circumstances. we chose to prioritize making it explicit when things were being run through davos vs. when the "original" keyword was being used.