Closed alekseik1 closed 1 month ago
Use absolute imports for --queue to work properly.
This is the opposite of what --queue (and dvc in general) expects. dvc is built to work relative to your repo. Your dvc.yaml should look like this:
stages:
  main:
    cmd: python main.py
    deps:
      - main.py
      - dep.py
That should fix your problem: dvc makes a copy of the repo in a temp directory to run the queue, so relative dependencies resolve to the copies inside that temp directory. Right now you are always reading from those absolute paths, which makes the temp directory pointless.
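To illustrate the point about relative versus absolute paths, here is a minimal sketch in plain Python. It does not use dvc at all: `shutil.copytree` is only a stand-in for the temp copy described above, and the directory names `workspace` and `queued_copy` as well as the helper `read_both_ways` are hypothetical.

```python
import os
import shutil
import tempfile
from pathlib import Path


def read_both_ways(workspace: Path, copy: Path) -> tuple:
    """Read dep.py while running inside `copy`, by relative and by absolute path."""
    absolute_dep = workspace / "dep.py"  # pinned to the original workspace
    previous_cwd = os.getcwd()
    os.chdir(copy)  # pretend the stage command runs inside the copy
    try:
        relative = Path("dep.py").read_text()  # resolved against the copy
        absolute = absolute_dep.read_text()    # still points at the workspace
    finally:
        os.chdir(previous_cwd)
    return relative, absolute


with tempfile.TemporaryDirectory() as root:
    workspace = Path(root) / "workspace"
    workspace.mkdir()
    (workspace / "dep.py").write_text("my_str = 'queue'\n")

    # Stand-in for the copy taken when the experiment is queued (assumption).
    copy = Path(root) / "queued_copy"
    shutil.copytree(workspace, copy)

    # The workspace is reverted after queuing, as in the report.
    (workspace / "dep.py").write_text("my_str = 'main'\n")

    relative, absolute = read_both_ways(workspace, copy)

print("relative path sees:", relative.strip())
print("absolute path sees:", absolute.strip())
```

In this sketch the relative read sees the queued state, while the absolute read sees whatever the workspace holds at run time.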
Thanks for the reply! Indeed, I changed the paths to relative and now everything works as expected. I still have problems with my original setup though.
My original setup uses poetry and dvc. Perhaps the problem is that I use cmd: poetry run python my_script.py and poetry uses files from the "local package" rather than from the file system (since I did not pass the --no-root option). These "local package" files are taken from the workspace (symlinks, maybe?).
Gonna do some more digging and come back.
@dberenbaum I found a quite subtle bug when using imports and poetry - that seems to be the root of the problem. The setup is quite complicated so I pushed it to a small repo here with steps to reproduce.
It seems like dvc updates PYTHONPATH in a way that does not match package-level imports like import my_package.my_module as m but works fine with import my_module as m.
Could you please check out this repo and see if this bug persists on your machine?
This looks like the expected behavior. import my_package.my_module is loading the package from its installed location. import my_module is loading the local package relative to your current directory.
To work as you expect, you would need to run poetry install inside your dvc pipeline to install all current local packages as part of the experiment.
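The difference between the two import forms can be sketched with plain `sys.path` manipulation. This is an assumption-heavy illustration, not a reproduction of the linked repo: the layout (a script living inside `my_package/`), the directory names `workspace` and `queued_copy`, and the helper `make_tree` are all hypothetical. An entry appended to `sys.path` plays the role of the installed location that still points at the workspace.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_tree(root: Path, value: str) -> Path:
    """Create root/my_package/{__init__.py,my_module.py} and return the package dir."""
    pkg = root / "my_package"
    pkg.mkdir(parents=True)
    (pkg / "__init__.py").write_text("")
    (pkg / "my_module.py").write_text(f"my_str = {value!r}\n")
    return pkg


with tempfile.TemporaryDirectory() as tmp:
    workspace_pkg = make_tree(Path(tmp) / "workspace", "main")
    copy_pkg = make_tree(Path(tmp) / "queued_copy", "queue")

    saved_path = list(sys.path)
    try:
        # First entry: the directory of the running script, inside the temp copy.
        sys.path.insert(0, str(copy_pkg))
        # Later entry: stand-in for the installed location, pointing at the workspace.
        sys.path.append(str(workspace_pkg.parent))
        importlib.invalidate_caches()

        top_level = importlib.import_module("my_module")
        packaged = importlib.import_module("my_package.my_module")
        top_level_value = top_level.my_str
        packaged_value = packaged.my_str
    finally:
        # Undo the path changes and drop the demo modules again.
        sys.path[:] = saved_path
        for name in ("my_module", "my_package", "my_package.my_module"):
            sys.modules.pop(name, None)

print("import my_module            ->", top_level_value)
print("import my_package.my_module ->", packaged_value)
```

Under these assumptions the bare import resolves next to the script in the copy, while the package-level import falls through to the installed location and therefore sees the workspace code.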
Bug Report
Issue name
exp run --queue: experiment runs with the workspace version of the code, ignoring changes made in the experiment.
Description
When using imports with dvc exp run --queue, the file being imported is always the workspace version, regardless of its state when the experiment was queued.
Reproduce
1. Run dvc init.
2. Create main.py and dep.py.
   main.py:
   dep.py:
3. Use absolute imports for --queue to work properly.
4. Run dvc repro, you'll see main printed to stdout, that's ok.
5. Write my_str = 'queue' to dep.py.
6. Run dvc exp run --name 'bug' --queue
7. Set my_str = 'main' back in dep.py. git status says that dep.py is not changed.
8. Run dvc queue start
9. You'll see main printed to stdout, though you created the experiment with "queue" in dep.py.
Expected
Expected stdout to be "queue", not "main".
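The report does not show the bodies of main.py and dep.py. A minimal sketch, assuming dep.py only defines my_str and main.py imports and prints it, shows what the stage is expected to print for each state of dep.py. The file contents and the helper `run_main` are assumptions, and dvc itself is not involved here.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Assumed content of main.py: import the string from dep.py and print it.
MAIN_PY = "from dep import my_str\nprint(my_str)\n"


def run_main(repo: Path) -> str:
    """Run `python main.py` inside `repo` and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-B", "main.py"],  # -B: do not write bytecode caches
        cwd=repo,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / "main.py").write_text(MAIN_PY)

    # Workspace state: what dvc repro prints in the steps above.
    (repo / "dep.py").write_text("my_str = 'main'\n")
    before = run_main(repo)

    # State at the moment the experiment is queued.
    (repo / "dep.py").write_text("my_str = 'queue'\n")
    after = run_main(repo)

print("workspace state prints:", before)
print("queued state prints:", after)
```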
Environment information
Output of dvc doctor:
Additional Information (if any):