Open skshetry opened 1 year ago
hi @skshetry can you assign it to me?
as I understand the we need to remove the run()
command but if the tests are like this
stage = dvc.run(
cmd="python copy.py params.py metrics.py",
metrics_no_cache=["metrics.py"],
params=["params.py:INT,FLOAT,DICT,Train,Klass"],
name="copy-file",
)
and the return value is using somewhere else, how to handle it? should I replace the dvc.run
to dvc.repro
?
Hi @ArshErgon, sorry for the late reply.
dvc run
command no longer exists, and there is no 1:1 equivalent in dvc now. There is dvc stage add --run
on the command side, and dvc.stage.add()
in API side (with no run support).
repro
runs the whole DAG of stages, whereas run
only runs one stage that was created in memory.
One way would be to replace dvc.run()
with dvc.stage.add()
, but that won't work for all the tests. We'd need to replace them with some APIs that are closer to what's being tested and exists.
I don't want to be discouraging, but this task requires a deeper understanding of dvc's codebase, so this is definitely not a https://github.com/iterative/dvc/labels/good%20first%20issue.
@skshetry This is less user facing. What's your take on the priority?
I think we can downgrade it to p2, since we have a lot of p1s. But it is important, because a lot of tests depend on dvc.run()
which does not exist in real world.
We removed
dvc run
command in 3.0 and removed support for running.dvc
file. Butdvc.run()
still exists and lots of tests are still using it.We should remove the use of
dvc.run()
from tests and gradually get rid ofdvc.run()
API.