iterative / example-repos-dev

Source code and generator scripts for example DVC projects
https://dvc.org/doc

Various issues in `example-dvc-experiments` #98

Closed iesahin closed 2 years ago

iesahin commented 2 years ago

These were reported by @tapadipti (thanks). I'm moving them here to discuss and follow up:

I was running experiments by following the docs (https://dvc.org/doc/start/experiments) and encountered the following issues. Sharing here for any required action.

  1. dvc is not installed by pip install -r requirements.txt. So, if someone is trying to use a new virtual env, they need to install dvc separately. Would be good to include dvc in requirements.txt.

  2. dvc pull gave this error:

    ERROR: failed to pull data from the cloud - Checkout failed for following targets:
    models/model.h5
    metrics
    Is your cache up to date?
    <https://error.dvc.org/missing-files>

  3. dvc exp run lists all the images when running the extract stage. It would be good to remove -v from tar -xvzf data/images.tar.gz --directory data

  4. The "If you used dvc repro before" section in the doc is a little unclear. Does dvc exp run replace dvc repro? If yes, can we state this clearly? It would also be great to change the statement "We use dvc repro to run the pipeline..." to "dvc repro runs the pipeline...".

jorgeorpinel commented 2 years ago

This seems high priority.

jorgeorpinel commented 2 years ago

I think we can remove the bug label and change this to p1 once item 2, at least, is addressed.

iesahin commented 2 years ago

> 1. dvc is not installed by pip install -r requirements.txt. So, if someone is trying to use a new virtual env, they need to install dvc separately. Would be good to include dvc in requirements.txt.

This was partly intentional, to let users install DVC themselves, and partly to prevent version conflicts. Some setups (such as installing DVC both system-wide and in a venv, with different dependencies) cause weird behavior.

We can go this route, though; it's a one-line change. Is it better to add dvc to requirements.txt, @shcheklein?

tapadipti commented 2 years ago

If this was intentional and we don't want to include dvc in requirements.txt, then we should add an instruction that the user should install dvc. Currently, such an instruction is missing. It is unlikely that many people will reach the experiments page of the tutorial without first having installed dvc. But in case they try to work in a new venv, it can be a little confusing.
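
One way to surface the missing-DVC case early is a small pre-flight check before any dvc command runs. The following is only a sketch: `require_tool` is a hypothetical helper name, not something that exists in the example repository, and the install hint passed to it is just the usual `pip install dvc`.

```shell
# Hypothetical pre-flight helper (not part of the example project).
# $1 = command name to look for on PATH
# $2 = hint to print if the command is missing
require_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "'$1' was not found on PATH. Try: $2" >&2
  return 1
}

# Example use in a fresh virtual environment:
#   require_tool dvc "pip install dvc" || exit 1
```

A check like this keeps dvc out of requirements.txt (avoiding the version conflicts mentioned above) while still telling the user what to do.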

iesahin commented 2 years ago

I remembered why I left -v in tar: without it, the experiment takes some time to start running after extraction and looks frozen. I've now updated the project not to use -v in tar, and also updated model.h5 in the remote. (We had a bug in DVC that was preventing experiments from being uploaded.) Could you now check whether the project works as intended, @tapadipti?
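
The trade-off described above (verbose output floods the log, silent output looks frozen) could also be handled with a single status line before and after extraction. This is a sketch only: `extract_quietly` is a hypothetical wrapper, not what the project actually uses in its extract stage.

```shell
# Hypothetical wrapper (not part of the example project): extract without -v,
# but print one line before and one after so the stage does not look frozen.
# $1 = archive path, $2 = destination directory
extract_quietly() {
  echo "Extracting $1 into $2 ..."
  tar -xzf "$1" --directory "$2" || return 1
  echo "Finished extracting $1."
}

# Example use with the paths from this thread:
#   extract_quietly data/images.tar.gz data
```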

I'll create separate PRs in the docs for content updates. Thank you.

tapadipti commented 2 years ago

Thanks @iesahin

dvc pull gave this error:

    ERROR: failed to pull data from the cloud - Checkout failed for following targets:
    /Users/tapadiptisitaula/Documents/test/example-dvc-experiments/models/model.h5
    Is your cache up to date?
    <https://error.dvc.org/missing-files>

So it looks like metrics worked but not model.h5. And this time, the full file path is displayed.

Removing -v worked. The files are not listed anymore.

iesahin commented 2 years ago

> ERROR: failed to pull data from the cloud - Checkout failed for following targets:
> /Users/tapadiptisitaula/Documents/test/example-dvc-experiments/models/model.h5

Interesting. I double-checked yesterday that the script pushing the artifacts had completed successfully. Now I've checked again, and it says:

    dvc push
    Everything is up to date.

Could you check the MD5 line in dvc.lock, corresponding to this line: https://github.com/iterative/example-dvc-experiments/blob/main/dvc.lock#L36

What's the MD5 hash value there, in your installation?
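
The comparison asked for above can be sketched as follows. This is a rough sketch under several assumptions: the helper names `lock_md5` and `local_md5` are hypothetical; dvc.lock is assumed to list each output as a `- path: ...` line followed by an `md5: ...` line; and the recorded value is assumed to equal a plain MD5 of the file's bytes, which may not hold for every DVC version or file type.

```shell
# Hypothetical helpers (not part of DVC or the example project).

# Print the md5 recorded in a lock file for a tracked path.
# $1 = lock file, $2 = tracked path (e.g. models/model.h5)
lock_md5() {
  awk -v target="$2" '
    $1 == "-" && $2 == "path:" { found = ($3 == target); next }
    found && $1 == "md5:"      { print $2; exit }
  ' "$1"
}

# Print the MD5 of a local file (md5sum on Linux, md5 -q on macOS).
local_md5() {
  if command -v md5sum >/dev/null 2>&1; then
    md5sum "$1" | awk '{ print $1 }'
  else
    md5 -q "$1"
  fi
}

# Example use from the repository root:
#   echo "lock:  $(lock_md5 dvc.lock models/model.h5)"
#   echo "local: $(local_md5 models/model.h5)"
```

If the two values differ, or the local file is missing, the workspace and dvc.lock are out of sync, which would be consistent with the checkout failure reported above.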

iesahin commented 2 years ago

Also, I've checked after cloning the repository:

[image]

@tapadipti

iesahin commented 2 years ago

The current staging version in https://github.com/iterative/example-dvc-staging resolves all of these issues. I think we can push it to example-dvc-experiments.

shcheklein commented 2 years ago

@iesahin sounds good.

iesahin commented 2 years ago

The most recent https://github.com/iterative/example-dvc-experiments resolves all these issues. The codification changes are in #97. Closing this.