iterative / dvc

🦉 ML Experiments and Data Management with Git
https://dvc.org
Apache License 2.0

dvc exp apply - fails #10437

Open marioperezj opened 1 month ago

marioperezj commented 1 month ago

Hi, I just created a new dvc repo. I destroyed my previous repo and initialized a new one. Then I created a new pipeline and ran an experiment, and when I tried to apply the experiment I hit this issue.

dvc exp apply -vv  test  
2024-05-21 17:22:22,313 DEBUG: v3.50.0 (pip), CPython 3.10.12 on Linux-5.10.16.3-microsoft-standard-WSL2-x86_64-with-glibc2.35                                                                                                                                                 
2024-05-21 17:22:22,313 DEBUG: command: /home/eamrerp/.local/bin/dvc exp apply -vv test                                                                                                                                                                                        
2024-05-21 17:22:22,314 TRACE: Namespace(quiet=0, verbose=2, cprofile=False, cprofile_dump=None, yappi=False, yappi_separate_threads=False, viztracer=False, viztracer_depth=None, viztracer_async=False, pdb=False, instrument=False, instrument_open=False, show_stack=False, cd='.', cmd='apply', force=True, experiment='test', func=<class 'dvc.commands.experiments.apply.CmdExperimentsApply'>, parser=DvcParser(prog='dvc', usage=None, description='Data Version Control', formatter_class=<class 'dvc.cli.formatter.RawTextHelpFormatter'>, conflict_handler='error', add_help=False))                                                                                                                                                                                                                                             
2024-05-21 17:22:22,612 ERROR: unexpected error - [Errno 5] Input/output error                                                                                                                                                                                                 
Traceback (most recent call last):
  File "/home/eamrerp/.local/lib/python3.10/site-packages/dvc/cli/__init__.py", line 211, in main
    ret = cmd.do_run()
  File "/home/eamrerp/.local/lib/python3.10/site-packages/dvc/cli/command.py", line 27, in do_run
    return self.run()
  File "/home/eamrerp/.local/lib/python3.10/site-packages/dvc/commands/experiments/apply.py", line 19, in run
    self.repo.experiments.apply(self.args.experiment)
  File "/home/eamrerp/.local/lib/python3.10/site-packages/dvc/repo/experiments/__init__.py", line 334, in apply
    return apply(self.repo, *args, **kwargs)
  File "/home/eamrerp/.local/lib/python3.10/site-packages/dvc/repo/__init__.py", line 57, in wrapper
    with lock_repo(repo):
  File "/usr/lib/python3.10/contextlib.py", line 135, in __enter__
    return next(self.gen)
  File "/home/eamrerp/.local/lib/python3.10/site-packages/dvc/repo/__init__.py", line 45, in lock_repo
    with repo.lock:
  File "/home/eamrerp/.local/lib/python3.10/site-packages/dvc/lock.py", line 137, in __enter__
    self.lock()
  File "/home/eamrerp/.local/lib/python3.10/site-packages/dvc/lock.py", line 119, in lock
    lock_retry()
  File "/home/eamrerp/.local/lib/python3.10/site-packages/funcy/decorators.py", line 47, in wrapper
    return deco(call, *dargs, **dkwargs)
  File "/home/eamrerp/.local/lib/python3.10/site-packages/funcy/flow.py", line 99, in retry
    return call()
  File "/home/eamrerp/.local/lib/python3.10/site-packages/funcy/decorators.py", line 68, in __call__
    return self._func(*self._args, **self._kwargs)
  File "/home/eamrerp/.local/lib/python3.10/site-packages/dvc/lock.py", line 110, in _do_lock
    self._lock = zc.lockfile.LockFile(self._lockfile)
  File "/home/eamrerp/.local/lib/python3.10/site-packages/zc/lockfile/__init__.py", line 120, in __init__
    super().__init__(path)
  File "/home/eamrerp/.local/lib/python3.10/site-packages/zc/lockfile/__init__.py", line 100, in __init__
    self._on_lock()
  File "/home/eamrerp/.local/lib/python3.10/site-packages/zc/lockfile/__init__.py", line 128, in _on_lock
    self._fp.truncate()
OSError: [Errno 5] Input/output error
2024-05-21 17:22:22,690 DEBUG: link type reflink is not available ([Errno 95] no more link types left to try out)                                                                                                                                                              
2024-05-21 17:22:22,690 DEBUG: Removing '/mnt/d/repo/.pIjNm0rEPNQ8WUL3XAUPsA.tmp'                                                                                                                                                                                              
2024-05-21 17:22:22,697 DEBUG: Removing '/mnt/d/repo/.pIjNm0rEPNQ8WUL3XAUPsA.tmp'                                                                                                                                                                                              
2024-05-21 17:22:22,700 DEBUG: Removing '/mnt/d/repo/.pIjNm0rEPNQ8WUL3XAUPsA.tmp'                                                                                                                                                                                              
2024-05-21 17:22:22,703 DEBUG: Removing '/mnt/d/repo/common-repo/.dvc/cache/files/md5/.Q0W3MKEM_1tixXZIxgT5Gg.tmp'                                                                                                                                                             
2024-05-21 17:22:22,719 DEBUG: Version info for developers:                                                                                                                                                                                                                    
DVC version: 3.50.0 (pip)
-------------------------
Platform: Python 3.10.12 on Linux-5.10.16.3-microsoft-standard-WSL2-x86_64-with-glibc2.35
Subprojects:
    dvc_data = 3.15.1
    dvc_objects = 5.1.0
    dvc_render = 1.0.2
    dvc_task = 0.4.0
    scmrepo = 3.3.1
Supports:
    http (aiohttp = 3.9.3, aiohttp-retry = 2.8.3),
    https (aiohttp = 3.9.3, aiohttp-retry = 2.8.3)
Config:
    Global: /home/eamrerp/.config/dvc
    System: /etc/xdg/dvc
Cache types: hardlink, symlink
Cache directory: 9p on drvfs
Caches: local
Remotes: None
Workspace directory: 9p on drvfs
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/80dc6aaf08a6c608dfe4ff9c3c907a02
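
The traceback above ends in the truncate() call that zc.lockfile makes on its lock file, and the version info reports the workspace as "9p on drvfs" (a Windows drive mounted inside WSL2). Whether that filesystem is the cause is not established in this thread. As a rough way to check, the sketch below tries a similar open, lock, write and truncate sequence in a directory of your choice. It only approximates what zc.lockfile does, and the function name and probe file name are made up for illustration.

```python
import contextlib
import fcntl
import os


def can_lock_and_truncate(directory: str) -> bool:
    """Return True if open + flock + write + truncate succeeds in `directory`.

    Approximates the lock-file sequence seen in the traceback; it is not
    the zc.lockfile implementation. Unix only (uses fcntl).
    """
    path = os.path.join(directory, "probe.lock")  # hypothetical file name
    try:
        with open(path, "a+") as fp:
            # Take an exclusive, non-blocking lock on the open file.
            fcntl.flock(fp.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
            # Write the PID, then truncate at the current position.
            fp.write(f" {os.getpid()}\n")
            fp.truncate()
            fp.flush()
        return True
    except OSError:
        # Covers Errno 5 (I/O error) as well as a missing directory.
        return False
    finally:
        # Clean up the probe file if it was created.
        with contextlib.suppress(OSError):
            os.remove(path)
```

For example, can_lock_and_truncate("/mnt/d/repo") compared with can_lock_and_truncate("/tmp") would show whether the two filesystems behave differently for this sequence.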

dberenbaum commented 1 month ago

Maybe something is corrupted from the old repo? Could you try to destroy again and make sure everything is gone? Could you also try destroying the site cache dir (/var/tmp/dvc/repo/80dc6aaf08a6c608dfe4ff9c3c907a02)?

marioperezj commented 1 month ago

Thank you for the response. I think my general issue is that I don't understand very well how the cache works in dvc. I understand the repository cache is stored in .dvc inside the repository, but other cache levels are also active, and they have some files that are corrupted.

I run dvc destroy inside the repository, then dvc init, followed by dvc exp show, and I still see some experiments from previous commits in master and the workspace. How is it possible to have an experiment instance in the workspace if I don't even have stages defined? How do I start from a completely clean dvc instance?

For instance, in a new dvc instance (right after dvc destroy), running dvc gc -A outputs:

PS D:\repo\new_repo> dvc gc -A
WARNING: This will remove all cache except items used in the workspace and all git commits of the current repo.
Are you sure you want to proceed? [y/n]: y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: 6d4d0548ad88fd32d3fd9f364db422df.dir
Missing cache for directory '/datasets'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: 55a3fdcf02b802bf4eb2d7b0eaf166c1.dir
Missing cache for directory '/models'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: 407217359088a8c817c69fb4fae7f30f.dir
Missing cache for directory '/evaluations'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: 798d782e087169db9e90de53d29686de.dir
Missing cache for directory '/original_models'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: fed68f66bd0db0a079d7405a00771906.dir
Missing cache for directory '/datasets'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: 3572310ba721c824bdf2647005c2265b.dir
Missing cache for directory '/models'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: 407217359088a8c817c69fb4fae7f30f.dir
Missing cache for directory '/evaluations'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: 1ad87cbce6f391d05b1ce4b40a550a57.dir
Missing cache for directory '/original_models'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: fed68f66bd0db0a079d7405a00771906.dir
Missing cache for directory '/datasets'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: 3572310ba721c824bdf2647005c2265b.dir
Missing cache for directory '/models'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: 407217359088a8c817c69fb4fae7f30f.dir
Missing cache for directory '/evaluations'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: 1ad87cbce6f391d05b1ce4b40a550a57.dir
Missing cache for directory '/original_models'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: fed68f66bd0db0a079d7405a00771906.dir
Missing cache for directory '/datasets'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: 3572310ba721c824bdf2647005c2265b.dir
Missing cache for directory '/models'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: 407217359088a8c817c69fb4fae7f30f.dir
Missing cache for directory '/evaluations'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: 1ad87cbce6f391d05b1ce4b40a550a57.dir
Missing cache for directory '/original_models'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: 6632f42841f6355685e089b62a85c45c.dir
Missing cache for directory '/models'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Output 'evaluations'(stage: '..\..\dvc.yaml:evaluation') is missing version info. Cache for it will not be collected. Use `dvc repro` to get your pipeline up to date.
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: 1ad87cbce6f391d05b1ce4b40a550a57.dir
Missing cache for directory '/original_models'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: 6632f42841f6355685e089b62a85c45c.dir
Missing cache for directory '/models'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: 1ad87cbce6f391d05b1ce4b40a550a57.dir
Missing cache for directory '/original_models'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: 6632f42841f6355685e089b62a85c45c.dir
Missing cache for directory '/models'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: 1ad87cbce6f391d05b1ce4b40a550a57.dir
Missing cache for directory '/original_models'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: 9726d31307db79567a1f1a863a8b9423.dir
Missing cache for directory '/models'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: 837f40add133f220032f2d43fd8f91d1.dir
Missing cache for directory '/original_models'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: 938afceb0aa2859532678f14ede61b79.dir
Missing cache for directory '/big_dataset_only_remove_empty'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: fc0c116d205a776d1fd78796446df6e2.dir
Missing cache for directory '/custom-tokenizer-from-old'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: 5f7cf0e6c325c2eeecc9d4458fdd4d3b.dir
Missing cache for directory '/data'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: 17d6c38d7f28838eafb2217e16747aeb.dir
Missing cache for directory '/.\models\distillbert-big-dataset-baseline\training'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: e309eb997ca05aa54fd654e5f76d4c5a.dir
Missing cache for directory '/.\models\distillbert-big-dataset-baseline\evaluation'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: None, md5: 5fc384024f96196f95d2bd0f068da7a3.dir
Missing cache for directory '/raw_data'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
ERROR: Failed to collect '2ce912ccc3fbb9b8bdf2f9331d72640b4b859c4a': unable to read: '..\..\dvc.lock', YAML file structure is corrupted: while scanning a simple key
in "<unicode string>", line 31, column 1
could not find expected ':'
in "<unicode string>", line 32, column 10

dberenbaum commented 1 month ago

Thanks for the clarifications.

I run dvc destroy inside the repository, then dvc init, followed by dvc exp show, and I still see some experiments from previous commits in master and the workspace. How is it possible to have an experiment instance in the workspace if I don't even have stages defined?

Ah, that's happening because these experiments are stored in git, not in dvc itself. Git stores all of its info in .git/, and DVC stores all of its info in .dvc/. dvc destroy will delete .dvc/ along with other DVC-related files, but it won't touch .git/, which is where experiment references are stored.

TL;DR: you can drop those with dvc exp remove -A.
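
To make the ordering concrete, here is a minimal sketch of a full reset that removes the experiment references before destroying and re-initializing DVC. It assumes the dvc CLI is on PATH and that the directory is currently an initialized DVC repo inside a Git repo. The reset_dvc helper is hypothetical, and dvc destroy -f is destructive, so treat this as an illustration rather than a recommended script.

```python
import subprocess
from typing import Callable

# Commands for a clean restart, in the order they need to run.
# `dvc exp remove -A` goes first because it needs an initialized DVC repo.
RESET_COMMANDS: list[list[str]] = [
    ["dvc", "exp", "remove", "-A"],  # drop experiment refs kept in .git/
    ["dvc", "destroy", "-f"],        # delete .dvc/ and other DVC files
    ["dvc", "init"],                 # start a fresh DVC repo
]


def reset_dvc(repo_dir: str, run: Callable[..., object] = subprocess.run) -> list[list[str]]:
    """Run the reset commands in `repo_dir` and return the ones executed.

    `run` is injectable so the sequence can be dry-run or tested
    without touching a real repository.
    """
    executed = []
    for cmd in RESET_COMMANDS:
        # check=True stops the sequence at the first failing command.
        run(cmd, cwd=repo_dir, check=True)
        executed.append(cmd)
    return executed
```

Passing a function that only prints its arguments as run gives a dry run of the sequence.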

For instance, in a newly dvc instance (right after dvc destroy) trying to do dvc gc -A outputs:

dvc gc -A checks your entire Git history. You have previous commits in your Git history that still have dvc.yaml pipelines or .dvc files, so the warnings are related to those previous commits.

Also, note that dvc gc -A does not delete objects from all commits. It does the opposite: it tries to delete only objects that are not associated with any commit (for example, objects generated by experiments). DVC is warning you that objects it expects to find are missing, because you already destroyed all of the old cache during dvc destroy.
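
As a toy model of that behavior (not DVC's actual implementation), the two outcomes can be viewed as set differences between what is in the cache and what the Git history references. The plan_gc function and the "exp-only-object" and "still-cached-object" names are invented for illustration; the .dir hash is one of those from the output above.

```python
def plan_gc(cache: set[str], referenced: set[str]) -> tuple[set[str], set[str]]:
    """Return (to_delete, missing) for a simplified `gc -A`.

    cache:      object hashes currently present in the cache
    referenced: object hashes used by any commit in the Git history
    """
    # Only objects that no commit refers to are candidates for deletion.
    to_delete = cache - referenced
    # Objects a commit refers to but that are absent produce the warnings.
    missing = referenced - cache
    return to_delete, missing


if __name__ == "__main__":
    # After `dvc destroy`, the old objects are gone from the cache,
    # but old commits still reference them.
    cache = {"exp-only-object", "still-cached-object"}
    referenced = {"still-cached-object", "6d4d0548ad88fd32d3fd9f364db422df.dir"}
    to_delete, missing = plan_gc(cache, referenced)
    print("would delete:", to_delete)
    print("would warn about:", missing)
```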