Open guysmoilov opened 3 years ago
Hi @guysmoilov
I looked at the git part of this problem. There are two parts:
a) If you have not committed the file yet, a simple git restore --staged <file> will do.
b) But if you want to untrack a file that has already been tracked and committed, it's tricky, because doing git rm --cached will remove the file from others' systems (locally) when they do a git pull. (You also have to list the file in .gitignore.) If we do git update-index --assume-unchanged, then it won't show the file in unstaged changes, but I think it continues to remain in the repo.
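To make the two cases concrete, here is a minimal sketch that only builds the git commands and does not run them. The function name untrack_plan and its return shape are made up for illustration; they are not part of fds or git.

```python
def untrack_plan(path, committed):
    """Return (commands, needs_gitignore) for making git stop tracking `path`.

    Hypothetical helper for illustration only; it builds argv lists
    and never executes anything.
    """
    if not committed:
        # Case a): the file is only staged, so unstaging it is enough.
        return [["git", "restore", "--staged", "--", path]], False
    # Case b): the file is already committed. `git rm --cached` drops it
    # from the index but keeps the local copy. Once that change is
    # committed and pulled, the file is deleted from other people's
    # working trees, and the path should also be listed in .gitignore.
    return [["git", "rm", "--cached", "--", path]], True


if __name__ == "__main__":
    for committed in (False, True):
        commands, needs_gitignore = untrack_plan("data.csv", committed)
        print(committed, commands, needs_gitignore)
```

Each argv list could be passed to subprocess.run if you actually want to execute it. Returning the plan first makes it easy to show the user what would happen before touching the repo.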
@indweller Thanks for the research!
Yes, making git forget a committed file is really next to impossible for a distributed repo.
As the first line in the issue suggests, I think we should focus on git add and dvc add. fds forget is IMO much easier to remember than git restore --staged <file>, and it should also handle removing the file from DVC tracking.
Ok, so for the git part it can do git restore, and for the DVC part it can do dvc remove (https://dvc.org/doc/user-guide/how-to/stop-tracking-data). Can I work on this issue?
@indweller I think you also need to run some form of dvc gc after dvc remove.
And sure, thank you!
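A rough sketch of how the steps discussed so far might fit together, again only building commands rather than running them. The name forget_plan, the `<path>.dvc` naming check, and the choice of dvc gc --workspace are assumptions for illustration; the thread only says "some form of dvc gc", and this is not the actual fds implementation.

```python
from pathlib import Path


def forget_plan(path, has_dvc_file=None):
    """Build the commands a hypothetical `fds forget <path>` could run.

    Assumes the file has not been committed yet, and that a DVC-tracked
    path has a sibling `<path>.dvc` file (DVC's default naming).
    """
    dvc_file = path + ".dvc"
    if has_dvc_file is None:
        # Fall back to checking the working directory.
        has_dvc_file = Path(dvc_file).exists()
    if not has_dvc_file:
        # Plain `git add` case: just unstage the path.
        return [["git", "restore", "--staged", "--", path]]
    return [
        # Unstage the .dvc file (assumes it was staged with git add;
        # git reports an error if the path is unknown to it).
        ["git", "restore", "--staged", "--", dvc_file],
        # Stop DVC tracking: removes the .dvc file and its .gitignore entry.
        ["dvc", "remove", dvc_file],
        # Assumption: clean cached data no longer used in the workspace.
        # dvc gc asks for confirmation unless --force is passed.
        ["dvc", "gc", "--workspace"],
    ]


if __name__ == "__main__":
    for step in forget_plan("data.csv", has_dvc_file=True):
        print(" ".join(step))
```

Whether dvc gc should run automatically is a design question. It can delete cached data, so a real command would probably want to ask first or put it behind a flag.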
Interesting potentially relevant project: https://rtyley.github.io/bfg-repo-cleaner/
Scenario: You accidentally git add'ed or dvc add'ed a path that you didn't intend to.
It's a commonly googled question: https://stackoverflow.com/questions/1274057/how-to-make-git-forget-about-a-file-that-was-tracked-but-is-now-in-gitignore
What fds forget can add: .dvc file if it exists, and also make git forget about that file