jedbrown / git-fat

Simple way to handle fat files without committing them to git, supports synchronization using rsync
BSD 2-Clause "Simplified" License
622 stars, 137 forks

How to clean up / garbage collect in remote fat-store directory #96

Closed: jin-eld closed this issue 4 years ago

jin-eld commented 4 years ago

Hi,

First of all, thanks for this great plugin. I've been using it for several years now and it works really well.

Recently I stumbled over one thing that I could not find an answer to anywhere. Am I missing something, or does this fall into the category of a feature request?

Let's say I have a git-fat repo and I decide to drop some commits (e.g. via an interactive rebase). git fat status will list the hashes of the affected files under "Garbage objects", and git fat gc will clean those up locally. But what about the remote git fat store? I'd like to clear out those files there as well.

So far the only option I see is to save the output of git verify and manually remove those hashes on the remote server where the git-fat store directory is located after the local git fat gc finishes. Is it somehow possible to automatically garbage collect in the remote store directory as well?

Kind regards, Jin
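
For reference, the manual workaround described above could start from something like the following sketch. It is not part of git-fat. It assumes the saved output lists each garbage hash as a 40-character hex string on its own line under a "Garbage objects:" heading; the function name garbage_hashes is hypothetical, and the exact output format should be checked against your git-fat version before relying on it.

```python
import re

# Assumption: fat objects are identified by 40-character lowercase hex digests.
HEX40 = re.compile(r"^[0-9a-f]{40}$")


def garbage_hashes(status_text):
    """Return the hashes listed under a 'Garbage objects:' heading.

    Hypothetical helper: the heading name comes from the status output
    mentioned above; everything else about the format is assumed.
    """
    hashes = []
    in_garbage = False
    for line in status_text.splitlines():
        stripped = line.strip()
        if stripped.endswith(":"):
            # Any heading line switches sections; only one is of interest.
            in_garbage = stripped == "Garbage objects:"
            continue
        if in_garbage and HEX40.match(stripped):
            hashes.append(stripped)
    return hashes
```

The resulting list can be reviewed by hand before anything is removed from the remote store.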

jedbrown commented 4 years ago

This is hazardous in that there could be other clones that reference objects in the remote store. To clean up, I would gather all referenced objects (from all repos/clones making use of the store) and remove all the files that are not referenced. Unless someone has ideas about how to make this very reliable, I'm reluctant to add this feature (which can permanently destroy data) to git-fat itself.
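
A minimal sketch of the procedure described above (gather the referenced objects from every clone, then remove whatever in the store is not referenced). This is not a git-fat feature. It assumes the store is a flat directory whose file names are the object hashes, and that the caller has already collected one set of referenced hashes per clone; the names unreferenced_objects and prune are hypothetical. It defaults to a dry run because deletion is permanent.

```python
import os


def unreferenced_objects(store_dir, referenced_per_clone):
    """List store files that no clone references.

    referenced_per_clone: iterable of sets of hashes, one per repo/clone.
    Assumes store_dir is flat and each file is named by its object hash.
    """
    referenced = set()
    for refs in referenced_per_clone:
        referenced.update(refs)
    catalog = set(os.listdir(store_dir))
    return sorted(catalog - referenced)


def prune(store_dir, referenced_per_clone, dry_run=True):
    """Return the unreferenced objects; delete them only if dry_run is False."""
    doomed = unreferenced_objects(store_dir, referenced_per_clone)
    if not dry_run:
        for name in doomed:
            os.remove(os.path.join(store_dir, name))
    return doomed
```

The result is only as safe as the input: if any clone that uses the store is missing from referenced_per_clone, objects it still needs will be reported as unreferenced.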

jin-eld commented 4 years ago

Well, the analogy is a forced push: you're not supposed to do it, but sometimes you have to, and then everyone else needs to sync their clones. Actually, this is exactly what is happening in my example: commits are being dropped, so I then have to force-push with git. I was hoping to be able to do the same with git fat.

Thanks for the clarification. So my approach of doing this manually is actually the only way this can be done right now; I just wanted to be sure I had not missed something.

jedbrown commented 4 years ago

I consider it to be quite a bit stronger than a forced push because clones don't have all the objects (indeed, they're unlikely to have older objects). (Also, forced pushes don't necessarily delete the objects on the server; that happens at GC time, which may come quite a bit later and therefore provide opportunities for recovery.) There is no plan to add this, and I'd prefer to consider it out of scope until/unless a simple and reliable method is proposed.

jin-eld commented 4 years ago

I understand your concerns and I'm OK with the answer. I'm not that knowledgeable about git internals, so I doubt I can suggest something in detail or come up with a patch.

I also have the information I need to work around my particular clean-up requirement, and it is now recorded here for the sake of documentation, so it's fine. Thanks again for the info; I'm closing the request.