neuropoly / gitea

https://gitea.io fork with https://git-annex.branchable.com support
MIT License

git-annex: make missing annex files 404 #40

Open kousu opened 1 year ago

kousu commented 1 year ago

Currently, if you try to view (#22) an annexed file that hasn't been uploaded, you get a 500:

Screenshot 2023-03-06 at 13-33-58 spine-generic-processed2

with an error like:

2023/03/06 13:31:34 ...ers/web/repo/view.go:393:renderFile() [E] getFileReader: in /home/GRAMES.POLYMTL.CA/p115628/src/neurogitea/gitea/data/gitea-repositories/datasets/spine-generic-processed2.git: SHA256E-s46315943--946ae8988bcb0ac9be373d0ea998b5eb91929761bab6ee68d955a7bb48793e4b.nii.gz does not seem to be a valid annexed file: exit status 1
2023/03/06 13:31:34 router: completed GET /datasets/spine-generic-processed2/src/branch/master/sub-amu04/anat/sub-amu04_T1w.nii.gz for [::1]:44488, 500 Internal Server Error in 88.2ms @ repo/view.go:762(repo.Home)

This should probably be a 404, or else fall back to rendering the annex pointer.
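A rough sketch of the status mapping I have in mind, in Go since that's what gitea is written in. Every name here (`ErrAnnexContentMissing`, `openAnnexedContent`, `statusForFileError`) is made up for illustration and is not existing gitea code; the only assumption is that the annex code can return a distinguishable error when the content isn't there:

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// ErrAnnexContentMissing is a hypothetical sentinel error meaning "the
// pointer exists in git, but the annexed content was never uploaded
// (or was dropped)". It is an assumption, not something gitea defines.
var ErrAnnexContentMissing = errors.New("annexed content is not present in this repository")

// openAnnexedContent is a stand-in for whatever opens the annexed file.
// It wraps the sentinel the same way the "getFileReader: ..." log line
// wraps the underlying error.
func openAnnexedContent(present bool) error {
	if !present {
		return fmt.Errorf("getFileReader: %w", ErrAnnexContentMissing)
	}
	return nil
}

// statusForFileError picks the HTTP status for an error from opening a
// file: missing annex content becomes a 404, and only unexpected
// errors remain a 500.
func statusForFileError(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, ErrAnnexContentMissing):
		return http.StatusNotFound
	default:
		return http.StatusInternalServerError
	}
}

func main() {
	fmt.Println(statusForFileError(openAnnexedContent(false))) // prints 404
	fmt.Println(statusForFileError(openAnnexedContent(true)))  // prints 200
}
```

Using `errors.Is` means the check still works after the error has been wrapped with extra context on its way up to the handler.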

kousu commented 1 year ago

Actually, for comparison, this is how the LFS code behaves too:

I made an LFS repo, then used the Repo Settings interface to erase one of the LFS files:

Screenshot 2023-03-06 at 21-39-28 lfs-test

Trying to view the file in gitea gave a 500:

Screenshot 2023-03-06 at 21-39-06 lfs-test

with a similar error to the annex case:

2023/03/06 21:39:21 ...ers/web/repo/view.go:393:renderFile() [E] [6406a3d9] getFileReader: LFS Meta object does not exist
2023/03/06 21:39:21 [6406a3d9] router: completed GET /kousu/lfs-test/src/branch/main/what.nii.gz for [::1]:44170, 500 Internal Server Error in 24.0ms @ repo/view.go:762(repo.Home)

So maybe this isn't a real problem.


But the difference with the annex case is that it's a lot easier with git-annex to forget to upload files.

With git-lfs, every git push also uploads all the files the remote doesn't have yet: it looks at git diff --stat origin/${branch}..${branch} (or equivalent) and picks out the LFS files in there. So the only ways to trigger a 500 with LFS are (1) erasing and re-uploading a repo to the same URL (which you can easily fix with git lfs push --all origin), or (2) explicitly deleting an LFS file like I just did.

With git-annex, files are not uploaded by git push, and it explicitly supports git annex drop --from=origin as a feature, whereas git-lfs assumes you're holding onto your files forever unless you drop the commits that reference them. The LFS approach makes more sense, but we have to deal with the annex approach. So we should return a 404 for missing annexed content, and maybe show a special message like "this file is annexed but its content is missing from this repository".
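For the special message, one option is to pull the size out of the annex key so the page can at least say how big the missing file is. A minimal sketch, assuming the key follows the usual git-annex layout of `BACKEND-sSIZE--NAME` (as in the log above); `annexKeySize` and `missingContentMessage` are hypothetical helpers, not gitea functions:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// annexKeySize extracts the optional size field ("-sNNN") from a
// git-annex key such as "SHA256E-s46315943--<hash>.nii.gz".
// It returns false when the key carries no parsable size.
func annexKeySize(key string) (int64, bool) {
	// Everything before the first "--" is the backend plus metadata fields.
	head, _, found := strings.Cut(key, "--")
	if !found {
		return 0, false
	}
	fields := strings.Split(head, "-")
	// fields[0] is the backend name; the remaining fields are metadata.
	for _, f := range fields[1:] {
		if strings.HasPrefix(f, "s") {
			n, err := strconv.ParseInt(f[1:], 10, 64)
			if err == nil && n >= 0 {
				return n, true
			}
		}
	}
	return 0, false
}

// missingContentMessage builds the text to show alongside the 404.
func missingContentMessage(key string) string {
	if n, ok := annexKeySize(key); ok {
		return fmt.Sprintf("This file is annexed (%d bytes) but its content is missing from this repository.", n)
	}
	return "This file is annexed but its content is missing from this repository."
}

func main() {
	key := "SHA256E-s46315943--946ae8988bcb0ac9be373d0ea998b5eb91929761bab6ee68d955a7bb48793e4b.nii.gz"
	fmt.Println(missingContentMessage(key))
}
```

If the key has no size field, this falls back to the plain message, so it never blocks the 404 itself.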