neuropoly / gitea

https://gitea.io fork with https://git-annex.branchable.com support
MIT License

git-annex: size miscalculation #32

Open kousu opened 1 year ago

kousu commented 1 year ago

The summary size shown for a repo (top right) can get out of sync with its git-annex files. In this example, a single file (6.6MB) is larger than the size reported for the entire repo (3.3MB):

Screenshot 2022-11-30 at 00-52-07 test

The number seems to be generated -- and then cached -- by this function:

https://github.com/neuropoly/gitea/blob/fa43bce541507c8a702723f84764a48db9278506/modules/repository/create.go#L288-L301

So this is a cache invalidation problem: `git annex copy --to` is not triggering a recomputation the way it should.
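To make the failure mode concrete, here is a minimal, self-contained sketch of the pattern. It is not the Gitea code: `dirSize` and `cachedSize` are hypothetical stand-ins for the directory walk and the stored size, and the two writes only imitate a push-style path (which refreshes) and an annex-style path (which does not).

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// dirSize sums the sizes of all regular files under root.
// Hypothetical stand-in for the real directory-size helper.
func dirSize(root string) (int64, error) {
	var total int64
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.Type().IsRegular() {
			info, err := d.Info()
			if err != nil {
				return err
			}
			total += info.Size()
		}
		return nil
	})
	return total, err
}

// cachedSize is a hypothetical stand-in for the size stored with the repo.
// The value only changes when Refresh is called.
type cachedSize struct {
	value   int64
	measure func() (int64, error)
}

// Refresh recomputes the size and stores it.
func (c *cachedSize) Refresh() error {
	n, err := c.measure()
	if err != nil {
		return err
	}
	c.value = n
	return nil
}

func must(err error) {
	if err != nil {
		panic(err)
	}
}

func main() {
	root, err := os.MkdirTemp("", "repo-size-demo")
	must(err)
	defer os.RemoveAll(root)

	c := &cachedSize{measure: func() (int64, error) { return dirSize(root) }}

	// Push-style write path: data is written, then the cache is refreshed.
	must(os.WriteFile(filepath.Join(root, "pack"), make([]byte, 100), 0o644))
	must(c.Refresh())

	// Annex-style write path: data lands on disk, but nothing refreshes the cache.
	must(os.WriteFile(filepath.Join(root, "annexed"), make([]byte, 200), 0o644))
	actual, err := dirSize(root)
	must(err)
	fmt.Printf("cached=%d actual=%d\n", c.value, actual) // cached=100 actual=300

	// Any later refresh (for example, one triggered by a push) fixes the number.
	must(c.Refresh())
	fmt.Printf("cached=%d actual=%d\n", c.value, actual) // cached=300 actual=300
}
```

The point of the sketch is only that the cached number is correct immediately after a refreshing write path and silently stale after a non-refreshing one.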

kousu commented 1 year ago

Here are all the callers:

```
p115628@joplin:~/src/neurogitea/gitea$ git grep UpdateRepoSize
models/repo/update.go:// UpdateRepoSize updates the repository size, calculating it using util.GetDirectorySize
models/repo/update.go:func UpdateRepoSize(ctx context.Context, repoID, size int64) error {
modules/repository/create.go:// UpdateRepoSize updates the repository size, calculating it using util.GetDirectorySize
modules/repository/create.go:func UpdateRepoSize(ctx context.Context, repo *repo_model.Repository) error {
modules/repository/create.go:   return repo_model.UpdateRepoSize(ctx, repo.ID, size+lfsSize)
modules/repository/create.go:   if err = UpdateRepoSize(ctx, repo); err != nil {
modules/repository/generate.go: if err := UpdateRepoSize(ctx, generateRepo); err != nil {
modules/repository/repo.go:             if err = UpdateRepoSize(ctx, repo); err != nil {
routers/web/repo/view.go:               if err = repo_module.UpdateRepoSize(ctx, ctx.Repo.Repository); err != nil {
routers/web/repo/view.go:                       ctx.ServerError("UpdateRepoSize", err)
services/mirror/mirror_pull.go: if err := repo_module.UpdateRepoSize(ctx, m.Repo); err != nil {
services/repository/check.go:                   if err := repo_module.UpdateRepoSize(ctx, repo); err != nil {
services/repository/fork.go:    if err := repo_module.UpdateRepoSize(ctx, repo); err != nil {
services/repository/push.go:    if err = repo_module.UpdateRepoSize(ctx, repo); err != nil {
```

I noticed that the last caller is in `services/repository/push.go`, so I took an educated guess and tried pushing a commit:

```
p115628@joplin:~/src/neurogitea/test/test$ touch a
p115628@joplin:~/src/neurogitea/test/test$ git add a
p115628@joplin:~/src/neurogitea/test/test$ git commit -m "touch"
[main 18bb9db] touch
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 a
p115628@joplin:~/src/neurogitea/test/test$ git push
Locking support detected on remote "origin". Consider enabling it with:
  $ git config lfs.https://localhost/kousu/test.git/info/lfs.locksverify true
Enumerating objects: 4, done.
Counting objects: 100% (4/4), done.
Delta compression using up to 128 threads
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 269 bytes | 269.00 KiB/s, done.
Total 3 (delta 1), reused 0 (delta 0), pack-reused 0
remote: . Processing 1 references
remote: Processed 1 references in total
To localhost:kousu/test.git
   10bed66..18bb9db  main -> main
```

Now the displayed size is correct:

Screenshot 2022-11-30 at 01-03-25 test

so it really does seem to be a relatively simple cache-invalidation issue.

So maybe what we want here is to add an `UpdateRepoSize()` call to the `git-annex-shell` code path? Somewhere around here:

https://github.com/neuropoly/gitea/blob/0901a8cadf1da063c774465a888d6e019e60cfc5/cmd/serv.go#L322-L334

https://github.com/neuropoly/gitea/blob/0901a8cadf1da063c774465a888d6e019e60cfc5/cmd/serv.go#L380
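As a rough illustration of that idea (not the actual `cmd/serv.go` code), the sketch below runs a served command and then refreshes the cached size if the command could have written annexed content. Everything here is hypothetical: `sizeUpdater` stands in for a call such as `UpdateRepoSize`, `serveAndRefresh` and `mayChangeSize` are invented names, and refreshing after every `git-annex-shell` invocation is an assumption made for simplicity.

```go
package main

import "fmt"

// sizeUpdater is a hypothetical callback standing in for a call such as
// UpdateRepoSize(ctx, repo).
type sizeUpdater func() error

// mayChangeSize reports whether a served command could have changed what is
// stored on disk without going through the normal push path.
// Assumption: only git-annex-shell needs this treatment, because pushes
// already trigger a recomputation.
func mayChangeSize(verb string) bool {
	return verb == "git-annex-shell"
}

// serveAndRefresh runs the command and, if it succeeded and may have changed
// the repository's disk usage, refreshes the cached size.
func serveAndRefresh(verb string, run func() error, update sizeUpdater) error {
	if err := run(); err != nil {
		return fmt.Errorf("%s failed: %w", verb, err)
	}
	if !mayChangeSize(verb) {
		return nil
	}
	if err := update(); err != nil {
		return fmt.Errorf("updating repo size after %s: %w", verb, err)
	}
	return nil
}

func main() {
	refreshes := 0
	update := func() error { refreshes++; return nil }
	run := func() error { return nil } // pretend the command succeeded

	for _, verb := range []string{"git-upload-pack", "git-annex-shell"} {
		if err := serveAndRefresh(verb, run, update); err != nil {
			panic(err)
		}
	}
	fmt.Println("refreshes:", refreshes) // refreshes: 1
}
```

Refreshing on every `git-annex-shell` call is coarse, since many invocations only read. A real change would probably want to narrow this to the invocations that actually store or drop content.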

kousu commented 1 year ago

In practice this bug is going to be pretty rare: after running `git annex copy --to` you usually have to run `git annex sync` or `git push origin git-annex:git-annex` (or use `git annex sync --content` in the first place), otherwise the files are inaccessible. That push triggers the recomputation.

kousu commented 1 year ago

Heads up: this is getting significantly more complicated https://github.com/go-gitea/gitea/pull/22900/