src-d / go-git

Project has been moved to: https://github.com/go-git/go-git
Apache License 2.0
4.91k stars, 542 forks

Support dynamic object directories #818

Open ZJvandeWeg opened 6 years ago

ZJvandeWeg commented 6 years ago

GIT_OBJECT_DIRECTORY and GIT_ALTERNATE_OBJECT_DIRECTORIES allow dynamic object directories for deduplication.

Alternates can be set in info/alternates, as implemented in https://github.com/src-d/go-git/pull/663. At times, however, it would be nice to have a dynamic way of setting them on a per-Repository-instance basis. For example, GitLab uses this for push rules.

The easiest way of achieving this would be an attribute on the Repository struct. The downside is that it would become part of the public interface. Code churn would probably be very limited, however.

The second option is to extend PlainOpenOptions. The advantage is that the change is nicely wrapped. The obvious limitation is that it can only be used with PlainOpenWithOptions.

Please note I'm unfamiliar with billy, so there might be an option I'm missing, all the more reason to open this issue first before I start building anything.

Edit: It seems that adding it in only one place, e.g. PlainOpen, isn't possible, as it reroutes to the Open() constructor. So the options struct doesn't seem like a viable choice. To avoid an interface change that would require a major version bump, I would propose exposing it as a Repository struct field.

smola commented 6 years ago

@ZJvandeWeg You can simulate GIT_OBJECT_DIRECTORY with go-billy by mounting a separate directory over the repository's "objects" path. For example:

package main

import (
    "fmt"
    "os"

    "gopkg.in/src-d/go-billy.v4/helper/mount"
    "gopkg.in/src-d/go-billy.v4/helper/polyfill"
    "gopkg.in/src-d/go-billy.v4/memfs"
    "gopkg.in/src-d/go-billy.v4/osfs"
    "gopkg.in/src-d/go-git.v4"
    "gopkg.in/src-d/go-git.v4/plumbing/object"
    "gopkg.in/src-d/go-git.v4/storage/filesystem"
)

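func main() {
    // Usage: prog <git-dir> <object-dir>
    gitDir := os.Args[1]
    objDir := os.Args[2]

    // Mount objDir over the "objects" path of the git directory, so that
    // object reads and writes go to objDir while everything else
    // (refs, config, HEAD) still comes from gitDir.
    gitFs := osfs.New(gitDir)
    objFs := osfs.New(objDir)
    fs := polyfill.New(mount.New(gitFs, "objects", objFs))

    // Note: newer go-git releases changed this signature (an object cache
    // argument, no error return); adjust accordingly.
    s, err := filesystem.NewStorage(fs)
    if err != nil {
        panic(err)
    }

    // An in-memory worktree is enough here; only the object store is read.
    r, err := git.Open(s, memfs.New())
    if err != nil {
        panic(err)
    }

    iter, err := r.BlobObjects()
    if err != nil {
        panic(err)
    }
    defer iter.Close()

    // Print the hash of every blob reachable through the mounted object store.
    err = iter.ForEach(func(b *object.Blob) error {
        fmt.Println(b.Hash.String())
        return nil
    })
    if err != nil {
        panic(err)
    }
}
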
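
The mount trick in the example above boils down to path rerouting. The following standard-library sketch shows the idea in isolation; `mountResolver` is a hypothetical, heavily simplified stand-in and not how go-billy's mount helper is actually implemented.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// mountResolver reroutes paths under mountPoint to a second root and serves
// everything else from baseRoot. Hypothetical type for illustration only.
type mountResolver struct {
	baseRoot   string // e.g. the repository's git directory
	mountPoint string // path inside baseRoot that is overridden
	mountRoot  string // directory that replaces mountPoint
}

// resolve maps a repository-relative path to the real location it is read from.
func (m mountResolver) resolve(name string) string {
	clean := path.Clean("/" + name)[1:] // normalise and drop the leading slash
	if clean == m.mountPoint {
		return m.mountRoot
	}
	prefix := m.mountPoint + "/"
	if strings.HasPrefix(clean, prefix) {
		return path.Join(m.mountRoot, strings.TrimPrefix(clean, prefix))
	}
	return path.Join(m.baseRoot, clean)
}

func main() {
	m := mountResolver{
		baseRoot:   "/repos/project.git",
		mountPoint: "objects",
		mountRoot:  "/tmp/incoming-objects",
	}
	fmt.Println(m.resolve("HEAD"))            // /repos/project.git/HEAD
	fmt.Println(m.resolve("objects/ab/cdef")) // /tmp/incoming-objects/ab/cdef
}
```

Only paths at or below the mount point are redirected, which is why refs and config keep coming from the original git directory.
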
ZJvandeWeg commented 6 years ago

@smola I figured the alternates could take a similar approach, as having a []billy.Filesystem should be enough for this use case. Would such a patch be acceptable?

smola commented 6 years ago

@ZJvandeWeg The first thing would be deciding whether we want to implement alternates as a generalized storage concept or as something specific to the filesystem storage.

If it were general, we would probably add a new interface, storer.AlternateStorer, implemented by storage backends as a way to return multiple alternate storer.EncodedObjectStorer instances. I'm not sure there's any appealing use case for exposing this, though.

If it's filesystem-specific, then the filesystem package would need a root filesystem from which to find alternate paths. Maybe it would require something like filesystem.NewStorageFromRoot(fs billy.Filesystem, path string), so that go-git can find in fs both the repository and the paths (absolute or relative) that alternates point to.