src-d / go-git

Project has been moved to: https://github.com/go-git/go-git
Apache License 2.0

Fetch reports "already up-to-date" when trying to fetch greater depth on a shallow clone #1250

Open lucastheisen opened 4 years ago

lucastheisen commented 4 years ago

I have code that operates on a shallow clone and needs to progressively fetch greater depth until it finds the change it is looking for. Basically this:

func findVersionFrom(repo *git.Repository, hash plumbing.Hash, baseParser func(string) (string, error)) (string, error) {
    // should probably use git rev-list instead of log
    //   https://github.com/src-d/go-git/issues/757
    entries, err := repo.Log(&git.LogOptions{From: hash})
    if err != nil {
        return "", fmt.Errorf("unable to read log: %v", err)
    }

    versionFile := "version.txt"
    current := ""
    depth := 0
    versionChanged := errors.New("version changed") // sentinel used to stop the walk
    parent := plumbing.ZeroHash

    err = entries.ForEach(func(c *object.Commit) error {
        if len(c.ParentHashes) > 0 {
            parent = c.ParentHashes[0]
        }

        tree, err := c.Tree()
        if err != nil {
            return fmt.Errorf("cannot retrieve tree for %v: %v", c, err)
        }

        file, err := tree.File(versionFile)
        if err != nil {
            return fmt.Errorf("unable to retrieve %s: %v", versionFile, err)
        }

        versionContent, err := file.Contents()
        if err != nil {
            return fmt.Errorf("unable to read %s: %v", versionFile, err)
        }

        versionBase, err := baseParser(versionContent)
        if err != nil {
            return fmt.Errorf("failed to parse version from [%s]: %v", versionContent, err)
        }

        if current == "" {
            current = versionBase
            depth = 0
            return nil
        }

        if versionBase != current {
            return versionChanged
        }

        depth = depth + 1

        return nil
    })
    if err == plumbing.ErrObjectNotFound {
        // ran off the end of the shallow history, so try to deepen the clone
        fetchDepth := 50
        err = repo.Fetch(&git.FetchOptions{Depth: fetchDepth})
        if err != nil {
            return "", fmt.Errorf("failed to fetch %d, parent %v: %v", fetchDepth, parent, err)
        }
    } else if err != nil && err != versionChanged {
        return "", err
    }

    return fmt.Sprintf("%s.%d", current, depth), nil
}
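
For illustration, the flow I am after (walk the log, and when the history is truncated, deepen and walk again) can be sketched without go-git at all. Everything here is hypothetical: `shallowRepo`, `deepen`, `countSinceChange` and `errHistoryTruncated` are stand-ins I made up, and `deepen` models a fetch that really does extend the local history, which is exactly the part that does not happen in this report.

```go
package main

import (
	"errors"
	"fmt"
)

// errHistoryTruncated is a hypothetical stand-in for plumbing.ErrObjectNotFound:
// the walk reached a commit whose parent is not available locally.
var errHistoryTruncated = errors.New("history truncated")

// shallowRepo is a toy model of a shallow clone, not a go-git type.
type shallowRepo struct {
	versions []string // version per commit on the remote, newest first
	have     int      // number of commits available locally
}

// deepen models a fetch that extends the local history to at most depth commits.
func (r *shallowRepo) deepen(depth int) {
	if depth > len(r.versions) {
		depth = len(r.versions)
	}
	if depth > r.have {
		r.have = depth
	}
}

// countSinceChange walks the local history from the tip and counts the
// commits after the tip that still carry the tip's version.
func (r *shallowRepo) countSinceChange() (string, int, error) {
	if r.have < 1 || len(r.versions) < 1 {
		return "", 0, errors.New("no commits available")
	}
	current := r.versions[0]
	for i := 1; i < r.have; i++ {
		if r.versions[i] != current {
			return current, i - 1, nil
		}
	}
	if r.have < len(r.versions) {
		return "", 0, errHistoryTruncated
	}
	return current, r.have - 1, nil
}

// findVersion retries the walk, deepening by step commits each time the
// local history turns out to be too short.
func findVersion(r *shallowRepo, step int) (string, error) {
	if step < 1 {
		return "", errors.New("step must be positive")
	}
	for {
		current, n, err := r.countSinceChange()
		if err == nil {
			return fmt.Sprintf("%s.%d", current, n), nil
		}
		if !errors.Is(err, errHistoryTruncated) {
			return "", err
		}
		r.deepen(r.have + step)
	}
}

func main() {
	r := &shallowRepo{
		versions: []string{"1.2", "1.2", "1.2", "1.2", "1.1", "1.1"},
		have:     1, // depth-1 clone
	}
	fmt.Println(findVersion(r, 2)) // prints "1.2.3 <nil>"
}
```

With a real repository the loop only terminates if each fetch actually adds commits, so a fetch that reports "already up-to-date" would have to be treated as a failure rather than retried.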

I walked through a lot of the fetch code and came across this:

    req.Wants, err = getWants(r.s, refs)
    if len(req.Wants) > 0 {
        req.Haves, err = getHaves(localRefs, remoteRefs, r.s)
        if err != nil {
            return nil, err
        }

        if err = r.fetchPack(ctx, o, s, req); err != nil {
            return nil, err
        }
    }

It appears that r.fetchPack is only called if getWants returns at least one want. But getWants only seems to check the hash at the tip of each branch:

    wants := map[plumbing.Hash]bool{}
    for _, ref := range refs {
        hash := ref.Hash()
        exists, err := objectExists(localStorer, ref.Hash())
        if err != nil {
            return nil, err
        }

        if !exists {
            wants[hash] = true
        }
    }

So since it's a shallow clone, it already has the same tip as the remote, and no wants are detected.
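
To make the effect concrete, here is a small standalone model of that check. It does not use go-git: `commit`, `wantsByTip` and `wantsWhenShallow` are hypothetical names of mine. `wantsByTip` mirrors the tip-only test quoted above, while `wantsWhenShallow` is only one possible alternative (also want a tip when its local history stops at a missing parent), not a claim about how the library should or does fix this.

```go
package main

import "fmt"

// commit is a hypothetical, minimal stand-in for a commit object.
type commit struct {
	hash   string
	parent string // "" for a root commit
}

// wantsByTip mirrors the tip-only check: a remote tip is wanted only
// when the tip object itself is missing from local storage.
func wantsByTip(local map[string]commit, remoteTips []string) []string {
	var wants []string
	for _, tip := range remoteTips {
		if _, ok := local[tip]; !ok {
			wants = append(wants, tip)
		}
	}
	return wants
}

// wantsWhenShallow is a hypothetical variant: a tip is also wanted when
// following parents from it reaches a commit that is not stored locally.
func wantsWhenShallow(local map[string]commit, remoteTips []string) []string {
	var wants []string
	for _, tip := range remoteTips {
		c, ok := local[tip]
		if !ok {
			wants = append(wants, tip)
			continue
		}
		for c.parent != "" {
			p, ok := local[c.parent]
			if !ok {
				// shallow boundary: the parent was never fetched
				wants = append(wants, tip)
				break
			}
			c = p
		}
	}
	return wants
}

func main() {
	// Depth-1 clone: the tip c3 is present, its parent c2 is not.
	local := map[string]commit{"c3": {hash: "c3", parent: "c2"}}
	remoteTips := []string{"c3"}

	fmt.Println("tip-only wants:", len(wantsByTip(local, remoteTips)))
	fmt.Println("shallow-aware wants:", len(wantsWhenShallow(local, remoteTips)))
}
```

In this model the tip-only check yields no wants for the depth-1 clone, so nothing would be fetched, whereas the shallow-aware variant still asks for `c3`.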