libgit2 / git2go

Git to Go; bindings for libgit2. Like McDonald's but tastier.
MIT License

Library constantly increases memory usage in long-running applications #959

Open pPrecel opened 5 months ago

pPrecel commented 5 months ago

Description:

In a long-running application, I observed that memory usage grows constantly whenever the application uses the git2go library to clone a repo or to compute the last commit hash of a branch/tag. We run the application in Kubernetes, so the pod eventually runs out of memory.

I prepared a hello-world application that reproduces the problem easily. It clones the repo (by initializing a bare repository and fetching the remote) in an infinite loop; after each iteration it removes the tmp directory and calls the Free method on every git2go object:

package main

import (
    "fmt"
    "os"

    git2go "github.com/libgit2/git2go/v34"
)

func main() {
    fmt.Println("starting...")

    iter := 1
    for {
        fmt.Printf("iteration %d...\n", iter)

        err := fetchRepo("https://github.com/kyma-project/serverless")
        if err != nil {
            fmt.Printf("WARN: %s\n", err.Error())
        }

        iter++
    }

}

const (
    branchRefPattern = "refs/remotes/origin"
)

func fetchRepo(repoUrl string) error {
    // create tmp dir
    repoDir := "/tmp/git2go_test_"
    err := os.MkdirAll(repoDir, 0700)
    if err != nil {
        return err
    }
    defer os.RemoveAll(repoDir)

    // init repo structure
    repo, err := git2go.InitRepository(repoDir, true)
    if err != nil {
        return err
    }
    defer repo.Free()

    // create/reuse remote
    remote, err := lookupCreateRemote(repo, repoUrl)
    if err != nil {
        return err
    }
    defer remote.Free()

    // fetch remote
    err = remote.Fetch(nil,
        &git2go.FetchOptions{
            DownloadTags: git2go.DownloadTagsAll,
        }, "")
    if err != nil {
        return err
    }

    return nil
}

// lookupCreateRemote looks up the remote with the given name, if it doesn't exist it creates it
func lookupCreateRemote(repo *git2go.Repository, url string) (*git2go.Remote, error) {
    remote, err := repo.Remotes.Lookup("origin")
    if err == nil {
        return remote, nil
    }

    return repo.Remotes.Create("origin", url)
}
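
To narrow down where the memory goes, one option is to log the Go heap size next to the process RSS on every iteration. The following is only a sketch, assuming a Linux container where /proc/self/status is readable; `logMemory` and `parseVmRSSKiB` are hypothetical helpers, not part of git2go or of the reproduction above. If RSS keeps climbing while the Go heap stays flat, the growth is more likely in native memory allocated through cgo than in Go-managed memory.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"runtime"
	"strconv"
	"strings"
)

// parseVmRSSKiB extracts the VmRSS value (reported in kB) from the text of
// /proc/<pid>/status. It returns an error if the field is missing.
func parseVmRSSKiB(status string) (uint64, error) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		line := sc.Text()
		if !strings.HasPrefix(line, "VmRSS:") {
			continue
		}
		fields := strings.Fields(line) // e.g. ["VmRSS:", "20480", "kB"]
		if len(fields) < 2 {
			return 0, fmt.Errorf("malformed VmRSS line: %q", line)
		}
		return strconv.ParseUint(fields[1], 10, 64)
	}
	return 0, fmt.Errorf("VmRSS not found")
}

// logMemory prints the Go heap in use next to the resident set size of the
// whole process, which also includes memory allocated by C libraries.
func logMemory(iter int) {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)

	rss := "n/a" // stays "n/a" on systems without /proc
	if data, err := os.ReadFile("/proc/self/status"); err == nil {
		if kib, err := parseVmRSSKiB(string(data)); err == nil {
			rss = fmt.Sprintf("%d KiB", kib)
		}
	}
	fmt.Printf("iter=%d go_heap=%d KiB rss=%s\n", iter, m.HeapAlloc/1024, rss)
}

func main() {
	// In the reproduction above this call would go inside the for loop,
	// right after fetchRepo returns.
	logMemory(1)
}
```

Calling `logMemory(iter)` once per loop iteration should be enough to see which of the two numbers follows the growth reported by `kubectl top pods`.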

The app can be built by running docker build -t <tag> . on this Dockerfile (or you can simply reuse my image pprecel/git2go:latest):

FROM golang:1.22.3-alpine3.20 as builder

WORKDIR /app

RUN apk add --no-cache gcc libc-dev
RUN apk add --no-cache --repository http://dl-cdn.alpinelinux.org/alpine/v3.18/community libgit2-dev=1.5.2-r0

COPY . /app

RUN pwd

RUN go mod tidy
RUN CGO_ENABLED=1 GOOS=linux GOARCH=amd64 go build -o /app/main /app/main.go

FROM alpine:latest

RUN apk add --no-cache --repository http://dl-cdn.alpinelinux.org/alpine/v3.18/community libgit2-dev=1.5.2-r0

COPY --from=builder /app /app

CMD ["/app/main"]

The application can be run in any container environment, so both Docker and Kubernetes will show the same behavior:

docker run -d --name git2go-2 pprecel/git2go:latest

or

kubectl run git2go --image=pprecel/git2go:latest

On my machine, memory usage grew overnight from under 30Mi to about 3000Mi, and it is still increasing. Example:

kubectl top pods
NAME                       CPU(cores)   MEMORY(bytes)
git2go                       532m            20Mi

after a night

kubectl top pods
NAME                       CPU(cores)   MEMORY(bytes)
git2go                       532m            3012Mi

pPrecel commented 5 months ago

The output from the ContainerWatch extension:

[Image: Screenshot 2024-06-10 at 09 33 05]