src-d / hercules

Gaining advanced insights from Git repository history.
Other
2.63k stars 334 forks source link

index out of range sometimes when burndown.PeopleNumber is provided #237

Open bobheadxi opened 5 years ago

bobheadxi commented 5 years ago

I might just be holding it wrong, but on the repo https://github.com/ubclaunchpad/inertia.git, the following code:

    return g.pipe.DeployItem(&leaves.BurndownAnalysis{
        TrackFiles:  true,
        Granularity: 30,
        Sampling:    30,

        PeopleNumber: 10,
    }).(hercules.LeafPipelineItem)

panics when run with:

g.pipe.Initialize(nil)
g.pipe.Run(commits)

stacktrace:

goroutine 7 [running]:
testing.tRunner.func1(0xc000160200)
        /usr/local/Cellar/go/1.12/libexec/src/testing/testing.go:830 +0x388
panic(0x48f6c20, 0x568a160)
        /usr/local/Cellar/go/1.12/libexec/src/runtime/panic.go:522 +0x1b5
gopkg.in/src-d/hercules.v9/leaves.(*BurndownAnalysis).updateAuthor(0xc000228ff0, 0x4c000, 0x4c000, 0xe)
        /Users/robertlin/go/pkg/mod/gopkg.in/src-d/hercules.v9@v9.1.0/leaves/burndown.go:1122 +0x1be
gopkg.in/src-d/hercules.v9/internal/burndown.(*File).updateTime(0xc00189a2c0, 0x4c000, 0x4c000, 0xe)
        /Users/robertlin/go/pkg/mod/gopkg.in/src-d/hercules.v9@v9.1.0/internal/burndown/file.go:53 +0x91
gopkg.in/src-d/hercules.v9/internal/burndown.NewFile(0x4c000, 0xe, 0xc0009de380, 0xc00189a280, 0x4, 0x4, 0x2)
        /Users/robertlin/go/pkg/mod/gopkg.in/src-d/hercules.v9@v9.1.0/internal/burndown/file.go:67 +0xce
gopkg.in/src-d/hercules.v9/leaves.(*BurndownAnalysis).newFile(0xc000228ff0, 0xade27e51688d33a1, 0x95e501b21412e16e, 0xc3bbb28c, 0xc000b43bf0, 0xa, 0x13, 0x0, 0xe, 0xc00183f3b8, ...)
        /Users/robertlin/go/pkg/mod/gopkg.in/src-d/hercules.v9@v9.1.0/leaves/burndown.go:1178 +0xf5
gopkg.in/src-d/hercules.v9/leaves.(*BurndownAnalysis).handleInsertion(0xc000228ff0, 0xc000d66500, 0x13, 0xc00449b980, 0xc0001051d8, 0x9)
        /Users/robertlin/go/pkg/mod/gopkg.in/src-d/hercules.v9@v9.1.0/leaves/burndown.go:1198 +0x189
gopkg.in/src-d/hercules.v9/leaves.(*BurndownAnalysis).Consume(0xc000228ff0, 0xc00449b200, 0x569f980, 0x9, 0xc0001051d8)
        /Users/robertlin/go/pkg/mod/gopkg.in/src-d/hercules.v9@v9.1.0/leaves/burndown.go:363 +0x3f8
gopkg.in/src-d/hercules.v9/internal/core.(*Pipeline).Run(0xc00111d900, 0xc001112000, 0x4a8, 0x500, 0x0, 0x0, 0x0)
        /Users/robertlin/go/pkg/mod/gopkg.in/src-d/hercules.v9@v9.1.0/internal/core/pipeline.go:777 +0xfd4
github.com/bobheadxi/projector/analysis.(*GitRepoAnalyser).exec(0xc0033cc3f0, 0x4cd58c0, 0xc000172690, 0x5c29288)
        /Users/robertlin/go/src/github.com/bobheadxi/projector/analysis/git.go:102 +0xca
github.com/bobheadxi/projector/analysis.(*GitRepoAnalyser).Analyze(0xc0033cc3f0, 0xc000a33050, 0xc0033cc3f0, 0xc0000c10e0)
        /Users/robertlin/go/src/github.com/bobheadxi/projector/analysis/git.go:77 +0x170
github.com/bobheadxi/projector/analysis.TestGitRepoAnalyser(0xc000160200)
        /Users/robertlin/go/src/github.com/bobheadxi/projector/analysis/integration_test.go:29 +0x29c
testing.tRunner(0xc000160200, 0x4b1ba98)
        /usr/local/Cellar/go/1.12/libexec/src/testing/testing.go:865 +0xc0
created by testing.(*T).Run
        /usr/local/Cellar/go/1.12/libexec/src/testing/testing.go:916 +0x357
FAIL    github.com/bobheadxi/projector/analysis 27.512s

running a similar configuration directly using hercules (I think) works:

hercules --burndown --burndown-people --burndown-files https://github.com/ubclaunchpad/inertia.git

i noticed that the --burndown-people option sets the number of people based on facts instead, which I guess runs a separate analysis to get an accurate number of people

emulating this behaviour by removing PeopleNumber from my call to DeployItem and instead providing ConfigBurndownTrackPeople=true in facts works:

    g.pipe.Initialize(map[string]interface{}{
        leaves.ConfigBurndownTrackPeople: true,
    })
        g.pipe.Run(commits)
=== RUN   TestGitRepoAnalyser
--- PASS: TestGitRepoAnalyser (88.38s)
    logger.go:114: 2019-03-12T11:43:17.293-0700 INFO    git-manager     cloning repository      {"remote": "https://github.com/ubclaunchpad/inertia.git", "dir": "tmp/ubclaunchpad/inertia", "auth": false}
    logger.go:114: 2019-03-12T11:43:58.099-0700 INFO    git-manager     repo cloned     {"remote": "https://github.com/ubclaunchpad/inertia.git"}
PASS
ok      github.com/bobheadxi/projector/analysis 88.501s

My suspicion is this happens because the repository https://github.com/ubclaunchpad/inertia has 21 contributors, while the repo I was previously using for testing (as in #235), https://github.com/bobheadxi/calories, only has 4 contributors, so the PeopleNumber: 10 worked for the latter but not the former. I feel the expected behaviour here might be to either to ignore discovered authors after the first PeopleNumber authors. alternatively:

I'd be happy to put in a PR to fix up the documentation or anything after clarification. thanks!

vmarkovtsev commented 5 years ago

This is some legacy side effect, thank you for thorough testing. I think the best way to fix this is to remove PeopleNumber and make reversedPeopleDict public. Then all the places which check PeopleNumber should be changed to check len(ReversedPeopleDict) instead.