src-d / hercules

Gaining advanced insights from Git repository history.

Hercules crashes with language filter #270

Closed senden9 closed 5 years ago

senden9 commented 5 years ago

Hi!

I tried hercules on a fairly large private repository at our company. It crashes when I use the language filter. As the CLI output below shows, the first variant, without --languages csharp, runs and generates valid (and interesting, thanks!) data. When I add the language filter, it crashes after some time.

$ ./hercules.linux_amd64 --burndown --granularity 2 --sampling 2 --pb --commits develop_stretching_hashes repo-cache > baukasten_burndown_stretching.pb
./hercules.linux_amd64 --burndown --granularity 2 --sampling 2 --pb --commits  8785,81s user 60,45s system 103% cpu 2:22:28,83 total

$ ./hercules.linux_amd64 --burndown --granularity 2 --sampling 2 --pb --commits develop_stretching_hashes repo-cache --languages csharp > baukasten_burndown_stretching_csharp_only.pb
finalizing...2019/04/07 20:23:49 Failed to run the pipeline on [git@github.com:ORG/REPO.git]
panic: empty history

goroutine 1 [running]:
gopkg.in/src-d/hercules.v10/leaves.(*BurndownAnalysis).groupSparseHistory(0xc002248400, 0xc002eddf50, 0xffffffffffffffff, 0xc02947b458, 0x88043c, 0xc000000180, 0x300000002)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/leaves/burndown.go:1472 +0x703
gopkg.in/src-d/hercules.v10/leaves.(*BurndownAnalysis).Finalize(0xc002248400, 0x15d2540, 0xc0000dc100)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/leaves/burndown.go:504 +0x55
gopkg.in/src-d/hercules.v10/internal/core.(*Pipeline).Run(0xc02947bcc8, 0xc003736000, 0x12b8, 0x1400, 0x0, 0x0, 0x0)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/internal/core/pipeline.go:866 +0x714
main.glob..func3(0x21d3ec0, 0xc000a6a0b0, 0x1, 0xb)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/cmd/hercules/root.go:270 +0x85b
gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra.(*Command).execute(0x21d3ec0, 0xc0000cc010, 0xb, 0xb, 0x21d3ec0, 0xc0000cc010)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra/command.go:766 +0x2ae
gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0x21d3ec0, 0xc000a225e0, 0xc00016ff88, 0x84f65f)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra/command.go:852 +0x2ec
gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra.(*Command).Execute(...)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra/command.go:800
main.main()
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/cmd/hercules/root.go:510 +0x32
./hercules.linux_amd64 --burndown --granularity 2 --sampling 2 --pb --commits  854,87s user 46,41s system 108% cpu 13:51,67 total

Since I cannot give anyone access to the repository to reproduce this error, feel free to close this issue if you want. In the meantime, I will try to reduce the set of command-line flags needed to produce the crash. This can take some time because the crash only happens after about a quarter of an hour. And yes, there are C# files :D.

Edit: More crashes

$ ./hercules.linux_amd64 --burndown --granularity 2 --sampling 2 --pb repo-cache --languages csharp > baukasten_burndown_csharp_only.pb
finalizing...2019/04/07 20:38:33 Failed to run the pipeline on [git@github.com:ORG/REPO.git]
panic: empty history

goroutine 1 [running]:
gopkg.in/src-d/hercules.v10/leaves.(*BurndownAnalysis).groupSparseHistory(0xc0087c0f00, 0xc002542cf0, 0xffffffffffffffff, 0xc01b679458, 0x88043c, 0xc000000180, 0x300000002)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/leaves/burndown.go:1472 +0x703
gopkg.in/src-d/hercules.v10/leaves.(*BurndownAnalysis).Finalize(0xc0087c0f00, 0x15d2540, 0xc0000da300)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/leaves/burndown.go:504 +0x55
gopkg.in/src-d/hercules.v10/internal/core.(*Pipeline).Run(0xc01b679cc8, 0xc002bb4000, 0x8b9, 0x900, 0x0, 0x0, 0x0)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/internal/core/pipeline.go:866 +0x714
main.glob..func3(0x21d3ec0, 0xc000a623f0, 0x1, 0x9)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/cmd/hercules/root.go:270 +0x85b
gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra.(*Command).execute(0x21d3ec0, 0xc0000cc010, 0x9, 0x9, 0x21d3ec0, 0xc0000cc010)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra/command.go:766 +0x2ae
gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0x21d3ec0, 0xc000a26620, 0xc00016ff88, 0x84f65f)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra/command.go:852 +0x2ec
gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra.(*Command).Execute(...)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra/command.go:800
main.main()
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/cmd/hercules/root.go:510 +0x32
./hercules.linux_amd64 --burndown --granularity 2 --sampling 2 --pb repo-cach  475,91s user 19,12s system 105% cpu 7:49,19 total

$ ./hercules.linux_amd64 --burndown --pb repo-cache --languages csharp > baukasten_burndown_csharp_only.pb                             
finalizing...2019/04/07 20:46:45 Failed to run the pipeline on [git@github.com:ORG/REPO.git]
panic: empty history

goroutine 1 [running]:
gopkg.in/src-d/hercules.v10/leaves.(*BurndownAnalysis).groupSparseHistory(0xc00de68e00, 0xc00329cba0, 0xffffffffffffffff, 0xc007ee7458, 0x88043c, 0xc000000180, 0x300000002)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/leaves/burndown.go:1472 +0x703
gopkg.in/src-d/hercules.v10/leaves.(*BurndownAnalysis).Finalize(0xc00de68e00, 0x15d2540, 0xc0000b2300)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/leaves/burndown.go:504 +0x55
gopkg.in/src-d/hercules.v10/internal/core.(*Pipeline).Run(0xc007ee7cc8, 0xc002144000, 0x8b9, 0x900, 0x0, 0x0, 0x0)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/internal/core/pipeline.go:866 +0x714
main.glob..func3(0x21d3ec0, 0xc000a3a550, 0x1, 0x5)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/cmd/hercules/root.go:270 +0x85b
gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra.(*Command).execute(0x21d3ec0, 0xc0000321f0, 0x5, 0x5, 0x21d3ec0, 0xc0000321f0)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra/command.go:766 +0x2ae
gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0x21d3ec0, 0xc000a1e600, 0xc00013ff88, 0x84f65f)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra/command.go:852 +0x2ec
gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra.(*Command).Execute(...)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra/command.go:800
main.main()
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/cmd/hercules/root.go:510 +0x32

$ ./hercules.linux_amd64 --burndown --languages csharp repo-cache
finalizing...2019/04/07 20:50:36 Failed to run the pipeline on [git@github.com:ORG/REPO.git]
panic: empty history

goroutine 1 [running]:
gopkg.in/src-d/hercules.v10/leaves.(*BurndownAnalysis).groupSparseHistory(0xc0000c1a00, 0xc00245ba40, 0xffffffffffffffff, 0xc0131e7458, 0x88043c, 0xc000000180, 0x300000002)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/leaves/burndown.go:1472 +0x703
gopkg.in/src-d/hercules.v10/leaves.(*BurndownAnalysis).Finalize(0xc0000c1a00, 0x15d2540, 0xc0000c0100)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/leaves/burndown.go:504 +0x55
gopkg.in/src-d/hercules.v10/internal/core.(*Pipeline).Run(0xc0131e7cc8, 0xc0027f6000, 0x8b9, 0x900, 0x0, 0x0, 0x0)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/internal/core/pipeline.go:866 +0x714
main.glob..func3(0x21d3ec0, 0xc000a3e200, 0x1, 0x4)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/cmd/hercules/root.go:270 +0x85b
gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra.(*Command).execute(0x21d3ec0, 0xc000032060, 0x4, 0x4, 0x21d3ec0, 0xc000032060)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra/command.go:766 +0x2ae
gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0x21d3ec0, 0xc000233690, 0xc00014df88, 0x84f65f)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra/command.go:852 +0x2ec
gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra.(*Command).Execute(...)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra/command.go:800
main.main()
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/cmd/hercules/root.go:510 +0x32

Versions

 $ pip freeze | rg labours
labours==10.0.1

 $ ./hercules.linux_amd64 version
Version: 10
Git:     b856b666909194669e93d2c8dd5f86a96d9f60dc

vmarkovtsev commented 5 years ago

Yes, the language filter is not super solid right now; I have noticed similar problems myself in the past. It should be fixed next week :construction_worker_man:

aldanor commented 5 years ago

Just to confirm, same here with --languages python:

$ ../hercules.linux_amd64 --burndown --burndown-people --languages python . > ../out.yaml

It works till the very end and then crashes with 'empty history' (it's definitely not empty...):

finalizing...2019/04/08 12:06:22 Failed to run the pipeline on [...]
panic: empty history

goroutine 1 [running]:
gopkg.in/src-d/hercules.v10/leaves.(*BurndownAnalysis).groupSparseHistory(0xc000b0ee00, 0xc00247eb40, 0xffffffffffffffff, 0xc000048070, 0xc000048000, 0x0, 0x91d511)
        /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/leaves/burndown.go:1472 +0x715
gopkg.in/src-d/hercules.v10/leaves.(*BurndownAnalysis).Finalize(0xc000b0ee00, 0x1535660, 0xc000b0e000)
        /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/leaves/burndown.go:504 +0x55
gopkg.in/src-d/hercules.v10/internal/core.(*Pipeline).Run(0xc001a11c20, 0xc0024da000, 0x1bb5, 0x1c00, 0x0, 0x0, 0x0)
        /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/internal/core/pipeline.go:852 +0x6cf
main.glob..func3(0x20e7d80, 0xc000b10550, 0x1, 0x5)
        /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/cmd/hercules/root.go:270 +0x843
gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra.(*Command).execute(0x20e7d80, 0xc0000301f0, 0x5, 0x5, 0x20e7d80, 0xc0000301f0)
        /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra/command.go:766 +0x2cc
gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0x20e7d80, 0xc000ae65c0, 0xc000ae6590, 0xc000b0e600)
        /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra/command.go:852 +0x2fd
gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra.(*Command).Execute(0x20e7d80, 0x84dd40, 0xc0000440b8)
        /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra/command.go:800 +0x2b
main.main()
        /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/cmd/hercules/root.go:510 +0x31

vmarkovtsev commented 5 years ago

The original error should be fixed - please ping here if not.

However, I think it may still fail on some repos. I will refer back to this issue once somebody reproduces it.

aldanor commented 5 years ago

@vmarkovtsev Thanks, that seemed to fix the original problem, but now there's more :/

The one I've been encountering with the private repo I've been testing it on is:

 7773 / 8380 [==============================================>---] 14s [3e2dc13] 
[INFO] 2019/04/09 10:56:13 ====TREE====
0 67010
9 -1
[ERROR] 2019/04/09 10:56:13 Burndown failed on commit #6810 (7773) 3e2dc13a31759eb034463f856a846c9348d01bd1: <...>.py: internal integrity error src 0 != 9 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 -> 1957261748983dedf1947b5688fc61b037e3f3b1
[ERROR] 2019/04/09 10:56:13 Failed to run the pipeline on [<...>.git]
2019/04/09 10:56:13 failed to run the pipeline: <...>.py: internal integrity error src 0 != 9 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 -> 1957261748983dedf1947b5688fc61b037e3f3b1

This error doesn't seem to occur without setting the language filter.

Any ideas why this could be happening?

aldanor commented 5 years ago

Here's another reproducible case (probably unrelated to the previous one):

$ git clone https://github.com/tensorflow/tensorflow.git
$ cd tensorflow
$ hercules.linux_amd64 --burndown --burndown-people --languages python .
 2 / 68788 [>----------------------------------------------] 1h54m46s [f41959c]
[ERROR] 2019/04/09 11:46:29 TreeDiff failed on commit #1 (2) f41959ccb2d9d4c722fe8fc3351401d53bcf4900: EOF
[ERROR] 2019/04/09 11:46:29 Failed to run the pipeline on [https://github.com/tensorflow/tensorflow.git]
2019/04/09 11:46:29 failed to run the pipeline: EOF

It doesn't seem to occur if the language filter is removed.

Same crash with other reasonably big repos I've tested it on, like https://github.com/pytest-dev/pytest.git.

vmarkovtsev commented 5 years ago

This is the other one, exactly. Thanks, I will fix it.

aldanor commented 5 years ago

Just tested it on a small repo:

$ mkdir foo; cd foo; git init
$ touch foo.py; git add foo.py; git commit -m 1
$ echo foo > foo.py; git add foo.py; git commit -m 2
$ hercules.linux_amd64 --burndown --burndown-people --languages python .
 0 / 5 [-----------------------------------------------------------------------]
[ERROR] 2019/04/09 12:36:21 TreeDiff failed on commit #1 (2) 4b5b10d421d173787db9cb757881889f05d77e84: EOF
2019/04/09 12:36:21 failed to run the pipeline: EOF

Looks like it doesn't like empty files (or commits with those).

Re: the "integrity error" with the private repo I mentioned above: that commit was actually much the same story. A Python file that had previously been empty (but already existed in the repo) was modified, and the integrity error occurred at the commit where it was modified.
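
The source hash in the integrity error log above, e69de29bb2d1d6434b8b29ae775ad8c2e48c5391, is the object ID Git assigns to an empty blob, which is consistent with this observation. Below is a minimal sketch for checking that, assuming a SHA-1 repository; the helper name `git_blob_sha1` is made up for illustration and is not part of hercules.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    # Git hashes a header "blob <size>\0" followed by the raw file bytes.
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


# An empty file hashes to the "src" ID seen in the integrity error.
print(git_blob_sha1(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Comparing a blob ID from an error message against this value is a quick way to tell whether the file involved was empty on the "before" side of the change.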

senden9 commented 5 years ago

I can confirm it is still not fixed for me. But thanks for the fast reaction time :D

❯ ./hercules.linux_amd64 --burndown --granularity 2 --sampling 2 --pb --commits develop_stretching_hashes repo-cache --languages csharp > baukasten_burndown_stretching_csharp_only.pb
finalizing...[ERROR] 2019/04/09 16:43:29 Failed to run the pipeline on [git@github.com:ORG/REPO.git]
panic: empty history

goroutine 1 [running]:
gopkg.in/src-d/hercules.v10/leaves.(*BurndownAnalysis).groupSparseHistory(0xc00ed2f0e0, 0xc0018c41e0, 0xffffffffffffffff, 0xc02745d440, 0x880f6c, 0xc000000180, 0x300000002)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/leaves/burndown.go:1495 +0x703
gopkg.in/src-d/hercules.v10/leaves.(*BurndownAnalysis).Finalize(0xc00ed2f0e0, 0x15db080, 0xc0000d2480)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/leaves/burndown.go:515 +0x55
gopkg.in/src-d/hercules.v10/internal/core.(*Pipeline).Run(0xc02745dcb8, 0xc003ace000, 0x12b8, 0x1400, 0x0, 0x0, 0x0)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/internal/core/pipeline.go:889 +0x726
main.glob..func3(0x21dfec0, 0xc000a840b0, 0x1, 0xb)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/cmd/hercules/root.go:270 +0xa42
gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra.(*Command).execute(0x21dfec0, 0xc0000cc010, 0xb, 0xb, 0x21dfec0, 0xc0000cc010)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra/command.go:766 +0x2ae
gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0x21dfec0, 0xc000a2c620, 0xc00016ff88, 0x85018f)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra/command.go:852 +0x2ec
gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra.(*Command).Execute(...)
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/vendor/github.com/spf13/cobra/command.go:800
main.main()
    /home/travis/gopath/src/gopkg.in/src-d/hercules.v10/cmd/hercules/root.go:510 +0x32
./hercules.linux_amd64 --burndown --granularity 2 --sampling 2 --pb --commits  884,79s user 50,11s system 107% cpu 14:31,22 total

❯ ./hercules.linux_amd64 version
Version: 10
Git:     cee6b8ff76448e943dceda031ca2f467d5616926

vmarkovtsev commented 5 years ago

@senden9 I wonder if --languages "c#" will give you a better result.

senden9 commented 5 years ago

c# works!

Why I used "csharp":

❯ ./hercules.linux_amd64 --help
[...]
      --languages string [TreeDiff]             List of programming languages to analyze. Separated by comma ",". Names
                                                are at https://doc.bblf.sh/languages.html "all" is the special name
                                                which disables this filter and lets all the files through. The default
                                                value is "all".

Go to https://doc.bblf.sh/languages.html -> see that the key is named "csharp".
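
The mismatch suggests that a name the filter does not recognize may simply match no files, which would fit the "empty history" panic, although the thread does not confirm the mechanism. Below is a minimal sketch of how a command-line tool could map or reject such names up front. This is an assumption-laden illustration, not the hercules implementation: `ALIASES`, `KNOWN` and `normalize_languages` are hypothetical, and only the "csharp" and "c#" spellings come from this thread.

```python
# Hypothetical alias table. Only the "csharp" -> "c#" pair comes from this
# thread; it is not taken from the hercules source.
ALIASES = {"csharp": "c#"}

# Hypothetical set of names the filter is assumed to recognize.
KNOWN = {"c#", "python", "go"}


def normalize_languages(raw, known=KNOWN, aliases=ALIASES):
    """Split a comma-separated --languages value and validate each name."""
    names = []
    for item in raw.split(","):
        name = item.strip().lower()
        # Map documentation-style keys to the spelling the filter expects.
        name = aliases.get(name, name)
        # "all" is the special value that disables the filter.
        if name != "all" and name not in known:
            raise ValueError("unknown language: %r" % item.strip())
        names.append(name)
    return names


print(normalize_languages("csharp,Python"))  # ['c#', 'python']
```

Failing early on an unknown name would surface the problem within seconds instead of after a long run that ends in a panic.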