src-d / hercules

Gaining advanced insights from Git repository history.

[performance] The burndown analysis runs forever on the C++ driver repo #294

Closed warenlg closed 5 years ago

warenlg commented 5 years ago

I'm not able to process some driver repos from the bblfsh codebase with the command below. In particular, the C++ driver, https://github.com/bblfsh/cpp-driver, looks like a small, ordinary repo, but the job runs forever: it had not finished after one night. The other driver repos are fine (less than 1 sec of processing), apart from the JS and Go drivers, which took 20 min. I ran version 9.2.0 of the executable locally on Ubuntu 18.04.2.

./hercules_v9.2.0 --pb --burndown --burndown-people --burndown-files --devs --couples --hibernation-distance=1000 --skip-blacklist path_to_repo

vmarkovtsev commented 5 years ago

Questions:

warenlg commented 5 years ago

It sits at 100% CPU and hangs on this commit: https://github.com/bblfsh/cpp-driver/commit/68d4981a185243d88510f7cc7cf9b8da3617455a

warenlg commented 5 years ago

Same thing here, actually:

https://github.com/bblfsh/go-driver/commit/26c8f55eab35de5648bd0a1cf8a663d32cf3c1fe
https://github.com/bblfsh/go-driver/commit/2995529cf147987ee244abef083aca8e915f1fee
https://github.com/bblfsh/javascript-driver/commit/f63272417f57f7d7675f832112ed44c92a1b4a1e

vmarkovtsev commented 5 years ago

Root cause: very large text files make diffmatchpatch enter an infinite loop. I need to add a timeout for diffing.