Open kalexmills opened 3 years ago
Ok, maybe it's not so bad.
/ # cat data/visited_repos.csv | sort | uniq | wc -l
171887
/ # wc -l repos.csv
223664 repos.csv
It seems that VetBot is just writing duplicates into visited_repos.csv, which is read into a set.
So maybe it's actually visiting repositories twice. It still seems there is an issue, though.
Welp...
That's a thing.