Closed ffr4nz closed 7 years ago
Thanks @ffr4nz, I will take a closer look at this and will merge the PR. By the way, do you think it would be good to support both monitoring every push and monitoring only the merges?
IMHO, monitoring every push gives you more chances to find sensitive data. However, we can add a simple flag to select which events you want to search.
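For illustration, here is a rough sketch of what such a flag could look like. The `--mode` option and the helper names `wanted` and `is_merged_pr` are hypothetical and not part of the PR. It assumes events shaped like the GitHub events API output, where a push has `type` set to `PushEvent` and a merge shows up as a `PullRequestEvent` with `action` equal to `closed` and a `merged` field on the pull request; the exact payload fields may differ.

```python
import argparse


def is_merged_pr(event):
    """Return True if the event looks like a pull request that was merged.

    Assumes the payload carries `action` and `pull_request.merged`.
    """
    payload = event.get("payload") or {}
    pull_request = payload.get("pull_request") or {}
    return (
        event.get("type") == "PullRequestEvent"
        and payload.get("action") == "closed"
        and bool(pull_request.get("merged"))
    )


def wanted(event, mode):
    """Decide whether an event should be scanned for the given mode."""
    is_push = event.get("type") == "PushEvent"
    if mode == "push":
        return is_push
    if mode == "merge":
        return is_merged_pr(event)
    # mode == "both": scan pushes as well as merged pull requests
    return is_push or is_merged_pr(event)


def parse_args(argv=None):
    """Parse the hypothetical --mode flag (push, merge, or both)."""
    parser = argparse.ArgumentParser(description="Select which events to scan")
    parser.add_argument("--mode", choices=("push", "merge", "both"), default="both")
    return parser.parse_args(argv)
```

Defaulting to `both` keeps the wider coverage described above, while `--mode merge` narrows the scan for anyone who only cares about merged changes.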
I would argue this is outside the scope of this project. This would take it towards the realm of OSINT. If you want to begin mass scanning GitHub, that's fine, but know that doing so will open up a can of worms. In previous tests I've run, it only catches around 5% of commits with 3-4 API keys scraping at the same time within the rate limit. With my list (around 1.4x the size), it returns around five or six results per day, running all day with four keys. To be fair, that's with false positive removal, but false positives could also become an issue given the rate at which this list can return them. I noted around 10% false positives, even after removing 7% of results from the raw set. It's extremely difficult to accurately scale this kind of scanning, and if you decide to go that route, you should do it correctly to avoid plaguing innocent project authors with false reports. If you truly do want to take it in this direction, let's chat @techguan, I'm in the middle of working on just this kind of thing. If you'd like, I can move a bunch of my systems to this project to join forces. But again, this would drastically shift the direction of the project.
If you want an idea of what a timeline scanning system looks like, here's an ultra stripped down version I wrote a while ago: https://github.com/Plazmaz/GHScraper
@ffr4nz @Plazmaz Sorry for not getting back to this earlier. I looked at this and I don't think this is the route this tool wants to take (apart from the technical problems related to rate limiting while scanning global feeds). We could build a separate tool for this purpose (or contribute to what @Plazmaz has built). @ffr4nz As always, I appreciate your contribution, but let's not make this feature part of github-dorks for now.
Added a global timeline monitoring feature. Now you can monitor all users' pushes on GitHub thanks to the global timeline feed provided by GitHub.
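As a rough idea of how monitoring the global timeline can work, the sketch below polls GitHub's public events endpoint (`https://api.github.com/events`) and lists the repositories that received pushes. It is not the code from this PR. The helper names `build_headers`, `fetch_events`, and `pushed_repos` are made up for illustration, and the sketch assumes the endpoint returns an `ETag` header that can be sent back as `If-None-Match` so unchanged feeds answer with 304.

```python
import json
import urllib.error
import urllib.request

EVENTS_URL = "https://api.github.com/events"  # public events feed


def build_headers(etag=None, token=None):
    """Build request headers; token and etag are optional."""
    headers = {
        "Accept": "application/vnd.github+json",
        "User-Agent": "timeline-monitor-sketch",  # hypothetical client name
    }
    if token:
        headers["Authorization"] = "Bearer " + token
    if etag:
        # Conditional request: an unchanged feed should answer 304
        headers["If-None-Match"] = etag
    return headers


def fetch_events(etag=None, token=None):
    """Fetch one page of public events. Returns (events or None, etag)."""
    request = urllib.request.Request(EVENTS_URL, headers=build_headers(etag, token))
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return json.load(response), response.headers.get("ETag")
    except urllib.error.HTTPError as error:
        if error.code == 304:  # nothing new since the last poll
            return None, etag
        raise


def pushed_repos(events):
    """Return the names of repositories that appear in PushEvent entries."""
    return [
        event["repo"]["name"]
        for event in events
        if event.get("type") == "PushEvent" and "repo" in event
    ]
```

A caller would loop over `fetch_events`, pass the returned ETag into the next call, and sleep between polls to stay within the rate limit discussed in the comments above.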