nelsonic / github-scraper

🕷 🕸 crawl GitHub web pages for insights we can't GET from the API ... 💡
425 stars, 96 forks

Get info from the new metric "Used By" #106

Open matheusflauzino opened 5 years ago

matheusflauzino commented 5 years ago

GitHub introduced a new metric, which I find much more relevant than stars or downloads. It shows the number of times a dependency is used by other repositories. It would be nice to have that metric added to Github-scraper.

nelsonic commented 5 years ago

@matheusflauzino agree 100% that the Used by metric is useful. 👍 If you or anyone else has time to add the test(s) and selector in a Pull Request, I'd be happy to merge it.

nelsonic commented 5 years ago

I just took a quick look into this. The "Used by" counter appears to be there in the DOM when https://github.com/dwyl/decache is viewed in a browser (screenshot taken 2019-07-04).

But sadly, it does not appear when the page is requested via node.js. Our node.js http.request only receives the server-rendered HTML and none of the content that client-side JavaScript renders afterwards.
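For anyone who wants to reproduce this from node.js, here is a minimal sketch of the check. It only looks for the literal text "Used by" in the raw response, which is an assumption about how the counter is labelled in the markup; `hasUsedBy`, `fetchHtml` and the `User-Agent` value are hypothetical names for illustration and are not part of github-scraper.

```javascript
// Sketch only: does the raw, server-rendered HTML mention "Used by"?
const https = require('https');

// Hypothetical helper: true if the HTML contains the "Used by" label.
// Assumes the counter is labelled with that literal text.
function hasUsedBy(html) {
  return /Used by/i.test(html);
}

// Hypothetical helper: fetch a page's raw HTML. No JavaScript on the page
// is executed, so only server-rendered markup is returned.
function fetchHtml(url) {
  return new Promise((resolve, reject) => {
    https
      .get(url, { headers: { 'User-Agent': 'used-by-check' } }, (res) => {
        let body = '';
        res.setEncoding('utf8');
        res.on('data', (chunk) => { body += chunk; });
        res.on('end', () => resolve(body));
      })
      .on('error', reject);
  });
}

// Usage (makes a network request, so it is left commented out here):
// fetchHtml('https://github.com/dwyl/decache')
//   .then((html) => console.log('"Used by" in raw HTML:', hasUsedBy(html)))
//   .catch((err) => console.error('request failed:', err.message));
```

Running the same text check against `document.body.innerHTML` in the browser console gives a comparison between what the browser ends up with and what node.js receives.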

Try it for yourself: disable JS in your browser and refresh the page.

No more "Used by" ...

We could get around this by using a headless browser (e.g. PhantomJS) and waiting for the JS to finish rendering. However, I checked and it won't work, because we would need a logged-in session in PhantomJS (the counter is not shown if you visit the page in incognito mode).

nelsonic commented 5 years ago

Suggestions very much welcome for how to overcome this ... 👍