syrusakbary / gdom

DOM Traversing and Scraping using GraphQL
http://gdom.graphene-python.org
BSD 3-Clause "New" or "Revised" License
1.24k stars, 41 forks

Demo only scrapes hackernews? #10

Open Joshfindit opened 6 years ago

Joshfindit commented 6 years ago

Just tried exploring the queries with Tesco as per https://news.ycombinator.com/item?id=11180732 and the results are empty. Also tried other sites besides hackernews, and could not get results.

It's understandable if it's whitelist-only, but there's no documentation about it which leads users (me) to think that either we're doing it wrong or it's broken.

mrVanDalo commented 6 years ago

This is most likely because these sites use something like React to render their pages, which means the HTML is rendered by your browser. But if a page is delivered as HTML (not as JS which generates the HTML), it works for me.
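One rough way to check this guess is to look at the raw HTML the server returns and see whether it holds any visible text, or only an empty shell plus script tags. The sketch below is a heuristic, not part of gdom: the names `VisibleTextCounter` and `looks_client_rendered`, the 20-character threshold, and the two sample documents are all assumptions made up for illustration.

```python
from html.parser import HTMLParser


class VisibleTextCounter(HTMLParser):
    """Counts visible text characters and <script> tags in raw HTML."""

    # Content inside these tags is never shown as page text.
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.text_chars = 0
        self.script_tags = 0

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.script_tags += 1
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only count text that sits outside script/style elements.
        if self._skip_depth == 0:
            self.text_chars += len(data.strip())


def looks_client_rendered(html, min_text_chars=20):
    """Heuristic: scripts are present but almost no visible text is."""
    counter = VisibleTextCounter()
    counter.feed(html)
    counter.close()
    return counter.script_tags > 0 and counter.text_chars < min_text_chars


# Hypothetical sample documents (not taken from any real site).
STATIC_PAGE = (
    "<html><body><table><tr class='athing'><td class='title'>"
    "<a href='https://example.com'>An example story title</a>"
    "</td></tr></table></body></html>"
)
JS_SHELL = (
    "<html><body><div id='root'></div>"
    "<script src='/bundle.js'></script></body></html>"
)

print(looks_client_rendered(STATIC_PAGE))
print(looks_client_rendered(JS_SHELL))
```

If the raw HTML of a site looks like the second sample, CSS selectors have nothing to match until a browser runs the scripts, which would explain empty results. A page can still fail for other reasons, so treat this only as a first check.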

mrVanDalo commented 6 years ago

Hmm, it seems the page is not generated by JS. Maybe there is whitelisting (though I don't think so). Try running your own instance: I created a Docker container a while ago that should make a test run very easy.

https://hub.docker.com/r/palo/gdom/