edsu / etudier

Extract a citation network from Google Scholar

Error handling GS HTML, likely due to "Related searches" section always after 4th result #14

Closed wigasper closed 4 years ago

wigasper commented 4 years ago

Getting:

... line 146, in get_metadata
    title = e.find('.gs_rt .gs_ctu', first=True).text
AttributeError: 'NoneType' object has no attribute 'text'

Please excuse any typos: I am retyping this by hand because the error occurred on another machine. I think the cause is that Google Scholar often inserts a "Related searches" block after the fourth result, so that entry has no title element to read, but I could be entirely wrong. It does seem to happen every time, right after the fourth result on the page.
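To illustrate the kind of guard that might avoid this, here is a minimal sketch, not the project's actual code. It assumes only what the traceback shows: each entry has a `find(selector, first=True)` method that returns `None` when nothing matches. `FakeElement`, `safe_text`, and `extract_titles` are hypothetical names made up for this example, and `FakeElement` is a stand-in so the snippet runs without fetching anything from Google Scholar.

```python
from typing import Optional

SEL = '.gs_rt .gs_ctu'  # selector copied from the traceback


class FakeElement:
    """Hypothetical stand-in for a parsed HTML element.

    It only mimics find(selector, first=True) returning None on no match.
    """

    def __init__(self, text: str = "", children: Optional[dict] = None):
        self.text = text
        self._children = children or {}

    def find(self, selector: str, first: bool = False):
        match = self._children.get(selector)
        if first:
            return match
        return [] if match is None else [match]


def safe_text(element, selector: str) -> Optional[str]:
    """Return the text of the first match, or None when nothing matches."""
    found = element.find(selector, first=True)
    return found.text if found is not None else None


def extract_titles(entries):
    """Collect titles, skipping entries that have no title element."""
    titles = []
    for e in entries:
        title = safe_text(e, SEL)
        if title is None:
            # Possibly a "Related searches" block; skip it rather than
            # raising AttributeError on None.
            continue
        titles.append(title)
    return titles


results = [
    FakeElement(children={SEL: FakeElement("Paper A")}),
    FakeElement(),  # simulates an entry without the title element
    FakeElement(children={SEL: FakeElement("Paper B")}),
]
print(extract_titles(results))  # ['Paper A', 'Paper B']
```

Whether skipping the entry is the right behaviour, as opposed to logging it or using a narrower selector for real results, is a judgment call for the maintainer.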

If I find some time, I may get around to fixing this and submit a pull request.

Also, would you consider adding a permissive license?