scholarly-python-package / scholarly

Retrieve author and publication information from Google Scholar in a friendly, Pythonic way without having to worry about CAPTCHAs!
https://scholarly.readthedocs.io/
The Unlicense
1.37k stars, 298 forks

Multiple Versions of a Paper #427

Closed: picheny-nyu closed this issue 2 years ago

picheny-nyu commented 2 years ago

It looks like fetching multiple versions of a paper was supported a few years ago, but a comment in one of the posts suggests this functionality is now gone. Can someone tell me what the current status is?

Thanks, Michael Picheny

arunkannawadi commented 2 years ago

You should still be able to fetch them. See the doc here: https://github.com/scholarly-python-package/scholarly/blob/de817e715a06e32aa4a938408fa6763c375ef27c/scholarly/data_types.py#L170-L185

arunkannawadi commented 2 years ago

We don't have a built-in method to do that. But you can fetch the cites_id, construct the URL as

f"/scholar?oi=bibs&hl=en&cluster={cites_id}"

and pass it as the input to the scholarly.search_pubs_custom_url method. You'll need a good proxy setup to fetch these results, though.
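A minimal sketch of those steps, not an official recipe. The helper names build_versions_urls and fetch_versions are hypothetical, and it assumes cites_id may be either a single string or a list of strings, in which case one URL is built per ID:

```python
from itertools import islice


def build_versions_urls(cites_id):
    """Build one relative 'all versions' URL per cluster ID (hypothetical helper)."""
    # Assumption: cites_id is either a single string or a list/tuple of strings.
    ids = cites_id if isinstance(cites_id, (list, tuple)) else [cites_id]
    return [f"/scholar?oi=bibs&hl=en&cluster={cid}" for cid in ids]


def fetch_versions(cites_id, limit=10):
    """Return up to `limit` results per URL; needs network access and likely a proxy."""
    # Imported lazily so that build_versions_urls has no third-party dependency.
    from scholarly import scholarly

    versions = []
    for url in build_versions_urls(cites_id):
        results = scholarly.search_pubs_custom_url(url)
        versions.extend(islice(results, limit))
    return versions
```

Calling fetch_versions(pub["cites_id"]) on a publication taken from an author profile should then return the entries Google Scholar lists under that cluster, provided the request is not blocked.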

picheny-nyu commented 2 years ago

Thanks! What do you mean by a "good" proxy setup? I have been using FreeProxy (I don't have any of the others).

arunkannawadi commented 2 years ago

I meant it in the sense that looking for all versions of a paper is one of those things that Google Scholar actively tries to block. FreeProxy should probably suffice, but is not always reliable. If that doesn't work, you might want to try it with ScraperAPI or Luminati.
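One way to arrange that fallback is sketched below. The setup_proxy helper is hypothetical; it only assumes that the object passed in behaves like scholarly's ProxyGenerator, whose FreeProxies() and ScraperAPI(key) methods report success as a boolean:

```python
def setup_proxy(pg, scraperapi_key=None):
    """Try free proxies first, then ScraperAPI if a key is supplied.

    `pg` is expected to behave like scholarly's ProxyGenerator.
    Returns the name of the backend that succeeded, or None.
    """
    if pg.FreeProxies():
        return "FreeProxies"
    if scraperapi_key and pg.ScraperAPI(scraperapi_key):
        return "ScraperAPI"
    return None


def main(scraperapi_key=None):
    """Wire the chosen proxy into scholarly; requires network access."""
    from scholarly import scholarly, ProxyGenerator

    pg = ProxyGenerator()
    backend = setup_proxy(pg, scraperapi_key)
    if backend is None:
        raise RuntimeError("no proxy could be configured")
    scholarly.use_proxy(pg)
    print(f"Using {backend}")
```

Run main() once before issuing any searches, passing your own ScraperAPI key if you have one. Luminati could be added as a further fallback in the same way.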

picheny-nyu commented 2 years ago

Thank you. One more question: I don't see how to get the "cites_id" from the API. Could you point me to an example?

Thanks, Michael

arunkannawadi commented 2 years ago

If you are starting from your profile, you can follow this example in the quickstart. The pub variable at the end of the example should have a cites_id key. https://scholarly.readthedocs.io/en/stable/quickstart.html#example
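A rough sketch along the lines of that quickstart example. The helpers collect_cites_ids and cites_ids_for_author are hypothetical names, and the sketch assumes each publication is a dict with an optional cites_id key and a bib dict holding the title:

```python
def collect_cites_ids(publications):
    """Map each publication title to its cites_id, skipping entries without one."""
    found = {}
    for pub in publications:
        cites_id = pub.get("cites_id")
        if cites_id:
            # Assumption: the title lives under pub["bib"]["title"].
            title = pub.get("bib", {}).get("title", "<untitled>")
            found[title] = cites_id
    return found


def cites_ids_for_author(name):
    """Look up the first matching author profile; requires network access."""
    from scholarly import scholarly

    author = next(scholarly.search_author(name))
    author = scholarly.fill(author, sections=["publications"])
    return collect_cites_ids(author["publications"])
```

Publications with no citations may lack a cites_id altogether, which is why the helper skips them rather than raising a KeyError.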

arunkannawadi commented 2 years ago

@picheny-nyu Can we close this issue?

picheny-nyu commented 2 years ago

Yes we may!