scholarly-python-package / scholarly

Retrieve author and publication information from Google Scholar in a friendly, Pythonic way without having to worry about CAPTCHAs!
https://scholarly.readthedocs.io/
The Unlicense

Convert Publication Source #521

Closed DLu closed 11 months ago

DLu commented 11 months ago

[my apologies if this is covered elsewhere...my first search didn't come up with anything]

What feature would you like to request? The ability to convert from one publication source to another.

Is your feature request related to a problem? Please describe. I would like to be able to

However, searching returns a pub with source PUBLICATION_SEARCH_SNIPPET and it needs to be AUTHOR_PUBLICATION_ENTRY.

Describe the solution you'd like Externally, I have code that does this by

This seems like it would help toward doing away with the publication source altogether, so that you can "fill" the various sections.

Describe alternatives you've considered It can be done externally, but the current implementation is a little confusing.

Do you plan on contributing? Your response below will clarify if this is something that the maintainers can expect you to work on or not.

arunkannawadi commented 11 months ago

This is not going to be guaranteed in general, is it? a) Not all authors have a public Google Scholar profile. b) GS links the Google Scholar profiles of only the first few authors of a paper, not of every author.

I'd be very happy to consider a PR from you if you could identify me (or any other GS profile) as an author on the first paper on this search result, and get the cites_per_year from my (or any other) profile.

arunkannawadi commented 11 months ago

> However, searching returns a pub with source PUBLICATION_SEARCH_SNIPPET and it needs to be AUTHOR_PUBLICATION_ENTRY.

As you've found out, publications in GS have different properties based on which database the result is fetched from. The Source attribute is meant to denote that, and to indicate what other attributes to expect. In general, we leave it to applications that build around scholarly to do the conversions that you mentioned here. This is because there are plenty of exceptions to the norm, and it's difficult to handle them all in general without knowing what you as a user are looking for. For example, if none of the authors of a paper have a public Google Scholar profile, there's no way to convert a result from PUBLICATION_SEARCH_SNIPPET to AUTHOR_PUBLICATION_ENTRY. So while the latter is usually the more useful one, the former will have to stay because it's the more generic one.
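To make the constraint above concrete, here is a minimal sketch of the kind of external conversion an application could attempt. It is not part of scholarly. It works on plain dicts and assumes a search result carries a `source` value, a `bib` dict with a `title`, and an `author_id` list in which authors without a public profile appear as empty strings. The helper names `source_name` and `find_author_entry`, and the `fetch_author_pubs` callback, are hypothetical and introduced only for illustration.

```python
from typing import Callable, Iterable, Optional

SNIPPET = "PUBLICATION_SEARCH_SNIPPET"
ENTRY = "AUTHOR_PUBLICATION_ENTRY"


def source_name(pub: dict) -> Optional[str]:
    """Return the source as a string, whether it is stored as an enum member or a str."""
    src = pub.get("source")
    return getattr(src, "name", src)


def _title_key(pub: dict) -> str:
    """Normalize the title for a naive exact-match comparison (an assumption)."""
    return pub.get("bib", {}).get("title", "").strip().lower()


def find_author_entry(
    pub: dict,
    fetch_author_pubs: Callable[[str], Iterable[dict]],
) -> Optional[dict]:
    """Try to find the author-profile entry that matches a search-snippet result.

    fetch_author_pubs is a hypothetical callback that, given a Google Scholar
    author id, yields that author's publication dicts.
    Returns None when no author has a linked profile or no title matches.
    """
    if source_name(pub) == ENTRY:
        return pub  # already in the desired form

    wanted = _title_key(pub)
    for author_id in pub.get("author_id", []):
        if not author_id:
            continue  # this author has no linked public profile
        for candidate in fetch_author_pubs(author_id):
            if _title_key(candidate) == wanted:
                return candidate
    return None  # conversion is not possible for this result


# Example with made-up data and a stub in place of real network calls.
fake_profiles = {
    "abc123": [
        {"source": ENTRY, "bib": {"title": "A Sample Paper"}, "num_citations": 7},
    ],
}
snippet = {
    "source": SNIPPET,
    "bib": {"title": "a sample paper "},
    "author_id": ["", "abc123"],
}
no_profiles = {"source": SNIPPET, "bib": {"title": "Orphan"}, "author_id": ["", ""]}

converted = find_author_entry(snippet, lambda aid: fake_profiles.get(aid, []))
unconverted = find_author_entry(no_profiles, lambda aid: fake_profiles.get(aid, []))
```

With the real library, `fetch_author_pubs` might wrap `scholarly.search_author_id` followed by `scholarly.fill(author, sections=['publications'])`, but the title matching remains a heuristic, and the `None` branch is exactly the case described above where no author has a public profile.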

DLu commented 11 months ago

Ah, I see. I did not realize the operative property was having a GS profile. That makes more sense. Thanks!