allenai / s2-folks

Public space for the user community of Semantic Scholar APIs to share scripts, report issues, and make suggestions.

Development of author bulk search feature, similar to paper bulk search #149

Closed: RobotPsychologist closed this issue 7 months ago

RobotPsychologist commented 8 months ago

Is your feature request related to a problem? Please describe. Not exactly a problem, but I am interested in studying authorship patterns across fields: which authors write in multiple fields, how often they publish, and how they rank when filtered by h-index and citation count. A way to bulk download authors based on a condition would allow me to then search papers by the most prominent authors by year, by field, etc.

Describe the solution you'd like: Essentially the same thing as Paper Bulk Search, but for authors.

Describe alternatives you've considered: Right now I'm just focusing on the analysis of papers by field. I've considered running paper bulk searches and inferring the most influential authors from the most influential papers, but this is likely to miss some authors.

Additional context: N/A

cfiorelli commented 8 months ago

Outstanding action item / notes for me from connecting with RK on this: a user is asking for bulk author search. Thoughts/timeline?

ericchagnon15 commented 8 months ago

Just to clarify, is there currently a way to find all the papers from a given author? Or is that what is under development?

cfiorelli commented 8 months ago

@ericchagnon15 (+ @RobotPsychologist, upon reread, maybe this solves your ask too?)

I think a great solution might be leveraging the datasets. I've set bold text on the papers and authors items to show what I'm referring to.

Latest Release ID: 2023-10-24

Available datasets in the latest release:

- Name: authors
  Description: The core attributes of an author (name, affiliation, paper count, etc.). Authors have an "authorId" field, which can be joined to the "authorId" field of the members of a paper's "authors" field. 75M records in 30 files of about 100MB each.

- Name: papers
  Description: The core attributes of a paper (title, authors, date, etc.). 200M records in 30 files of about 1.5GB each.
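
To make the join described above concrete, here is a rough Python sketch of how the two datasets might be combined once downloaded. It assumes the files are JSON Lines with one record per line. Only the "authorId" join key and the paper's "authors" list come from the dataset descriptions above; the "hIndex", "citationCount", and "title" field names, the sample records, and all function names are assumptions for illustration and should be checked against the actual release schema.

```python
import json
from collections import defaultdict


def load_jsonl(lines):
    """Parse an iterable of JSON Lines strings into dicts, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def select_authors(authors, min_hindex=0, min_citations=0):
    """Return {authorId: record} for authors meeting both thresholds.

    'hIndex' and 'citationCount' are assumed field names; missing or null
    values are treated as 0.
    """
    return {
        a["authorId"]: a
        for a in authors
        if (a.get("hIndex") or 0) >= min_hindex
        and (a.get("citationCount") or 0) >= min_citations
    }


def papers_by_author(papers, author_ids):
    """Group paper titles under each selected authorId (the join step)."""
    wanted = set(author_ids)
    grouped = defaultdict(list)
    for paper in papers:
        for member in paper.get("authors", []):
            if member.get("authorId") in wanted:
                grouped[member["authorId"]].append(paper["title"])
    return dict(grouped)


# Tiny made-up records standing in for lines read from the dataset files.
author_lines = [
    '{"authorId": "1", "name": "A. Example", "hIndex": 40, "citationCount": 9000}',
    '{"authorId": "2", "name": "B. Example", "hIndex": 3, "citationCount": 50}',
]
paper_lines = [
    '{"title": "Paper X", "authors": [{"authorId": "1"}, {"authorId": "2"}]}',
    '{"title": "Paper Y", "authors": [{"authorId": "2"}]}',
    '{"title": "Paper Z", "authors": [{"authorId": "1"}]}',
]

selected = select_authors(load_jsonl(author_lines), min_hindex=10)
result = papers_by_author(load_jsonl(paper_lines), selected)
print(result)
```

With files of the sizes listed above, you would probably want to stream each file line by line rather than load everything into memory, keeping only the selected author IDs in a set while scanning the papers files.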