allenai / s2orc

S2ORC: The Semantic Scholar Open Research Corpus: https://www.aclweb.org/anthology/2020.acl-main.447/

Unique authors' identifiers #39

Closed by AmenRa 2 years ago

AmenRa commented 2 years ago

Hi and thanks for the dataset!

Is there any plan to release an updated version of the dataset with unique authors' identifiers?

Right now, building an authorship or co-authorship graph from the available data unfortunately leads to severe noise issues.

As far as I understand, the dataset is somewhat linked to the Semantic Scholar corpus, which should make this doable for you.

Any help on this would be great.

Best,

Elias

lucylw commented 2 years ago

Hi Elias,

We do not currently have plans to incorporate author IDs into S2ORC. You can use the Semantic Scholar API to retrieve associated author IDs from Semantic Scholar.
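For readers following this suggestion, here is a minimal sketch of what such a lookup might look like. It assumes the public Semantic Scholar Graph API paper endpoint with `fields=authors`, and that an S2ORC paper ID can be passed as a `CorpusId:<id>` reference; neither detail is stated in this thread, so check both against the current API documentation. The function names are hypothetical and exist only for illustration.

```python
import json
import urllib.parse
import urllib.request
from itertools import combinations

# Assumed endpoint: Semantic Scholar Graph API, paper lookup.
API_ROOT = "https://api.semanticscholar.org/graph/v1/paper/"


def build_author_url(paper_ref):
    """Build the lookup URL for a paper reference such as 'CorpusId:123'."""
    return API_ROOT + urllib.parse.quote(paper_ref, safe=":/") + "?fields=authors"


def extract_author_ids(payload):
    """Return the author IDs from a paper payload, skipping authors without an ID."""
    return [a["authorId"] for a in payload.get("authors", []) if a.get("authorId")]


def coauthor_edges(author_ids):
    """Return undirected co-authorship edges as sorted, de-duplicated ID pairs."""
    return list(combinations(sorted(set(author_ids)), 2))


def fetch_author_ids(paper_ref, timeout=10.0):
    """Query the API for one paper (network call; subject to rate limits)."""
    with urllib.request.urlopen(build_author_url(paper_ref), timeout=timeout) as resp:
        return extract_author_ids(json.load(resp))
```

Calling `coauthor_edges(fetch_author_ids("CorpusId:<id>"))` for each paper would yield edges keyed on author IDs rather than name strings. For a large corpus, unauthenticated requests are rate limited, so a batch endpoint or an API key would likely be needed.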

-Lucy

AmenRa commented 2 years ago

Hi Lucy and thanks for your answer. I will take a look at the API.