allenai / scidocs

Dataset accompanying the SPECTER model

MeSH papers not found in paper_metadata_mag_mesh.json #19

Closed: yuzhimanhua closed this issue 3 years ago

yuzhimanhua commented 3 years ago

Hi, thank you for sharing the datasets and evaluation scripts!

I have a question about selecting the MeSH dataset (23K papers, 11 classes) from data/paper_metadata_mag_mesh.json. Specifically, I use the paper ids in data/mesh/train.csv to look up the papers in data/paper_metadata_mag_mesh.json. However, some ids cannot be found (e.g., the first paper in train.csv, dda7899c46b7764ed16ab3092f4f9476a9cecedf). Am I missing something? Thanks!

sergeyf commented 3 years ago

Hmm, I see it in there:

In [4]: data_paths = DataPaths()

In [5]: with open(data_paths.paper_metadata_mag_mesh, 'r') as f: D = json.load(f)

In [6]: 'dda7899c46b7764ed16ab3092f4f9476a9cecedf' in D
Out[6]: True
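
The same membership check can be run over every id in the CSV at once. This is a minimal sketch, not part of the scidocs code: the helper name find_missing_ids is hypothetical, the id column name "pid" is an assumption about the layout of train.csv, and the metadata file is assumed to load as a dict keyed by paper id (as the check above suggests).

```python
import csv
import json


def find_missing_ids(csv_path, metadata_path, id_column="pid"):
    """Return the ids listed in the CSV that are not keys of the metadata JSON.

    Hypothetical helper: `id_column` defaults to "pid", which is an assumed
    column name, so pass the actual header name if it differs.
    """
    with open(metadata_path, "r") as f:
        metadata = json.load(f)  # assumed to be a dict keyed by paper id

    with open(csv_path, "r", newline="") as f:
        reader = csv.DictReader(f)
        # Strip whitespace so stray spaces do not produce false misses.
        ids = [row[id_column].strip() for row in reader]

    # Dict membership tests the keys, mirroring `paper_id in D` above.
    return [pid for pid in ids if pid not in metadata]
```

Calling find_missing_ids("data/mesh/train.csv", "data/paper_metadata_mag_mesh.json") should return an empty list if every id in the CSV is present in the metadata. A non-empty result would list exactly the ids worth inspecting.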

yuzhimanhua commented 3 years ago

My bad. Thank you for your quick response.

sergeyf commented 3 years ago

No problem!