allenai / scidocs

Dataset accompanying the SPECTER model

Some questions about a subdataset #22

Closed ZzyChris97 closed 2 years ago

ZzyChris97 commented 2 years ago

The total number of papers for recom is about 36k, which is too big for me. I only need about 10% to 20% of them. Choosing papers at random does not seem like a good idea. How can I split the data while keeping citations wherever possible? Do you have any suggestions?

Looking forward to your reply and thanks a lot.

sergeyf commented 2 years ago

Sorry to say that we've never considered this use case. SciDocs is intended to be used as is; otherwise, your results are not comparable to other published results.
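
For private experiments where comparability with published numbers does not matter, one possible way to take a 10% to 20% subset while keeping many citation links is snowball (breadth-first) sampling over the citation graph. The following is only a rough sketch, not something SciDocs provides or endorses. The `citations` dict (paper ID mapped to the list of IDs it cites) and the function names `snowball_sample` and `kept_citations` are assumptions made for illustration and do not reflect the actual SciDocs file layout.

```python
import random
from collections import deque


def snowball_sample(citations, fraction, seed=0):
    """Pick roughly `fraction` of the papers by breadth-first expansion
    over the citation graph, so sampled papers tend to cite each other.

    `citations` is a hypothetical dict: paper ID -> list of cited paper IDs.
    """
    # Treat citations as undirected so citing and cited papers are both neighbors.
    neighbors = {}
    for paper, cited in citations.items():
        neighbors.setdefault(paper, set())
        for other in cited:
            neighbors[paper].add(other)
            neighbors.setdefault(other, set()).add(paper)

    rng = random.Random(seed)
    papers = sorted(neighbors)  # sort first so the shuffle is reproducible
    rng.shuffle(papers)
    target = max(1, round(fraction * len(papers)))

    sampled = set()
    queue = deque()
    # Start from a random paper; restart from another one if a component runs out.
    for start in papers:
        if len(sampled) >= target:
            break
        if start in sampled:
            continue
        sampled.add(start)
        queue.append(start)
        while queue and len(sampled) < target:
            current = queue.popleft()
            for nxt in sorted(neighbors[current]):
                if nxt not in sampled:
                    sampled.add(nxt)
                    queue.append(nxt)
                    if len(sampled) >= target:
                        break
        queue.clear()
    return sampled


def kept_citations(citations, sampled):
    """Return the (citing, cited) pairs whose endpoints both survived sampling."""
    return [(p, c) for p, cited in citations.items() if p in sampled
            for c in cited if c in sampled]
```

Calling `snowball_sample(citations, 0.15)` would return a set of paper IDs covering about 15% of the graph, and `kept_citations` reports which links remain inside that subset. Because neighbors are added before unrelated papers, more citation pairs usually survive than with uniform random sampling, although the subset is biased toward densely connected regions.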