allenai / ForeCite

Apache License 2.0

citing_ids vs references #9

Closed: slvcsl closed this issue 3 years ago

slvcsl commented 4 years ago

Hi, I'm trying to create a (patent) dataset mirroring the script in generate_dataset.py. However, I'm having trouble understanding how citations/references are managed.

In particular, in full_data_collection_worker, what's the difference between citing_ids and references? My understanding is that, given a paper P, citing_ids contains the ids of all papers that cite P, while references contains the ids of all papers that P cites. Am I correct?
Thanks :)

dakinggg commented 3 years ago

Hi, apologies that I am just now seeing this! Your interpretation is correct! Just to make it crystal clear, if paper 1 cites paper 2, paper 1 would appear in paper 2's citing_ids, and paper 2 would appear in paper 1's references.
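
To illustrate that relationship, here is a minimal sketch. It is not taken from generate_dataset.py: the dictionary layout, the paper ids, and the helper name build_citing_ids are assumptions made for the example. It only shows that citing_ids is the inverse of references.

```python
from collections import defaultdict


def build_citing_ids(references):
    """Invert a {paper_id: [referenced ids]} mapping into {paper_id: [citing ids]}.

    Hypothetical helper for illustration; ForeCite's own data layout may differ.
    """
    citing_ids = defaultdict(list)
    for paper_id, referenced_ids in references.items():
        for referenced_id in referenced_ids:
            # paper_id cites referenced_id, so paper_id belongs in
            # referenced_id's citing_ids.
            citing_ids[referenced_id].append(paper_id)
    return dict(citing_ids)


# paper_1 cites paper_2; paper_2 cites nothing.
references = {"paper_1": ["paper_2"], "paper_2": []}
citing_ids = build_citing_ids(references)
print(citing_ids)  # {'paper_2': ['paper_1']}
```

For a patent dataset, the same inversion should apply if each patent record lists the ids of the patents it cites.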