allenai / s2orc

S2ORC: The Semantic Scholar Open Research Corpus: https://www.aclweb.org/anthology/2020.acl-main.447/

Citation mention linked to bib entry without key #21

Open kyleclo opened 4 years ago

kyleclo commented 4 years ago

See paper_id=15213255:

{'section': 'Automating the search',
 'text': '(25) The judge announced that the defendant was guilty. Many classes of verbs have already been identified and are incorporated into the system (Nairn et al., 2006) : verbs relating to speech (e.g., say, report, etc.), implicative verbs such as manage and fail (Karttunen, 2007) , and factive verbs (e.g. agree, realize, consider) (Vendler, 1967; Kiparsky and Kiparsky, 1971) , to name a few. Many adjectives have also been added to the system, including ones taking to and that complements. 10 As with the complement-taking nouns, a significant part of the effort in incorporating the complement-taking adjectives into the system was identifying which adjectives license complements. The adverbs have not been explored in as much depth.',
 'cite_spans': [{'start': 144,
   'end': 164,
   'text': '(Nairn et al., 2006)',
   'ref_id': 'BIBREF16',
   'arxiv_id': None,
   'paper_id': '525764'},
  {'start': 261,
   'end': 278,
   'text': '(Karttunen, 2007)',
   'ref_id': 'BIBREF10',
   'arxiv_id': None},
  {'start': 347,
   'end': 375,
   'text': 'Kiparsky and Kiparsky, 1971)',
   'ref_id': 'BIBREF11',
   'arxiv_id': None}],

The last cite span links to BIBREF11, yet that key is missing from the bib entries:

citing_paper_dict['pdf_parse']['bib_entries'].keys()
dict_keys(['BIBREF0', 'BIBREF1', 'BIBREF2', 'BIBREF3', 'BIBREF4', 'BIBREF5', 'BIBREF6', 'BIBREF7', 'BIBREF8', 'BIBREF9', 'BIBREF10', 'BIBREF12', 'BIBREF13', 'BIBREF14', 'BIBREF15', 'BIBREF16', 'BIBREF17'])

Will need to double-check where the BIBREF11 entry went.
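
A rough sketch of how one might scan a paper for this kind of dangling link, i.e. a cite span whose ref_id has no matching key in bib_entries. The function name `find_dangling_cite_spans` is hypothetical, and the sketch assumes the paragraph dicts shown above live in a list under `pdf_parse['body_text']`; adjust if the paragraphs sit elsewhere. The toy paper only mirrors the values from this issue and is not the real record.

```python
def find_dangling_cite_spans(paper):
    """Return (section, span text, ref_id) for every cite span whose
    ref_id is not a key of pdf_parse['bib_entries'].

    Hypothetical helper, not part of the S2ORC codebase.
    """
    pdf_parse = paper["pdf_parse"]
    bib_keys = set(pdf_parse["bib_entries"])
    dangling = []
    # Assumption: paragraphs with 'cite_spans' are stored under 'body_text'.
    for paragraph in pdf_parse.get("body_text", []):
        for span in paragraph.get("cite_spans", []):
            ref_id = span.get("ref_id")
            # Spans with no ref_id at all are unlinked, not dangling.
            if ref_id is not None and ref_id not in bib_keys:
                dangling.append((paragraph.get("section"), span["text"], ref_id))
    return dangling


# Toy stand-in for paper_id=15213255: BIBREF0..BIBREF17 with BIBREF11 absent.
toy_paper = {
    "pdf_parse": {
        "bib_entries": {f"BIBREF{i}": {} for i in range(18) if i != 11},
        "body_text": [
            {
                "section": "Automating the search",
                "cite_spans": [
                    {"text": "(Nairn et al., 2006)", "ref_id": "BIBREF16"},
                    {"text": "(Karttunen, 2007)", "ref_id": "BIBREF10"},
                    {"text": "Kiparsky and Kiparsky, 1971)", "ref_id": "BIBREF11"},
                ],
            }
        ],
    }
}

for section, text, ref_id in find_dangling_cite_spans(toy_paper):
    print(f"{ref_id} missing from bib_entries: {text!r} in section {section!r}")
```

Running something like this over the whole corpus would show whether the missing key is a one-off or a systematic gap in the bib entry numbering.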