allenai / scholarphi

An interactive PDF reader.
Apache License 2.0

Pipe through mentionless bibs #362

Closed ca16 closed 2 years ago

ca16 commented 2 years ago

Related to https://github.com/allenai/scholar/issues/32733 and https://github.com/allenai/scholar/issues/32730.

This PR adjusts the code that saves citation items so that, for each reference, it also saves a representation of that reference that is independent of its inline mentions (if any). This gives us a representation of references that have no inline mentions, which is the goal.
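To illustrate the idea, here is a minimal sketch of emitting one mention-independent item per reference alongside the per-mention items. The names `CitationItem`, `build_citation_items`, `reference_key` and `mention_id` are hypothetical and chosen for illustration; they are not taken from the scholarphi code.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class CitationItem:
    # Hypothetical shape: the real pipeline's citation items may differ.
    reference_key: str
    mention_id: Optional[str]  # None marks the mention-independent item


def build_citation_items(
    references: List[str], mentions: Dict[str, List[str]]
) -> List[CitationItem]:
    """Emit one mention-independent item per reference, plus one per inline mention."""
    items: List[CitationItem] = []
    for key in references:
        # Always emitted, so references without inline mentions are still represented.
        items.append(CitationItem(reference_key=key, mention_id=None))
        for mention_id in mentions.get(key, []):
            items.append(CitationItem(reference_key=key, mention_id=mention_id))
    return items


# "lee2019" has no inline mentions but still gets an item.
items = build_citation_items(
    references=["smith2020", "lee2019"],
    mentions={"smith2020": ["m1", "m2"]},
)
```

In this sketch a reference with no inline mentions yields exactly one item, which is the case the PR is meant to cover.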

I've made it so that these additional items are saved only in the file version of the output, not in the scholarphi DB, because: a) We want this extra info for the new reader, and the new reader only looks at the file version of the output. b) I think confirming that the extra items don't adversely affect anything that uses the DB version of the output would be a non-trivial amount more work than restricting them to the file output as this PR does, and I don't think that restriction introduces much additional complexity.
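The file-only behaviour can be sketched as a filter applied before the DB write. This is an assumption-laden illustration, not the pipeline's actual code: `split_for_outputs` and the `mention_id` field are hypothetical names, and items are modelled as plain dicts.

```python
import json
from typing import Dict, List, Optional

Item = Dict[str, Optional[str]]


def split_for_outputs(items: List[Item]) -> Dict[str, List[Item]]:
    """Route items to the two outputs (hypothetical helper).

    The file output receives every item; the DB output receives only
    per-mention items, so DB consumers see the same data as before.
    """
    file_items = list(items)
    db_items = [item for item in items if item.get("mention_id") is not None]
    return {"file": file_items, "db": db_items}


example_items: List[Item] = [
    {"reference_key": "smith2020", "mention_id": None},  # mention-independent
    {"reference_key": "smith2020", "mention_id": "m1"},
    {"reference_key": "lee2019", "mention_id": None},  # no inline mentions
]

outputs = split_for_outputs(example_items)
file_payload = json.dumps(outputs["file"])  # what a file-based reader would load
```

Keeping the filter at the point of writing means the DB path is unchanged, which is why no DB-side verification is needed in this sketch.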

Testing done: I've got corresponding changes for scholarphi-pipeline and s2airs here and here. I ran a couple of papers through the three systems with the changes applied, and put details in https://github.com/allenai/scholar/issues/32733. Summary: for both papers, the output contained the additional items representing references independent of inline mentions. For the one paper where it was expected, this included references that had no inline mentions.