This PR adjusts the code that saves citation items so that, for each reference, it also saves a representation of the reference that is independent of its inline mentions (if any). The goal is to have some representation of references that have no inline mentions.
These additional items are saved only in the file version of the output, not in the scholarphi DB, because:
a) We want this extra info for the new reader, and the new reader only looks at the file version of the output.
b) Confirming that the additional items do not adversely affect consumers of the DB version of the output would, I think, take noticeably more work than restricting them to the file version does here, and I don't think that restriction adds much complexity.
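For reviewers, here is a minimal sketch of the behavior described above. It is not the code in this PR: the names `Reference`, `Mention`, `build_citation_items`, and the `for_file_output` flag, as well as the item fields, are all hypothetical and only illustrate the idea that per-reference items are added for the file output and skipped for the DB output.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Reference:
    # Hypothetical shape of a bibliography entry.
    key: str
    title: str


@dataclass
class Mention:
    # Hypothetical shape of an inline mention of a reference.
    reference_key: str
    page: int


def build_citation_items(
    references: List[Reference],
    mentions: List[Mention],
    for_file_output: bool,
) -> List[Dict[str, object]]:
    """Build the items to save (illustrative only, not the real schema)."""
    # One item per inline mention, written to both outputs.
    items: List[Dict[str, object]] = [
        {"type": "mention", "reference_key": m.reference_key, "page": m.page}
        for m in mentions
    ]
    # One extra item per reference, independent of its mentions.
    # Assumption: these are only emitted for the file version of the output.
    if for_file_output:
        items.extend(
            {"type": "reference", "reference_key": r.key, "title": r.title}
            for r in references
        )
    return items
```

With two references of which only one is mentioned inline, the file output would contain an item for the unmentioned reference, while the DB output would contain mention items only.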
Testing done:
I have corresponding changes for scholarphi-pipeline and s2airs here and here. I ran a couple of papers through the three systems with the changes applied, and put the details in https://github.com/allenai/scholar/issues/32733. Summary: both papers produced the additional items representing references independent of inline mentions. For the one paper where it was expected, this included references that had no inline mentions.
Related to https://github.com/allenai/scholar/issues/32733 and https://github.com/allenai/scholar/issues/32730.