Closed seanmacavaney closed 3 years ago
This plan is a little problematic when multiple datasets have the same citation, such as msmarco-document and msmarco-passage. Maybe it's just up to the user to resolve duplicates on their own? We could put a warning comment at the top.
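To make the duplicate problem concrete, here is a rough sketch of how shared citations could be detected and turned into a warning comment. It assumes the citations are available as a plain dict from dataset ID to BibTeX text; `find_shared_citations`, `warning_comment`, and the placeholder entry are hypothetical names for illustration, not part of ir_datasets.

```python
from collections import defaultdict


def find_shared_citations(citations):
    """Group dataset IDs by BibTeX text, keeping only texts used by 2+ datasets.

    `citations` is assumed to be a dict of dataset ID -> BibTeX string.
    """
    groups = defaultdict(list)
    for dataset_id, bibtex in citations.items():
        # Collapse whitespace so formatting differences do not hide duplicates.
        normalized = " ".join(bibtex.split())
        groups[normalized].append(dataset_id)
    return {text: sorted(ids) for text, ids in groups.items() if len(ids) > 1}


def warning_comment(shared):
    """Render one BibTeX comment line per group of datasets sharing a citation."""
    return "\n".join(
        f"% WARNING: same citation used by: {', '.join(ids)}"
        for ids in shared.values()
    )


if __name__ == "__main__":
    # Placeholder entries, not real references.
    example = {
        "msmarco-document": "@article{Placeholder, title={Shared entry}}",
        "msmarco-passage": "@article{Placeholder,\n  title={Shared entry}}",
        "cord19/trec-covid": "@article{Other, title={Different entry}}",
    }
    print(warning_comment(find_shared_citations(example)))
```

The comment could go at the top of the generated output so the user knows which entries to merge by hand.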
I think a better way to handle citations is to: \cite{x,y} and bibtex versions.
We can leverage the bibtex we have in the documentation to build a master bibtex file. People could import it into their papers and then reference datasets with their ird IDs, e.g., \cite{cord19/trec-covid}. We will need to standardize the naming of the records (right now it normally uses author, year, and a few words from the title). Can we automatically fill these in by leaving a placeholder?
In cases where there are multiple entries for a given dataset, we can use something like \cite{cord19/trec-covid/1,cord19/trec-covid/2}.
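A minimal sketch of the master-file idea, under a few assumptions: the documentation's BibTeX is already collected into a dict from dataset ID to a list of entry strings, and the BibTeX toolchain in use accepts `/` in citation keys. `rekey` and `build_master_bibtex` are hypothetical helpers, and the entries in the example are placeholders rather than real references.

```python
import re

# Matches the entry header up to the first comma, e.g. "@article{SomeKey2020,"
_ENTRY_HEADER = re.compile(r"(@\w+\s*\{)\s*[^,\s]+\s*,")


def rekey(entry, new_key):
    """Replace the citation key of a single BibTeX entry with `new_key`."""
    rekeyed, count = _ENTRY_HEADER.subn(
        lambda m: f"{m.group(1)}{new_key},", entry.strip(), count=1
    )
    if count != 1:
        raise ValueError(f"could not find a citation key in: {entry[:40]!r}")
    return rekeyed


def build_master_bibtex(records):
    """Build one BibTeX string keyed by dataset ID.

    `records` is assumed to be a dict of dataset ID -> list of BibTeX entries.
    A single entry is keyed by the ID itself; multiple entries get /1, /2, ...
    """
    blocks = []
    for dataset_id, entries in sorted(records.items()):
        for i, entry in enumerate(entries, start=1):
            key = dataset_id if len(entries) == 1 else f"{dataset_id}/{i}"
            blocks.append(rekey(entry, key))
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Placeholder entries, not real references.
    records = {
        "cord19/trec-covid": [
            "@article{First2020, title={First placeholder}}",
            "@article{Second2020, title={Second placeholder}}",
        ],
        "msmarco-passage": ["@inproceedings{Only2016, title={Only placeholder}}"],
    }
    print(build_master_bibtex(records))
```

Rewriting the key at build time means the records in the documentation can keep their current author/year/title names, with the dataset ID acting as the placeholder that gets filled in automatically.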