Open esanzgar opened 3 years ago
The `uri` property in the Elasticsearch schema:
is associated with a custom uri analyzer:
I don't know enough about Elasticsearch, but it seems plausible that the analyzer could be modified to treat DOIs (doi:prefix/suffix) in a case-insensitive manner, while treating the rest of the URIs as it currently does.
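To illustrate the intended behaviour, here is a minimal Python sketch of the normalization, not the actual Elasticsearch analyzer. The function name `normalize_uri` and the regular expression are hypothetical; the only assumption is that DOI URIs can be recognised by their `doi:` scheme.

```python
import re

# Hypothetical pattern: any URI that starts with the "doi:" scheme.
_DOI_URI = re.compile(r"^doi:", re.IGNORECASE)


def normalize_uri(uri: str) -> str:
    """Lowercase DOI URIs; return every other URI unchanged."""
    if _DOI_URI.match(uri):
        return uri.lower()
    return uri


print(normalize_uri("doi:10.1097/JOM.0000000000000063"))
print(normalize_uri("https://example.com/Some/Path"))
```

If the same normalization were applied both when indexing and when querying, the two casings of a DOI would map to the same term, while other URIs would keep their current treatment.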
General explanation
API searches for a DOI should return the same number of annotations regardless of its casing.
This issue is related to:
Steps to reproduce
curl "https://hypothes.is/api/search?_separate_replies=false&group=NMb8iAjd&uri=doi:10.1097/JOM.0000000000000063"
curl "https://hypothes.is/api/search?_separate_replies=false&group=NMb8iAjd&uri=doi:10.1097/jom.0000000000000063"
Expected behaviour
Requests 1) and 2) should return the same number of annotations.
Actual behaviour
Request 1) returns 3 annotations, while 2) returns 0 annotations.
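The comparison can also be scripted. The sketch below rebuilds the two search URLs from the reproduction steps and reads the annotation count from the response. It assumes the search response is JSON with a `total` field; the helper names `search_url` and `count_annotations` and the injectable `fetch` parameter are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://hypothes.is/api/search"


def search_url(uri, group="NMb8iAjd"):
    """Build the search URL used in the reproduction steps."""
    params = {"_separate_replies": "false", "group": group, "uri": uri}
    return f"{API}?{urlencode(params)}"


def count_annotations(uri, fetch=None):
    """Return the annotation count reported for `uri`.

    Assumes the response body is JSON with a "total" field.
    `fetch` takes a URL and returns the response body; it defaults to a
    real HTTP request and can be replaced for offline testing.
    """
    if fetch is None:
        fetch = lambda url: urlopen(url).read()
    return json.loads(fetch(search_url(uri)))["total"]
```

Calling `count_annotations` with `doi:10.1097/JOM.0000000000000063` and then with `doi:10.1097/jom.0000000000000063` should give the same number once the issue is fixed.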
Additional details
DOIs are case-insensitive according to the official documentation: https://www.doi.org/doi_handbook/2_Numbering.html#2.4
This is likely affecting ~350 publications in Europe PMC, and many more in PubMed and other editorial sites.
Example in PubMed: https://pubmed.ncbi.nlm.nih.gov/24806729/ It should display one annotation in the public group.