POLDER-Crew / polder-federated-search

A federated search project for POLDER.
BSD 3-Clause "New" or "Revised" License

Things you might do to find a DOI break both SPARQL and Solr searches #170

Open yemoski opened 1 year ago

yemoski commented 1 year ago
  1. Querying by just the DOI fragment of the URL breaks SPARQL. In this case, the query was 10.25976/a7r5-rv73, and I think the forward slash is the culprit. We probably need to escape forward slashes: Query evaluation error: com.ontotext.trree.sdk.InternalServerErrorException: Cannot parse Lucene query [10.25976/a7r5-rv73]

  2. Using a whole DOI URL breaks Solr. You can see that the query gets URL-encoded, but something is still causing a problem: 400 Client Error: Bad Request for url: https://search.dataone.org/cn/v2/query/solr/?start=0&fq=(northBoundCoord:%5B50%20TO%20*%5D%20OR%20southBoundCoord:%5B*%20TO%20-50%5D)%20AND%20-obsoletedBy:*&q=https%3A//doi.org/10.25976/a7r5-rv73&fq=(beginDate:%5B*%20TO%20NOW%5D%20AND%20endDate:%5B*%20TO%20NOW%5D)&rows=50&wt=json&fl=*,score

nein09 commented 1 year ago

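Point 1: Lucene's query parser uses forward slashes to delimit regular expressions, so it tries to read the part of the DOI after the slash as a regex: https://stackoverflow.com/questions/17798300/lucene-queryparser-with-in-query-criteria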
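
One possible fix, as a rough sketch rather than what this project actually does: backslash-escape the characters that the Lucene classic query parser treats as syntax before the user's text goes into the query. The helper name `escape_lucene` is hypothetical, and the character set is the one documented for Lucene's query parser. Whether the GraphDB Lucene connector and the DataONE Solr endpoint both accept this escaping is an assumption that would need testing.

```python
import re

# Characters the Lucene classic query parser treats as syntax.
# The forward slash is included because it delimits regex queries.
_LUCENE_SPECIAL = re.compile(r'[+\-!(){}\[\]^"~*?:\\/&|]')


def escape_lucene(term: str) -> str:
    """Backslash-escape Lucene query syntax so the term is matched literally.

    Hypothetical helper for illustration; not part of the existing codebase.
    """
    # \g<0> re-inserts the matched character after a literal backslash.
    return _LUCENE_SPECIAL.sub(r'\\\g<0>', term)


if __name__ == "__main__":
    print(escape_lucene("10.25976/a7r5-rv73"))
    print(escape_lucene("https://doi.org/10.25976/a7r5-rv73"))
```

Two caveats. If the escaped term is embedded in a SPARQL string literal, the backslashes themselves would probably need doubling, since `\/` is not a valid SPARQL string escape. For the Solr case in point 2, the colon in `https:` is also query syntax (it separates a field name from its value), so escaping only the slashes may not be enough.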