hbz / lobid

Linking Open Bibliographic Data
https://lobid.org/
Eclipse Public License 2.0

GND subject name doesn't favour exact matches #138

Closed dr0i closed 6 years ago

dr0i commented 9 years ago

When querying http://lobid.org/subject?name=verkehr, the first hits are not docs with exact matches but word compounds. The cause is the use of ngram_analyzer in the index profile to enable auto-suggestion. However, a subject name query should return only exact matches or, maybe even nicer, also return word compounds while favouring the exact match.
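To illustrate why n-gram analysis alone cannot favour the exact match, here is a minimal Python sketch. It is not the lobid code: the `ngrams` helper and the gram sizes are assumptions for illustration, and it ignores Elasticsearch's actual scoring. It only shows that every n-gram of the query term also occurs in any compound containing that term.

```python
def ngrams(term, min_gram=2, max_gram=20):
    """Return all character n-grams of term with lengths in [min_gram, max_gram].

    Hypothetical helper; the gram sizes are assumed, not taken from the lobid index profile.
    """
    term = term.lower()
    return {
        term[i:i + n]
        for n in range(min_gram, max_gram + 1)
        for i in range(len(term) - n + 1)
    }


query_tokens = ngrams("verkehr")

# Compare the query's n-grams with those of an exact match and two compounds.
for label in ["Verkehr", "Verkehrsplanung", "Stadtverkehr"]:
    shared = query_tokens & ngrams(label)
    print(f"{label}: {len(shared)} of {len(query_tokens)} query n-grams matched")
```

All three labels share every n-gram of the query, so on token overlap alone the exact match has no advantage over the compounds.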

dr0i commented 9 years ago

With https://github.com/hbz/lobid/commit/2dae1bfdee9c998446327b1bb4ab1e09b9fb77ab the results are now somewhat better (e.g. http://test.lobid.org/subject?name=verkehr vs. http://lobid.org/subject?name=verkeh), but they are not perfect (e.g. http://test.lobid.org/subject?name=stadt vs. http://lobid.org/subject?name=stadt). It seems the only safe way to do it is to have multiple fields, as suggested in https://stackoverflow.com/questions/23686577/favor-exact-matches-over-ngram-matches-in-elasticsearch-when-mapping .
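A rough sketch of the multi-field idea, written as plain Python dicts in Elasticsearch's JSON DSL. Everything named here is an assumption for illustration: the field `name`, its sub-field `name.ngram`, the analyzer and filter names, the gram sizes, and the boost value are not taken from the lobid index profile, and the mapping uses current `text` field syntax rather than whatever the index used at the time.

```python
import json

# Hypothetical index body: one field analyzed two ways via multi-fields.
INDEX_BODY = {
    "settings": {
        "analysis": {
            "filter": {
                # Assumed gram sizes; kept one apart to stay within the default ngram limits.
                "name_ngram_filter": {"type": "ngram", "min_gram": 3, "max_gram": 4}
            },
            "analyzer": {
                "name_ngram_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "name_ngram_filter"],
                }
            },
        }
    },
    "mappings": {
        "properties": {
            "name": {
                "type": "text",
                "analyzer": "standard",  # whole-word matching
                "fields": {
                    # n-gram variant of the same value, for compounds and suggestions
                    "ngram": {"type": "text", "analyzer": "name_ngram_analyzer"}
                },
            }
        }
    },
}


def build_name_query(name, exact_boost=10.0):
    """Build a query that matches n-grams but boosts whole-word hits on the main field."""
    return {
        "query": {
            "bool": {
                "should": [
                    {"match": {"name": {"query": name, "boost": exact_boost}}},
                    {"match": {"name.ngram": {"query": name}}},
                ]
            }
        }
    }


print(json.dumps(build_name_query("verkehr"), indent=2))
```

With this shape, compounds such as "Stadtverkehr" can still match through `name.ngram`, while documents whose name contains "verkehr" as a whole word also match the boosted clause on `name`. Note that "exact" here means a whole-word match under the standard analyzer, not a match on the complete field value.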

dr0i commented 9 years ago

The question is: should we merge this slight, if any, improvement?

acka47 commented 9 years ago

Multiple fields/indices are the way to go, I think. I'd suggest two different query options over the same field. We will have to add a parameter, then. The auto-suggest use case shouldn't be that central, especially as we have seen that it isn't used anyway.

acka47 commented 6 years ago

Fixed in lobid-gnd, see http://lobid.org/gnd/search?q=verkehr. Closing.