Closed acka47 closed 2 years ago
We actually support that via the .ascii subfield (see https://github.com/hbz/lobid-gnd/issues/263 and the 'ASCII' sample at http://lobid.org/gnd/api), e.g.: https://lobid.org/gnd/search?q=preferredName.ascii:gandhari
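For readers unfamiliar with the query syntax, here is a minimal sketch of how such a field-specific search URL could be put together. The endpoint and the preferredName.ascii field are taken from the links above; the helper name build_search_url is made up for illustration, and nothing is actually requested from the server.

```python
from urllib.parse import urlencode

# Search endpoint as linked in this thread.
BASE_URL = "https://lobid.org/gnd/search"


def build_search_url(term, field=None):
    """Return a search URL; if `field` is given, restrict the query to it.

    `build_search_url` is a hypothetical helper, not part of lobid-gnd.
    """
    q = f"{field}:{term}" if field else term
    # Keep ':' unescaped so the URL looks like the examples in this thread.
    return f"{BASE_URL}?{urlencode({'q': q}, safe=':')}"


plain_url = build_search_url("gandhari")
ascii_url = build_search_url("gandhari", field="preferredName.ascii")
print(plain_url)
print(ascii_url)
```

The first URL corresponds to what the search box sends, the second to the explicit .ascii query.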
Yes, I just learned about that. The question was rather whether it would make sense to support this as the default search mode when entering search terms in the search box on the web interface. People who are not familiar with the data model and query syntax may not be aware that this does not work (since it does actually work on the DNB site when searching the GND).
@fsteeg Please check out the Twitter thread at https://twitter.com/felwert/status/1504079048045125636 where we have already discussed this.
We should probably not change the behaviour of the q query as the issue title suggests because this would significantly change the API's behaviour (API break). Perhaps it might make sense to add a parameter (ascii=true or so) that includes ascii name variants in the q search which we set in the lobid-gnd UI as default. What do you think, @fsteeg?
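To make the proposal concrete, here is a rough sketch of what an opt-in flag could do to the q string before it reaches the index. This is purely illustrative: the function expand_query and its include_ascii argument are hypothetical, only preferredName.ascii is taken from the thread, and the actual lobid-gnd code may work quite differently.

```python
def expand_query(q, include_ascii=False):
    """Optionally extend a query string with an ASCII-subfield clause.

    Hypothetical helper: with the flag off, the query is passed through
    unchanged, so existing API clients would see no difference.
    """
    if not include_ascii:
        return q
    # Query-string syntax: match the original query OR the same terms
    # in the ASCII-folded name subfield.
    return f"({q}) OR preferredName.ascii:({q})"


api_default = expand_query("gandhari")
ui_default = expand_query("gandhari", include_ascii=True)
print(api_default)
print(ui_default)
```

With this shape, the UI would set the flag by default while plain API calls keep their current behaviour.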
Yes, that sounds good.
Ok, I updated the issue title and assigned you, @fsteeg. We'll sort out in our fortnightly planning on Monday when to implement this.
We plan to implement this in April.
Hm, looking into implementing this, I'm seeing that the way this currently works is that we first do a search, and then, depending on the requested format, we return that search result in a specific response format, one of them being HTML for the UI. And that's how it should be, right? To implement this, I'd now have to change the way we search depending on the response format. That seems wrong.
Perhaps we should reconsider and include the ascii subfields for all requests as default?
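The separation described above can be sketched roughly like this. It is a toy model with made-up records and function names (search, render, handle_request), not the lobid-gnd code: the search step knows nothing about the response format, so a UI-only search mode would mean passing the format into the search.

```python
import json

# Toy records standing in for the index; not real GND data.
RECORDS = [
    {"id": "toy-1", "preferredName": "Münster"},
    {"id": "toy-2", "preferredName": "Köln"},
]


def search(q):
    """Format-independent search step: same hits for every client."""
    return [r for r in RECORDS if q.lower() in r["preferredName"].lower()]


def render(hits, fmt):
    """Turn the same hits into the requested response format."""
    if fmt == "json":
        return json.dumps(hits, ensure_ascii=False)
    if fmt == "html":
        items = "".join(f"<li>{h['preferredName']}</li>" for h in hits)
        return f"<ul>{items}</ul>"
    raise ValueError(f"unsupported format: {fmt}")


def handle_request(q, fmt):
    # The format is only consulted after the search has finished.
    return render(search(q), fmt)


print(handle_request("münster", "html"))
print(handle_request("münster", "json"))
```

Making the ascii subfields part of every search keeps this two-step structure intact.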
I've deployed that for review on test, e.g. this would then contain the additional hits:
https://test.lobid.org/gnd/search?q=gandhari
On the other hand we'd get lots of false results (here about Münster) for a query like this:
https://test.lobid.org/gnd/search?q=munster
@acka47 What do you think?
The munster query gets the same results as on production (https://lobid.org/gnd/search?q=munster) as we have already implemented german_normalization (see index config). So, by supporting ASCII folding for other languages, we basically add more consistency to the search behaviour. Thus, +1.
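As a rough illustration of what ASCII folding does to a name, here is a small standalone sketch using Python's standard library. It only approximates the idea (decompose, then drop combining marks) and is not the analyzer used in the index; the function name fold_to_ascii is made up, and letters without a decomposition such as ß or ø are left untouched by this simple approach.

```python
import unicodedata


def fold_to_ascii(text):
    """Strip diacritics by decomposing characters and dropping combining marks.

    Simplified stand-in for an ASCII-folding token filter; characters
    without a Unicode decomposition (e.g. 'ß', 'ø') are not converted.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


folded_name = fold_to_ascii("Gāndhārī")
folded_city = fold_to_ascii("Münster")
print(folded_name)
print(folded_city)
```

Indexing such a folded form next to the original is what lets a query without diacritics match both names.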
Originated today in this Twitter thread with @frederik-elwert: https://twitter.com/felwert/status/1504079048045125636
For example, when searching for "gandhari" via UI search box, results should include https://lobid.org/gnd/4669633-7.