PR https://github.com/TranslatorSRI/NameResolution/pull/143 broke our hyphen-based test, which used `beta-secretase`. If we use the query `(beta\-secretase) OR (beta\-secretase*)` (as we did before PR 143), everything works fine. But when we change it to `(beta\-secretase*)` (as we do in PR 143), it stops working, although (and this is the weird bit) `(beta\-secretase)` works just fine (i.e. when `autocomplete=False`).
I'm not sure what's wrong. The one clue I have is that hyphens in the preferred name still work, so this probably has something to do with the `solr.StandardTokenizerFactory` we use to tokenize names, which specifically splits on hyphens. The answer might be to choose a tokenizer better suited to this.
For now, the most comprehensive solution appears to be to replace special characters with spaces when autocomplete is on (i.e. `(beta secretase*)`), but to escape them when it is off (i.e. `(beta\-secretase)`). That's what I've done in PR https://github.com/TranslatorSRI/NameResolution/pull/143, but we should figure out if there's a better solution here.
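
To make the workaround concrete, here is a minimal sketch of the query-building logic described above. It is not the actual code from PR 143: the names `SOLR_SPECIAL_CHARS`, `escape_solr_query`, `replace_special_chars_with_spaces` and `build_query` are hypothetical, and the character list is my assumption based on the standard Lucene/Solr query syntax.

```python
# Hypothetical sketch of the workaround; names and character list are
# assumptions, not the code in PR 143.

# Characters that carry special meaning in the Lucene/Solr query syntax.
SOLR_SPECIAL_CHARS = r'+-&|!(){}[]^"~*?:\/'


def escape_solr_query(text: str) -> str:
    """Backslash-escape every special character in the search text."""
    return "".join("\\" + ch if ch in SOLR_SPECIAL_CHARS else ch for ch in text)


def replace_special_chars_with_spaces(text: str) -> str:
    """Replace every special character with a space and collapse whitespace."""
    replaced = "".join(" " if ch in SOLR_SPECIAL_CHARS else ch for ch in text)
    return " ".join(replaced.split())


def build_query(text: str, autocomplete: bool) -> str:
    """Build the Solr query string for a search term.

    With autocomplete, special characters become spaces and a trailing
    wildcard is added; without it, special characters are escaped.
    """
    if autocomplete:
        return f"({replace_special_chars_with_spaces(text)}*)"
    return f"({escape_solr_query(text)})"


if __name__ == "__main__":
    print(build_query("beta-secretase", autocomplete=True))   # (beta secretase*)
    print(build_query("beta-secretase", autocomplete=False))  # (beta\-secretase)
```

Keeping the two code paths in one function at least makes the asymmetry explicit, so it is easy to revisit if a different tokenizer turns out to fix the wildcard case.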