clarin-eric / VLO

Virtual Language Observatory
GNU General Public License v3.0

Compounds and phrases in suggester #51

Open twagoo opened 7 years ago

twagoo commented 7 years ago

Suggested by Dieter: it would be nice if the autocomplete of the search box in the VLO user interface could suggest multi-word terms like 'middle ages', 'Noam Chomsky', 'eye tracking', ... Could the importer be adapted, e.g. by using a dictionary or searching for common n-grams, to include such phrases in the _suggester facet?
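As a rough illustration of the "common n-grams" idea, the sketch below counts word n-grams across a set of metadata strings and keeps those that occur at least a minimum number of times. It is not the VLO importer: the function name `common_phrases`, the stopword list, and the thresholds are all assumptions made for this example, and it lowercases everything, so a name like 'Noam Chomsky' would lose its capitalisation.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real importer would likely use a
# language-specific one.
STOPWORDS = frozenset({"a", "an", "and", "in", "of", "the"})


def common_phrases(texts, n=2, min_count=2, stopwords=STOPWORDS):
    """Return word n-grams that occur at least `min_count` times in `texts`.

    N-grams that start or end with a stopword are skipped, so that
    fragments like "of the" do not end up as suggestions.
    """
    counts = Counter()
    for text in texts:
        # Lowercase and split on non-word characters.
        tokens = re.findall(r"\w+", text.lower())
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if gram[0] in stopwords or gram[-1] in stopwords:
                continue
            counts[" ".join(gram)] += 1
    # Most frequent phrases first, filtered by the frequency threshold.
    return [phrase for phrase, count in counts.most_common() if count >= min_count]


if __name__ == "__main__":
    sample = [
        "Art of the Middle Ages",
        "Music in the middle ages",
        "Eye tracking study",
        "An eye tracking corpus",
    ]
    print(common_phrases(sample))
```

The resulting phrases could then be added to the _suggester facet next to the single-word terms. Whether plain frequency is a good enough filter for real metadata is an open question.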

Maybe an easy alternative solution is to include all (common) values from a selection of facets (language, subject, genre, ...) in the _suggester facet, or (maybe a cleaner and leaner approach) include those in the query for search terms from the client.


Transferred from Trac, issue #587

twagoo commented 7 years ago

"(maybe a cleaner and leaner approach) include those in the query for search terms from the client."

To clarify, I believe that my thinking here was to extend the autocomplete service to combine the response of the Solr suggester (/suggest) with an additional query that gets matching values for selected facets. This was probably more of a brainwave than a fully fledged implementation proposal, but it may be an alternative to consider if this cannot easily be fixed on the Solr server side.
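A minimal sketch of that merging step, assuming the two result sets have already been fetched (the terms returned by /suggest and the values of the selected facets). The function name `merge_suggestions`, the prefix-matching rule, and the choice to rank facet values ahead of suggester terms are assumptions for illustration, not part of the existing autocomplete service.

```python
def merge_suggestions(prefix, suggester_terms, facet_values, limit=10):
    """Combine suggester terms with facet values that match the typed prefix.

    `suggester_terms` stands in for the terms returned by the Solr
    suggester; `facet_values` stands in for values of the selected facets
    (e.g. language, subject, genre). Both are plain lists of strings here.
    """
    needle = prefix.casefold()
    # Keep only facet values that start with what the user typed.
    facet_matches = [v for v in facet_values if v.casefold().startswith(needle)]

    merged, seen = [], set()
    # Facet values (which may be multi-word phrases) come first,
    # followed by the suggester terms; duplicates are dropped
    # case-insensitively while preserving order.
    for term in [*facet_matches, *suggester_terms]:
        key = term.casefold()
        if key not in seen:
            seen.add(key)
            merged.append(term)
    return merged[:limit]


if __name__ == "__main__":
    print(merge_suggestions(
        "mid",
        suggester_terms=["middle", "midi"],
        facet_values=["Middle Ages", "Eye tracking", "middle"],
    ))
```

In a real service the second list would come from an extra facet query against Solr for each keystroke, so the added latency would need to be weighed against the benefit.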

teckart commented 7 years ago

After investing many hours in testing various Solr configurations (tokenizer/filter/spellchecker), I couldn't find one that gives satisfactory results using Solr alone (although a more experienced Solr user may come up with one). Main problems:

twagoo commented 7 years ago

Would things improve (in any way) if we switched from the SpellCheckComponent (see solrconfig.xml) to the Suggester for this purpose? This is suggested in the official documentation (also see this comment), but I have not looked into the differences between the two, so I do not know what the required effort and consequences would be.

twagoo commented 7 years ago

We have looked into this and searched for expertise but were not successful in finding a usable solution. For now we are not going to put any substantial effort into this.

menzowindhouwer commented 6 years ago

Maybe interesting: "semantic query suggestions" http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006058

twagoo commented 6 years ago

Not directly related but also see #193