enonic / xp

Accents give different results (when they should not, due to asciifolding) #7115

Closed ComLock closed 5 years ago

ComLock commented 5 years ago

These give different results:

fulltext('from', 'diare', 'OR', 'standard')
fulltext('from', 'diaré', 'OR', 'standard')

https://github.com/enonic/lib-yase/issues/57

GlennRicaud commented 5 years ago

"standard" analyzer does not have asciifolding.

GlennRicaud commented 5 years ago

Your problem is that you are using the "standard" analyzer.

In Enonic XP 6, you could specify an analyzer per document, which was used mostly for fulltext. For issues or contents, it is "document_index_default" (which has ASCII folding). When you created your nodes you did not set an analyzer, so they got the "standard" one.
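To make the ASCII folding mentioned above concrete, here is a minimal sketch of what such a filter does to a token. The `foldAscii` helper is hypothetical and only approximates folding for common Latin accents. It is not the Elasticsearch implementation or part of Enonic XP, and it assumes a runtime with `String.prototype.normalize` (Node.js or a modern browser).

```javascript
// Hypothetical helper, not part of Enonic XP or Elasticsearch.
// Decomposes accented characters (NFD) and strips the combining marks,
// which approximates ASCII folding for common Latin accents.
function foldAscii(text) {
  return text.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

// 'diar\u00e9' is "diaré" written with an explicit escape for the accented e.
console.log(foldAscii('diar\u00e9')); // diare
console.log(foldAscii('diare'));      // diare

// With folding, both spellings end up as the same token.
console.log(foldAscii('diar\u00e9') === foldAscii('diare')); // true

// Without folding, the raw tokens differ.
console.log('diar\u00e9' === 'diare'); // false
```

An analyzer without this step keeps "diaré" and "diare" as distinct tokens, which is why the two queries can return different results.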

In Enonic XP 7, there is no longer a per-document analyzer, so it will always be "document_index_default".

So I see two solutions:

Solution 1: Update the index config of your nodes to use the "document_index_default" analyzer (both in your application and in the existing data).
Solution 2: Wait for / upgrade to Enonic XP 7.

GlennRicaud commented 5 years ago

Note to self: the default analyzer used for fulltext in 6.15 is "fulltext_search_default" (identical to "document_index_default"). Hence the need to use 'standard' in ComLock's queries.

GlennRicaud commented 5 years ago

@ComLock From reading the code, it seems to be possible to set the analyzer, even though I have never tried it myself. It is not documented at all.

For creation, pass the following index config:

"_indexConfig": {
    "analyzer": "document_index_default",
    "default": {
        "decideByType": true,
        "enabled": true,
        "nGram": false,
        "fulltext": false,
        "includeInAllText": false,
        "path": false,
        "indexValueProcessors": []
    },
    "configs": []
}

To patch existing data, set the analyzer on each node:

node._indexConfig.analyzer = 'document_index_default';

To be tested
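As a rough sketch of how that patch could be applied to several existing nodes: the helpers `withAnalyzer` and `patchAnalyzer` below are hypothetical names, and `repo` is assumed to be a lib-node connection whose `modify` accepts `{ key, editor }`. Like the suggestion above, this is untested against a real repository.

```javascript
// Hypothetical editor: sets the analyzer on a node's index config and
// returns the node, as an editor callback is expected to do.
function withAnalyzer(node, analyzer) {
  node._indexConfig = node._indexConfig || {};
  node._indexConfig.analyzer = analyzer;
  return node;
}

// Hypothetical helper: applies the editor to each node id.
// `repo` is ASSUMED to be a lib-node connection, e.g. obtained with
//   require('/lib/xp/node').connect({ repoId: '...', branch: '...' })
// and to expose modify({ key, editor }).
function patchAnalyzer(repo, nodeIds, analyzer) {
  return nodeIds.map(function (id) {
    return repo.modify({
      key: id,
      editor: function (node) {
        return withAnalyzer(node, analyzer);
      }
    });
  });
}
```

Passing the connection in as a parameter keeps the helper easy to try out with a stub before running it on real data. Existing nodes would presumably also need reindexing before the new analyzer affects search results.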

ComLock commented 5 years ago

@GlennRicaud

In Enonic XP 7.0

diare gives 3 results while diaré gives 0

So the "bug" is still present in Enonic XP 7.0

GlennRicaud commented 5 years ago

Have you removed the "standard" parameter from the fulltext query?

ComLock commented 5 years ago

When I remove "standard" it works :)

I guess this issue can be closed now?
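One plausible reading of why removing "standard" helps: in XP 7 the data is indexed with a folding analyzer, so a query analyzed without folding looks for the token "diaré", which never made it into the index. The toy model below sketches that mismatch. The analyzers, the `hits` helper, and the sample sentence are all made up for illustration and are not the real Enonic XP or Elasticsearch behaviour.

```javascript
// Hypothetical approximation of ASCII folding (not the real filter).
function foldAscii(text) {
  return text.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

// Toy stand-in for an analyzer without folding: lowercase, split on whitespace.
function analyzeStandard(text) {
  return text.toLowerCase().split(/\s+/).filter(Boolean);
}

// Toy stand-in for an analyzer with folding, like "document_index_default".
function analyzeFolding(text) {
  return analyzeStandard(text).map(foldAscii);
}

// A query hits if any of its tokens is present in the indexed tokens.
function hits(indexedTokens, queryTokens) {
  return queryTokens.some((token) => indexedTokens.includes(token));
}

// Made-up sample text, indexed WITH folding ('diar\u00e9' is "diaré").
const indexed = analyzeFolding('Akutt diar\u00e9 hos barn');

// Query analyzed WITHOUT folding, as with the 'standard' argument.
console.log(hits(indexed, analyzeStandard('diare')));      // true
console.log(hits(indexed, analyzeStandard('diar\u00e9'))); // false

// Query analyzed WITH folding, as when the argument is omitted.
console.log(hits(indexed, analyzeFolding('diar\u00e9')));  // true
```

This matches the XP 7.0 observation in the thread: with 'standard', "diare" finds results and "diaré" finds none, while dropping the argument makes both spellings match.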