nextcloud / fulltextsearch_elasticsearch

🔍 Use Elasticsearch to index the content of your Nextcloud
https://apps.nextcloud.com/apps/fulltextsearch_elasticsearch
GNU Affero General Public License v3.0

User and owner filters #237

Closed silvioq closed 1 year ago

silvioq commented 1 year ago

The use of "keyword" makes the search key exact, because Elasticsearch will not perform any transformations on it. This covers external user sources (LDAP), where the username may contain dashes or symbols that the engine could otherwise misinterpret.
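
To illustrate the point about dashes, here is a minimal sketch in Python. It only approximates how an analyzed `text` field breaks a value into lowercase tokens, and contrasts that with a `keyword` field, which keeps the value unchanged. The `analyze` helper and the sample username are hypothetical stand-ins, not Elasticsearch's real analyzer or data from this PR.

```python
import re


def analyze(value: str) -> list[str]:
    """Rough stand-in for an analyzed text field: lowercase, split on non-word characters."""
    return re.findall(r"\w+", value.lower())


def term_matches_text(indexed_value: str, query: str) -> bool:
    """A term query is not analyzed, so it is compared as-is against the indexed tokens."""
    return query in analyze(indexed_value)


def term_matches_keyword(indexed_value: str, query: str) -> bool:
    """A keyword field keeps the whole value as one token, so only an exact match hits."""
    return query == indexed_value


ldap_user = "John-Doe"  # hypothetical LDAP username
print(analyze(ldap_user))                          # tokens stored for the text field
print(term_matches_text(ldap_user, ldap_user))     # term query against the text field
print(term_matches_keyword(ldap_user, ldap_user))  # term query against the keyword field
```

Under these assumptions the term query misses on the analyzed field, because the stored tokens no longer contain the dash or the uppercase letters, while the exact comparison on the keyword field succeeds.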

Signed-off-by: Silvio silvioq@gmail.com

vbier commented 1 year ago

If only I had seen your pull request earlier. It took me a complete day to find out why the search did not work for me. This PR fixes: https://github.com/nextcloud/fulltextsearch_elasticsearch/issues/300

Edit: I am wondering how this pull request can be open for more than half a year. Is Nextcloud not used in corporate environments with AD integration? This is a killer bug for us, as it effectively breaks fulltext search for all our users.

silvioq commented 1 year ago

Hi. We carry this patch in our environment and reapply it after every upgrade.

vbier commented 1 year ago

@silvioq , can you try to request a review? See https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/requesting-a-pull-request-review. A suitable reviewer might be R0Wi or ArtificialOwl. There has to be a way to get the changes merged.

vbier commented 1 year ago

I have approved the changes, but that does not seem to help as I do not have the needed write permissions on the repository. That is why I mentioned R0Wi and ArtificialOwl as reviewers. Can you request a review from one of them?

silvioq commented 1 year ago

Sorry, only @vbier is offered to me as a reviewer. I can close this pull request and open a new one.

vbier commented 1 year ago

@R0Wi , @ArtificialOwl , sorry to bother you. Can you help to get the changes merged?

R0Wi commented 1 year ago

@ArtificialOwl this seems to be closely related to https://github.com/nextcloud/fulltextsearch_elasticsearch/pull/265 where you fixed the uppercase group filter by just lowercasing the input. I was working together with @Kdubs937 recently because he was facing a similar issue when trying to search for files from an external provider (source = files_external) where the users array contained usernames with uppercase letters.

My fix here was to also call strtolower() on owner and users (see this commit) before sending the query to ES and this also fixed the issue.

But to me, using users.keyword and owner.keyword seems to be the even cleaner approach. I'd only like to use the same strategy for users, owner, groups and circles and not mix things up. Btw: I tested this approach locally and it also works, so the expected files_external documents show up properly again.
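
The idea of using one strategy for all four fields can be sketched as follows. This is a hypothetical Python helper, not the app's actual PHP code: `build_access_filter` and its parameters are assumptions made for illustration, and only the field names and the `__all` value are taken from this thread.

```python
import json


def build_access_filter(viewer: str, groups: list[str], circles: list[str],
                        suffix: str = ".keyword") -> dict:
    """Build a bool/should filter that applies the same field suffix to every field."""
    should = [
        {"term": {f"owner{suffix}": viewer}},
        {"term": {f"users{suffix}": viewer}},
        {"term": {f"users{suffix}": "__all"}},
    ]
    should += [{"term": {f"groups{suffix}": group}} for group in groups]
    should += [{"term": {f"circles{suffix}": circle}} for circle in circles]
    return {"bool": {"should": should}}


# Hypothetical viewer and group names, chosen only for the example.
print(json.dumps(build_access_filter("Jane-Doe", ["Admins"], []), indent=2))
```

Keeping the suffix in a single parameter means the four fields cannot drift apart: passing `suffix=""` produces the plain field names for all of them at once.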

@silvioq did you find some official ES docs where they describe the behaviour of putting the .keyword suffix into your query?

R0Wi commented 1 year ago

I took the liberty to also add the .keyword suffix to both groups and circles. I also rebased this branch onto master and provided the appropriate test adjustments in https://github.com/nextcloud/fulltextsearch/pull/771. You'll see that these tests will fail if you don't apply the patch of this PR.

Thanks @silvioq and @vbier for your work!

silvioq commented 1 year ago

Thanks @R0Wi I'll test the changes in our nextcloud instance soon.

About the usage of the .keyword suffix, you can read about it here: https://www.elastic.co/guide/en/elasticsearch/reference/8.9/text.html#before-enabling-fielddata

XueSheng-GIT commented 1 year ago

I've applied this PR to NC27.1.0 and Fulltextsearch_elasticsearch 27.0.2 but afterwards fulltext search does not work anymore and test fails (see below). Applying https://github.com/nextcloud/fulltextsearch/pull/771 does not solve this issue.

Any idea? Any additional patch required to get this working?

# sudo -u www-data php /var/www/nextcloud/occ fulltextsearch:test

.Testing your current setup:  
Creating mocked content provider. ok  
Testing mocked provider: get indexable documents. (2 items) ok  
Loading search platform. (Elasticsearch) ok  
Testing search platform. ok  
Locking process ok  
Removing test. ok  
Pausing 3 seconds 1 2 3 ok  
Initializing index mapping. ok  
Indexing generated documents. ok  
Pausing 3 seconds 1 2 3 ok  
Retreiving content from a big index (license). (size: 32386) ok  
Comparing document with source. ok  
Searching basic keywords:  
 - 'test' (result: 0, expected: ["simple"]) fail  
Error detected, unlocking process ok 
In Test.php line 675:

  Unexpected SearchResult: {"provider":{"id":"test_provider","name":"Test Provider"},"platform":{"id":"elastic_search","name":"Elasticsearch"},"documents":[],"info":[],"meta":{"timedOut":false,"time":3,"count":0,"total":0,"maxScore":0}}

R0Wi commented 1 year ago

Thanks @XueSheng-GIT for your feedback! I tested with a clean install of Nextcloud 28 (current master) together with Elasticsearch 8.6.1 and this worked for me. To get additional information out of the logs, you might want to apply this patch and lower the server loglevel to 0. Then, please rerun your test via the occ command. The server log file should then contain additional information, which you could share with us 👍

XueSheng-GIT commented 1 year ago

@R0Wi Thanks for providing some guidance! I'm using elasticsearch 8.10 on ubuntu 22.04.

Log for error shown above https://github.com/nextcloud/fulltextsearch_elasticsearch/pull/237#issuecomment-1723580865:

{"reqId":"5SX4qoTiPy5iS9tMVEzS","level":0,"time":"2023-09-19T11:30:04+02:00","remoteAddr":"","user":"--","app":"fulltextsearch_elasticsearch","method":"","url":"/occ","message":"Headers: {\"Host\":[\"localhost:9200\"],\"Accept\":[\"application\\/vnd.elasticsearch+json; compatible-with=8\"],\"Content-Type\":[\"application\\/vnd.elasticsearch+json; compatible-with=8\"],\"User-Agent\":[\"elasticsearch-php\\/8.6.1 (Linux 6.2.16-6-pve; PHP 8.1.2-1ubuntu2.14)\"],\"x-elastic-client-meta\":[\"es=8.6.1,php=8.1.2,t=8.7.0,a=0,gu=7.7.0\"]}\nBody: {\"query\":{\"bool\":{\"must\":{\"bool\":{\"should\":[{\"match_phrase_prefix\":{\"content\":\"test\"}},{\"match_phrase_prefix\":{\"title\":\"test\"}}]}},\"filter\":[{\"bool\":{\"must\":{\"term\":{\"provider\":\"test_provider\"}}}},{\"bool\":{\"should\":[{\"term\":{\"owner.keyword\":\"user1\"}},{\"term\":{\"users.keyword\":\"user1\"}},{\"term\":{\"users.keyword\":\"__all\"}}]}},{\"bool\":{\"should\":[]}},{\"bool\":{\"must\":[]}},{\"bool\":{\"must\":[]}}]}},\"highlight\":{\"fields\":{\"content\":{}},\"pre_tags\":[\"\"],\"post_tags\":[\"\"]}}","userAgent":"--","version":"27.1.0.7","data":{"app":"fulltextsearch_elasticsearch"}}
{"reqId":"5SX4qoTiPy5iS9tMVEzS","level":1,"time":"2023-09-19T11:30:04+02:00","remoteAddr":"","user":"--","app":"fulltextsearch_elasticsearch","method":"","url":"/occ","message":"Response (retry 0): 200","userAgent":"--","version":"27.1.0.7","data":{"app":"fulltextsearch_elasticsearch","response":"{\"[object] (GuzzleHttp\\Psr7\\Response)\":{\"GuzzleHttp\\Psr7\\ResponsereasonPhrase\":\"OK\",\"GuzzleHttp\\Psr7\\ResponsestatusCode\":200,\"GuzzleHttp\\Psr7\\Responseheaders\":{\"X-elastic-product\":[\"Elasticsearch\"],\"content-type\":[\"application/vnd.elasticsearch+json;compatible-with=8\"],\"Transfer-Encoding\":[\"chunked\"]},\"GuzzleHttp\\Psr7\\ResponseheaderNames\":{\"x-elastic-product\":\"X-elastic-product\",\"content-type\":\"content-type\",\"transfer-encoding\":\"Transfer-Encoding\"},\"GuzzleHttp\\Psr7\\Responseprotocol\":\"1.1\",\"GuzzleHttp\\Psr7\\Responsestream\":{\"[object] (GuzzleHttp\\Psr7\\Stream)\":{\"GuzzleHttp\\Psr7\\Streamstream\":\"[resource] Resource id #1895\",\"GuzzleHttp\\Psr7\\Streamsize\":null,\"GuzzleHttp\\Psr7\\Streamseekable\":true,\"GuzzleHttp\\Psr7\\Streamreadable\":true,\"GuzzleHttp\\Psr7\\Streamwritable\":true,\"GuzzleHttp\\Psr7\\Streamuri\":\"php://temp\",\"GuzzleHttp\\Psr7\\StreamcustomMetadata\":[]}}}}","retry":"0"}}
{"reqId":"5SX4qoTiPy5iS9tMVEzS","level":0,"time":"2023-09-19T11:30:04+02:00","remoteAddr":"","user":"--","app":"fulltextsearch_elasticsearch","method":"","url":"/occ","message":"Headers: {\"X-elastic-product\":[\"Elasticsearch\"],\"content-type\":[\"application\\/vnd.elasticsearch+json;compatible-with=8\"],\"Transfer-Encoding\":[\"chunked\"]}\nBody: {\"took\":1,\"timed_out\":false,\"_shards\":{\"total\":1,\"successful\":1,\"skipped\":0,\"failed\":0},\"hits\":{\"total\":{\"value\":0,\"relation\":\"eq\"},\"max_score\":null,\"hits\":[]}}","userAgent":"--","version":"27.1.0.7","data":{"app":"fulltextsearch_elasticsearch"}}
{"reqId":"5SX4qoTiPy5iS9tMVEzS","level":1,"time":"2023-09-19T11:30:04+02:00","remoteAddr":"","user":"--","app":"fulltextsearch_elasticsearch","method":"","url":"/occ","message":"Response time in 0.006 sec","userAgent":"--","version":"27.1.0.7","data":{"app":"fulltextsearch_elasticsearch"}}
{"reqId":"5SX4qoTiPy5iS9tMVEzS","level":0,"time":"2023-09-19T11:30:04+02:00","remoteAddr":"","user":"--","app":"fulltextsearch_elasticsearch","method":"","url":"/occ","message":"result from ES","userAgent":"--","version":"27.1.0.7","data":{"app":"fulltextsearch_elasticsearch","result":"{\"[object] (Elastic\\Elasticsearch\\Response\\Elasticsearch)\":{\"*response\":{\"[object] (GuzzleHttp\\Psr7\\Response)\":{\"GuzzleHttp\\Psr7\\ResponsereasonPhrase\":\"OK\",\"GuzzleHttp\\Psr7\\ResponsestatusCode\":200,\"GuzzleHttp\\Psr7\\Responseheaders\":[],\"GuzzleHttp\\Psr7\\ResponseheaderNames\":[],\"GuzzleHttp\\Psr7\\Responseprotocol\":\"1.1\",\"GuzzleHttp\\Psr7\\Responsestream\":\"[object] (GuzzleHttp\\Psr7\\Stream)\"}}}}"}}
{"reqId":"5SX4qoTiPy5iS9tMVEzS","level":0,"time":"2023-09-19T11:30:04+02:00","remoteAddr":"","user":"--","app":"fulltextsearch_elasticsearch","method":"","url":"/occ","message":"Search Result","userAgent":"--","version":"27.1.0.7","data":{"app":"fulltextsearch_elasticsearch","searchResult":"{\"[object] (OCA\\FullTextSearch\\Model\\SearchResult)\":{\"OCA\\FullTextSearch\\Model\\SearchResultdocuments\":[],\"OCA\\FullTextSearch\\Model\\SearchResultrawResult\":\"{\\\"took\\\":1,\\\"timed_out\\\":false,\\\"_shards\\\":{\\\"total\\\":1,\\\"successful\\\":1,\\\"skipped\\\":0,\\\"failed\\\":0},\\\"hits\\\":{\\\"total\\\":{\\\"value\\\":0,\\\"relation\\\":\\\"eq\\\"},\\\"max_score\\\":null,\\\"hits\\\":[]}}\",\"OCA\\FullTextSearch\\Model\\SearchResultprovider\":{\"[object] (OCA\\FullTextSearch\\Provider\\TestProvider)\":{\"OCA\\FullTextSearch\\Provider\\TestProviderconfigService\":\"[object] (OCA\\FullTextSearch\\Service\\ConfigService)\",\"OCA\\FullTextSearch\\Provider\\TestProvidertestService\":\"[object] (OCA\\FullTextSearch\\Service\\TestService)\",\"OCA\\FullTextSearch\\Provider\\TestProvidermiscService\":\"[object] (OCA\\FullTextSearch\\Service\\MiscService)\",\"OCA\\FullTextSearch\\Provider\\TestProviderrunner\":\"[object] (OCA\\FullTextSearch\\Model\\Runner)\",\"OCA\\FullTextSearch\\Provider\\TestProviderindexOptions\":\"[object] (OCA\\FullTextSearch\\Model\\IndexOptions)\"}},\"OCA\\FullTextSearch\\Model\\SearchResultplatform\":{\"[object] (OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatform)\":{\"OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatformclient\":\"[object] (Elastic\\Elasticsearch\\Client)\",\"OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatformrunner\":\"[object] (OCA\\FullTextSearch\\Model\\Runner)\",\"OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatformconfigService\":\"[object] 
(OCA\\FullTextSearch_Elasticsearch\\Service\\ConfigService)\",\"OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatformindexService\":\"[object] (OCA\\FullTextSearch_Elasticsearch\\Service\\IndexService)\",\"OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatformsearchService\":\"[object] (OCA\\FullTextSearch_Elasticsearch\\Service\\SearchService)\",\"OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatformlogger\":\"[object] (OC\\AppFramework\\ScopedPsrLogger)\"}},\"OCA\\FullTextSearch\\Model\\SearchResulttotal\":0,\"OCA\\FullTextSearch\\Model\\SearchResultmaxScore\":0,\"OCA\\FullTextSearch\\Model\\SearchResulttime\":1,\"OCA\\FullTextSearch\\Model\\SearchResulttimedOut\":false,\"OCA\\FullTextSearch\\Model\\SearchResultrequest\":{\"[object] (OCA\\FullTextSearch\\Model\\SearchRequest)\":{\"OCA\\FullTextSearch\\Model\\SearchRequestproviders\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestsearch\":\"test\",\"OCA\\FullTextSearch\\Model\\SearchRequestemptySearch\":false,\"OCA\\FullTextSearch\\Model\\SearchRequestpage\":1,\"OCA\\FullTextSearch\\Model\\SearchRequestsize\":10,\"OCA\\FullTextSearch\\Model\\SearchRequestauthor\":\"\",\"OCA\\FullTextSearch\\Model\\SearchRequesttags\":[],\"metaTags\":[],\"subTags\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestoptions\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestparts\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestfields\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestlimitFields\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestwildcardFields\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestwildcardFilters\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestregexFilters\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestsimpleQueries\":[]}}}}"}}

Same without this PR (no error, all tests pass):

{"reqId":"juwQCqvwIat5CbsJnNFZ","level":0,"time":"2023-09-19T11:35:44+02:00","remoteAddr":"","user":"--","app":"fulltextsearch_elasticsearch","method":"","url":"/occ","message":"Headers: {\"Host\":[\"localhost:9200\"],\"Accept\":[\"application\\/vnd.elasticsearch+json; compatible-with=8\"],\"Content-Type\":[\"application\\/vnd.elasticsearch+json; compatible-with=8\"],\"User-Agent\":[\"elasticsearch-php\\/8.6.1 (Linux 6.2.16-6-pve; PHP 8.1.2-1ubuntu2.14)\"],\"x-elastic-client-meta\":[\"es=8.6.1,php=8.1.2,t=8.7.0,a=0,gu=7.7.0\"]}\nBody: {\"query\":{\"bool\":{\"must\":{\"bool\":{\"should\":[{\"match_phrase_prefix\":{\"content\":\"test\"}},{\"match_phrase_prefix\":{\"title\":\"test\"}}]}},\"filter\":[{\"bool\":{\"must\":{\"term\":{\"provider\":\"test_provider\"}}}},{\"bool\":{\"should\":[{\"term\":{\"owner\":\"user1\"}},{\"term\":{\"users\":\"user1\"}},{\"term\":{\"users\":\"__all\"}}]}},{\"bool\":{\"should\":[]}},{\"bool\":{\"must\":[]}},{\"bool\":{\"must\":[]}}]}},\"highlight\":{\"fields\":{\"content\":{}},\"pre_tags\":[\"\"],\"post_tags\":[\"\"]}}","userAgent":"--","version":"27.1.0.7","data":{"app":"fulltextsearch_elasticsearch"}}
{"reqId":"juwQCqvwIat5CbsJnNFZ","level":1,"time":"2023-09-19T11:35:44+02:00","remoteAddr":"","user":"--","app":"fulltextsearch_elasticsearch","method":"","url":"/occ","message":"Response (retry 0): 200","userAgent":"--","version":"27.1.0.7","data":{"app":"fulltextsearch_elasticsearch","response":"{\"[object] (GuzzleHttp\\Psr7\\Response)\":{\"GuzzleHttp\\Psr7\\ResponsereasonPhrase\":\"OK\",\"GuzzleHttp\\Psr7\\ResponsestatusCode\":200,\"GuzzleHttp\\Psr7\\Responseheaders\":{\"X-elastic-product\":[\"Elasticsearch\"],\"content-type\":[\"application/vnd.elasticsearch+json;compatible-with=8\"],\"Transfer-Encoding\":[\"chunked\"]},\"GuzzleHttp\\Psr7\\ResponseheaderNames\":{\"x-elastic-product\":\"X-elastic-product\",\"content-type\":\"content-type\",\"transfer-encoding\":\"Transfer-Encoding\"},\"GuzzleHttp\\Psr7\\Responseprotocol\":\"1.1\",\"GuzzleHttp\\Psr7\\Responsestream\":{\"[object] (GuzzleHttp\\Psr7\\Stream)\":{\"GuzzleHttp\\Psr7\\Streamstream\":\"[resource] Resource id #1895\",\"GuzzleHttp\\Psr7\\Streamsize\":null,\"GuzzleHttp\\Psr7\\Streamseekable\":true,\"GuzzleHttp\\Psr7\\Streamreadable\":true,\"GuzzleHttp\\Psr7\\Streamwritable\":true,\"GuzzleHttp\\Psr7\\Streamuri\":\"php://temp\",\"GuzzleHttp\\Psr7\\StreamcustomMetadata\":[]}}}}","retry":"0"}}
{"reqId":"juwQCqvwIat5CbsJnNFZ","level":0,"time":"2023-09-19T11:35:44+02:00","remoteAddr":"","user":"--","app":"fulltextsearch_elasticsearch","method":"","url":"/occ","message":"Headers: {\"X-elastic-product\":[\"Elasticsearch\"],\"content-type\":[\"application\\/vnd.elasticsearch+json;compatible-with=8\"],\"Transfer-Encoding\":[\"chunked\"]}\nBody: {\"took\":7,\"timed_out\":false,\"_shards\":{\"total\":1,\"successful\":1,\"skipped\":0,\"failed\":0},\"hits\":{\"total\":{\"value\":1,\"relation\":\"eq\"},\"max_score\":7.7150726,\"hits\":[{\"_index\":\"nextcloud\",\"_id\":\"test_provider:simple\",\"_score\":7.7150726,\"_source\":{\"owner\":\"user1\",\"users\":[],\"groups\":[],\"circles\":[],\"links\":[],\"metatags\":[],\"subtags\":[],\"tags\":[],\"hash\":\"dc5617141771b9472dcc0739960bf07a\",\"provider\":\"test_provider\",\"source\":\"\",\"title\":\"\",\"parts\":[]},\"highlight\":{\"content\":[\"testing document is a simple test\"]}}]}}","userAgent":"--","version":"27.1.0.7","data":{"app":"fulltextsearch_elasticsearch"}}
{"reqId":"juwQCqvwIat5CbsJnNFZ","level":1,"time":"2023-09-19T11:35:44+02:00","remoteAddr":"","user":"--","app":"fulltextsearch_elasticsearch","method":"","url":"/occ","message":"Response time in 0.020 sec","userAgent":"--","version":"27.1.0.7","data":{"app":"fulltextsearch_elasticsearch"}}
{"reqId":"juwQCqvwIat5CbsJnNFZ","level":0,"time":"2023-09-19T11:35:44+02:00","remoteAddr":"","user":"--","app":"fulltextsearch_elasticsearch","method":"","url":"/occ","message":"result from ES","userAgent":"--","version":"27.1.0.7","data":{"app":"fulltextsearch_elasticsearch","result":"{\"[object] (Elastic\\Elasticsearch\\Response\\Elasticsearch)\":{\"*response\":{\"[object] (GuzzleHttp\\Psr7\\Response)\":{\"GuzzleHttp\\Psr7\\ResponsereasonPhrase\":\"OK\",\"GuzzleHttp\\Psr7\\ResponsestatusCode\":200,\"GuzzleHttp\\Psr7\\Responseheaders\":[],\"GuzzleHttp\\Psr7\\ResponseheaderNames\":[],\"GuzzleHttp\\Psr7\\Responseprotocol\":\"1.1\",\"GuzzleHttp\\Psr7\\Responsestream\":\"[object] (GuzzleHttp\\Psr7\\Stream)\"}}}}"}}
{"reqId":"juwQCqvwIat5CbsJnNFZ","level":0,"time":"2023-09-19T11:35:44+02:00","remoteAddr":"","user":"--","app":"fulltextsearch_elasticsearch","method":"","url":"/occ","message":"Search Result","userAgent":"--","version":"27.1.0.7","data":{"app":"fulltextsearch_elasticsearch","searchResult":"{\"[object] (OCA\\FullTextSearch\\Model\\SearchResult)\":{\"OCA\\FullTextSearch\\Model\\SearchResultdocuments\":[{\"[object] (OC\\FullTextSearch\\Model\\IndexDocument)\":[]}],\"OCA\\FullTextSearch\\Model\\SearchResultrawResult\":\"{\\\"took\\\":7,\\\"timed_out\\\":false,\\\"_shards\\\":{\\\"total\\\":1,\\\"successful\\\":1,\\\"skipped\\\":0,\\\"failed\\\":0},\\\"hits\\\":{\\\"total\\\":{\\\"value\\\":1,\\\"relation\\\":\\\"eq\\\"},\\\"max_score\\\":7.7150726,\\\"hits\\\":[{\\\"_index\\\":\\\"nextcloud\\\",\\\"_id\\\":\\\"test_provider:simple\\\",\\\"_score\\\":7.7150726,\\\"_source\\\":{\\\"owner\\\":\\\"user1\\\",\\\"users\\\":[],\\\"groups\\\":[],\\\"circles\\\":[],\\\"links\\\":[],\\\"metatags\\\":[],\\\"subtags\\\":[],\\\"tags\\\":[],\\\"hash\\\":\\\"dc5617141771b9472dcc0739960bf07a\\\",\\\"provider\\\":\\\"test_provider\\\",\\\"source\\\":\\\"\\\",\\\"title\\\":\\\"\\\",\\\"parts\\\":[]},\\\"highlight\\\":{\\\"content\\\":[\\\"testing document is a simple test\\\"]}}]}}\",\"OCA\\FullTextSearch\\Model\\SearchResultprovider\":{\"[object] (OCA\\FullTextSearch\\Provider\\TestProvider)\":{\"OCA\\FullTextSearch\\Provider\\TestProviderconfigService\":\"[object] (OCA\\FullTextSearch\\Service\\ConfigService)\",\"OCA\\FullTextSearch\\Provider\\TestProvidertestService\":\"[object] (OCA\\FullTextSearch\\Service\\TestService)\",\"OCA\\FullTextSearch\\Provider\\TestProvidermiscService\":\"[object] (OCA\\FullTextSearch\\Service\\MiscService)\",\"OCA\\FullTextSearch\\Provider\\TestProviderrunner\":\"[object] (OCA\\FullTextSearch\\Model\\Runner)\",\"OCA\\FullTextSearch\\Provider\\TestProviderindexOptions\":\"[object] 
(OCA\\FullTextSearch\\Model\\IndexOptions)\"}},\"OCA\\FullTextSearch\\Model\\SearchResultplatform\":{\"[object] (OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatform)\":{\"OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatformclient\":\"[object] (Elastic\\Elasticsearch\\Client)\",\"OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatformrunner\":\"[object] (OCA\\FullTextSearch\\Model\\Runner)\",\"OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatformconfigService\":\"[object] (OCA\\FullTextSearch_Elasticsearch\\Service\\ConfigService)\",\"OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatformindexService\":\"[object] (OCA\\FullTextSearch_Elasticsearch\\Service\\IndexService)\",\"OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatformsearchService\":\"[object] (OCA\\FullTextSearch_Elasticsearch\\Service\\SearchService)\",\"OCA\\FullTextSearch_Elasticsearch\\Platform\\ElasticSearchPlatformlogger\":\"[object] (OC\\AppFramework\\ScopedPsrLogger)\"}},\"OCA\\FullTextSearch\\Model\\SearchResulttotal\":1,\"OCA\\FullTextSearch\\Model\\SearchResultmaxScore\":7,\"OCA\\FullTextSearch\\Model\\SearchResulttime\":7,\"OCA\\FullTextSearch\\Model\\SearchResulttimedOut\":false,\"OCA\\FullTextSearch\\Model\\SearchResultrequest\":{\"[object] 
(OCA\\FullTextSearch\\Model\\SearchRequest)\":{\"OCA\\FullTextSearch\\Model\\SearchRequestproviders\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestsearch\":\"test\",\"OCA\\FullTextSearch\\Model\\SearchRequestemptySearch\":false,\"OCA\\FullTextSearch\\Model\\SearchRequestpage\":1,\"OCA\\FullTextSearch\\Model\\SearchRequestsize\":10,\"OCA\\FullTextSearch\\Model\\SearchRequestauthor\":\"\",\"OCA\\FullTextSearch\\Model\\SearchRequesttags\":[],\"metaTags\":[],\"subTags\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestoptions\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestparts\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestfields\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestlimitFields\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestwildcardFields\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestwildcardFilters\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestregexFilters\":[],\"OCA\\FullTextSearch\\Model\\SearchRequestsimpleQueries\":[]}}}}"}}

I can just see that there's no hit if this PR is applied. Does the query look as expected?

R0Wi commented 1 year ago

Yes, both queries look exactly the same, except for the filter part, which has been changed from

{
    "bool": {
        "should": [
            {
                "term": {
                    "owner": "user1"
                }
            },
            {
                "term": {
                    "users": "user1"
                }
            },
            {
                "term": {
                    "users": "__all"
                }
            }
        ]
    }
}

to

{
    "bool": {
        "should": [
            {
                "term": {
                    "owner.keyword": "user1"
                }
            },
            {
                "term": {
                    "users.keyword": "user1"
                }
            },
            {
                "term": {
                    "users.keyword": "__all"
                }
            }
        ]
    }
}

which is expected. @XueSheng-GIT, would you mind sharing the ES index _mapping info? You can get it with curl http://localhost:9200/<index_name>/_mapping?pretty. Some earlier problems I saw were caused by old document metadata still being stored in the ES index.
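
One thing worth checking in that output is whether the filtered fields actually declare a `keyword` sub-field, since a term query on a field path that is missing from the mapping matches no documents. A minimal Python sketch of such a check is shown here. The `has_keyword_subfield` helper and both sample fragments are hypothetical; they only mimic the shape of a `_mapping` response and are not taken from a real index.

```python
def has_keyword_subfield(mapping: dict, index: str, field: str) -> bool:
    """Return True if `field` declares a `keyword` sub-field in the index mapping."""
    properties = mapping.get(index, {}).get("mappings", {}).get("properties", {})
    subfields = properties.get(field, {}).get("fields", {})
    return subfields.get("keyword", {}).get("type") == "keyword"


# Hypothetical fragment: the field itself is mapped as keyword, with no sub-fields.
explicit = {"nextcloud": {"mappings": {"properties": {"owner": {"type": "keyword"}}}}}

# Hypothetical fragment: a text field with a keyword sub-field, as dynamic mapping creates for strings.
dynamic = {"nextcloud": {"mappings": {"properties": {"owner": {
    "type": "text",
    "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
}}}}}

for name, mapping in (("explicit", explicit), ("dynamic", dynamic)):
    print(name, has_keyword_subfield(mapping, "nextcloud", "owner"))
```

If the check returns False for a field, a filter on `<field>.keyword` would be expected to find nothing in that index, while a filter on the plain field name would still work.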

XueSheng-GIT commented 1 year ago

This is the _mapping info:

{
  "nextcloud" : {
    "mappings" : {
      "dynamic" : "true",
      "properties" : {
        "attachment" : {
          "properties" : {
            "author" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "content_length" : {
              "type" : "long"
            },
            "content_type" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "creator_tool" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "date" : {
              "type" : "date"
            },
            "format" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "keywords" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "language" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "modified" : {
              "type" : "date"
            },
            "title" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            }
          }
        },
        "circles" : {
          "type" : "keyword"
        },
        "combined" : {
          "type" : "text",
          "term_vector" : "with_positions_offsets",
          "analyzer" : "analyzer"
        },
        "content" : {
          "type" : "text",
          "copy_to" : [
            "combined"
          ],
          "term_vector" : "with_positions_offsets",
          "analyzer" : "analyzer"
        },
        "groups" : {
          "type" : "keyword"
        },
        "hash" : {
          "type" : "keyword"
        },
        "links" : {
          "type" : "keyword"
        },
        "metatags" : {
          "type" : "keyword"
        },
        "owner" : {
          "type" : "keyword"
        },
        "parts" : {
          "properties" : {
            "comments" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "ocr" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            }
          }
        },
        "provider" : {
          "type" : "keyword"
        },
        "share_names" : {
          "properties" : {
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                },
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX\\" : {
              "properties" : {
                "XXX" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "XXX" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            }
          }
        },
        "source" : {
          "type" : "keyword"
        },
        "subtags" : {
          "type" : "keyword"
        },
        "tags" : {
          "type" : "keyword"
        },
        "title" : {
          "type" : "text",
          "copy_to" : [
            "combined"
          ],
          "term_vector" : "with_positions_offsets",
          "analyzer" : "keyword"
        },
        "users" : {
          "type" : "keyword"
        }
      }
    }
  }

R0Wi commented 1 year ago

Thanks. This looks entirely correct to me. I will try to do some tests with the latest ES 8.10, since I'm using 8.6.1. I don't think this should make any difference but let's see ...

vbier commented 1 year ago

This does not look like my mapping. The respective fields are of type keyword, whereas my fields are of type text and have the keyword subfield.

        "users" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
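
To make the difference between the two mappings easy to check, here is a minimal sketch that classifies a field from the `properties` part of a mapping. The helper name `classify_field` and the two sample dictionaries are illustrative assumptions, not part of the app; the second shape is the one Elasticsearch's dynamic mapping generates for string values by default.

```python
from typing import Any, Dict


def classify_field(properties: Dict[str, Any], field: str) -> str:
    """Describe how a field is mapped (hypothetical helper for illustration)."""
    spec = properties.get(field)
    if spec is None:
        return "missing"
    field_type = spec.get("type", "object")
    if field_type == "text":
        # A text field may carry sub-fields; look for one of type keyword.
        sub_fields = spec.get("fields", {})
        if any(sub.get("type") == "keyword" for sub in sub_fields.values()):
            return "text+keyword"
        return "text"
    return field_type


# Shape seen in the first mapping posted in this thread.
explicit = {"users": {"type": "keyword"}}

# Shape seen in the second mapping: text with a keyword sub-field.
dynamic = {
    "users": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    }
}

print(classify_field(explicit, "users"))
print(classify_field(dynamic, "users"))
```

Feeding it the `properties` object from a `GET <index>/_mapping` response should show at a glance which of the two shapes an index has.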

R0Wi commented 1 year ago

Good spot, @vbier! Indeed, mine looks the same: for example, users is a combination of keyword and text. The same goes for owner and all the other fields discussed here. @XueSheng-GIT, how big is your index? Would it be possible to rebuild it from scratch?
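
Why the field type matters for user filters can be sketched roughly as follows. This is a simplified model, not Elasticsearch itself: `approx_text_tokens` is an assumed stand-in for a text analyzer (lowercase, split on non-word characters), the username is hypothetical, and the filter is assumed to be an exact, unanalyzed term lookup.

```python
import re
from typing import List


def approx_text_tokens(value: str) -> List[str]:
    # Rough approximation of a text analyzer: lowercase and split on
    # anything that is not a word character (assumption for illustration).
    return re.findall(r"\w+", value.lower())


def keyword_tokens(value: str) -> List[str]:
    # A keyword field indexes the whole value as one unmodified term.
    return [value]


def term_matches(indexed_terms: List[str], query: str) -> bool:
    # An exact term lookup compares the query to indexed terms as-is.
    return query in indexed_terms


user = "Jane-Doe"  # hypothetical LDAP-style username with a dash

print(approx_text_tokens(user))                       # ['jane', 'doe']
print(term_matches(approx_text_tokens(user), user))   # False
print(term_matches(keyword_tokens(user), user))       # True
```

In this model the dashed username is split and lowercased on the text path, so the exact lookup misses it, while the keyword path keeps it intact.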

ArtificialOwl commented 1 year ago

/backport to stable27

ArtificialOwl commented 1 year ago

/backport to stable26

ArtificialOwl commented 1 year ago

Nice work and, well, thanks for your patience :-]

XueSheng-GIT commented 1 year ago

> Good spot, @vbier! Indeed, mine looks the same: for example, users is a combination of keyword and text. The same goes for owner and all the other fields discussed here. @XueSheng-GIT, how big is your index? Would it be possible to rebuild it from scratch?

I started a rebuild of the index (stop, delete, reset, index). After it finished, it started rebuilding again. This could be related to https://github.com/nextcloud/fulltextsearch/issues/767 and https://github.com/nextcloud/fulltextsearch/issues/723. I'll add my comments over there and open a new issue if required.

Thanks for your help.

XueSheng-GIT commented 1 year ago

I was now able to do a "quick" index with tesseract disabled. The mapping now looks like the one shown above (https://github.com/nextcloud/fulltextsearch_elasticsearch/pull/237#issuecomment-1725905169), and the test runs without issues. Thanks @vbier and @R0Wi for your help.