nextcloud / fulltextsearch_elasticsearch

🔍 Use Elasticsearch to index the content of your Nextcloud
GNU Affero General Public License v3.0

File contents are not being indexed, and tests are failing #376

Open idressos opened 3 months ago

idressos commented 3 months ago

Nextcloud Version: Nextcloud Hub 8 (29.0.3)
ElasticSearch Version: 8.14.1
OS: Debian 12 x64
Authentication Provider: xpack

Installation specifications:

What is wrong?

I can see through Kibana that documents are being indexed (more than 31,000 at this point and still increasing). However, when I use the full text search page, every search returns no results in less than 2 ms, no matter what string I search for. I am searching for strings that I know are included in the documents.

Also, in Kibana, the contents field is always empty for every document entry, no matter what kind of document it is or which provider indexed it (see screenshot).

[Screenshot: Kibana document view with an empty contents field]
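
One way to confirm the empty-field observation outside Kibana is to pull a sample of documents through the `_search` API and count how many carry no text. This is a minimal sketch, not the app's own code: the helper name `count_empty_field` and the field name `content` are assumptions (check the index mapping for the real name), and the sample hits are made up for illustration.

```python
from typing import Any, Dict, List


def count_empty_field(hits: List[Dict[str, Any]], field: str = "content") -> int:
    """Count search hits whose `_source` has no usable text in `field`.

    `hits` is the list found under hits.hits in an Elasticsearch _search
    response. The field name is an assumption; adjust it to the mapping.
    """
    empty = 0
    for hit in hits:
        value = hit.get("_source", {}).get(field)
        # Treat a missing field, None, and whitespace-only strings as empty.
        if value is None or (isinstance(value, str) and not value.strip()):
            empty += 1
    return empty


# Made-up sample in the shape of a _search response's hits.hits list.
sample_hits = [
    {"_id": "files:1", "_source": {"title": "a.txt", "content": ""}},
    {"_id": "files:2", "_source": {"title": "b.pdf"}},
    {"_id": "files:3", "_source": {"title": "c.md", "content": "hello"}},
]

print(count_empty_field(sample_hits))  # 2 of the 3 sample hits are empty
```

The hits could come from a request such as `GET /index/_search?size=100`, sent with the same credentials used for the other Elasticsearch calls in this report.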

The app settings are adjusted to index everything everywhere.

The fulltextsearch:test OCC command returns the following:

.Testing your current setup:  
Creating mocked content provider. ok  
Testing mocked provider: get indexable documents. (2 items) ok  
Loading search platform. (Elasticsearch) ok  
Testing search platform. fail 
In Test.php line 304:

  Search platform (Elasticsearch) down ?  

fulltextsearch:test [--output [OUTPUT]] [-j|--json] [-d|--platform_delay PLATFORM_DELAY]

cURL command output to STATS endpoint:

Identifying information has been redacted; the rest of the response is left intact.

root@hostname ~ # curl -vvv -u username:password "https://elasticsearch.example.com:9200/index/_stats?pretty"
*   Trying [IP_ADDRESS]:9200...
* Connected to elasticsearch.example.com (IP_ADDRESS) port 9200 (#0)
* ALPN: offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
*  CAfile: /etc/ssl/certs/ca-certificates.crt
*  CApath: /etc/ssl/certs
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN: server did not agree on a protocol. Uses default.
* Server certificate:
*  subject: CN=elasticsearch.example.com
*  start date: DATE
*  expire date: DATE
*  subjectAltName: host "elasticsearch.example.com" matched cert's "elasticsearch.example.com"
*  issuer: C=COUNTRY; O=ORGANIZATION; CN=CERT_COMMON_NAME
*  SSL certificate verify ok.
* using HTTP/1.x
* Server auth using Basic with user 'username'
> GET /index/_stats?pretty HTTP/1.1
> Host: elasticsearch.example.com:9200
> Authorization: Basic AUTH_TOKEN
> User-Agent: curl/VERSION
> Accept: */*
> 
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
< HTTP/1.1 200 OK
< X-elastic-product: Elasticsearch
< content-type: application/json
< Transfer-Encoding: chunked
< 
{
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_all" : {
    "primaries" : {
      "docs" : {
        "count" : 31332,
        "deleted" : 468,
        "total_size_in_bytes" : 129553984
      },
      "shard_stats" : {
        "total_count" : 2
      },
      "store" : {
        "size_in_bytes" : 129555933,
        "total_data_set_size_in_bytes" : 129555933,
        "reserved_in_bytes" : 0
      },
      "indexing" : {
        "index_total" : 11617,
        "index_time_in_millis" : 66515,
        "index_current" : 0,
        "index_failed" : 0,
        "delete_total" : 4013,
        "delete_time_in_millis" : 1005,
        "delete_current" : 0,
        "noop_update_total" : 0,
        "is_throttled" : false,
        "throttle_time_in_millis" : 0,
        "write_load" : 1.0231181063895236E-4
      },
      "get" : {
        "total" : 0,
        "time_in_millis" : 0,
        "exists_total" : 0,
        "exists_time_in_millis" : 0,
        "missing_total" : 0,
        "missing_time_in_millis" : 0,
        "current" : 0
      },
      "search" : {
        "open_contexts" : 0,
        "query_total" : 1244,
        "query_time_in_millis" : 21451,
        "query_current" : 0,
        "fetch_total" : 165,
        "fetch_time_in_millis" : 112,
        "fetch_current" : 0,
        "scroll_total" : 0,
        "scroll_time_in_millis" : 0,
        "scroll_current" : 0,
        "suggest_total" : 0,
        "suggest_time_in_millis" : 0,
        "suggest_current" : 0
      },
      "merges" : {
        "current" : 0,
        "current_docs" : 0,
        "current_size_in_bytes" : 0,
        "total" : 42,
        "total_time_in_millis" : 46289,
        "total_docs" : 43305,
        "total_size_in_bytes" : 247974581,
        "total_stopped_time_in_millis" : 0,
        "total_throttled_time_in_millis" : 0,
        "total_auto_throttle_in_bytes" : 41943040
      },
      "refresh" : {
        "total" : 714,
        "total_time_in_millis" : 6041,
        "external_total" : 250,
        "external_total_time_in_millis" : 2614,
        "listeners" : 0
      },
      "flush" : {
        "total" : 448,
        "periodic" : 448,
        "total_time_in_millis" : 44667,
        "total_time_excluding_waiting_on_lock_in_millis" : 45081
      },
      "warmer" : {
        "current" : 0,
        "total" : 248,
        "total_time_in_millis" : 44
      },
      "query_cache" : {
        "memory_size_in_bytes" : 0,
        "total_count" : 0,
        "hit_count" : 0,
        "miss_count" : 0,
        "cache_size" : 0,
        "cache_count" : 0,
        "evictions" : 0
      },
      "fielddata" : {
        "memory_size_in_bytes" : 0,
        "evictions" : 0,
        "global_ordinals" : {
          "build_time_in_millis" : 0
        }
      },
      "completion" : {
        "size_in_bytes" : 0
      },
      "segments" : {
        "count" : 13,
        "memory_in_bytes" : 0,
        "terms_memory_in_bytes" : 0,
        "stored_fields_memory_in_bytes" : 0,
        "term_vectors_memory_in_bytes" : 0,
        "norms_memory_in_bytes" : 0,
        "points_memory_in_bytes" : 0,
        "doc_values_memory_in_bytes" : 0,
        "index_writer_memory_in_bytes" : 0,
        "version_map_memory_in_bytes" : 0,
        "fixed_bit_set_memory_in_bytes" : 0,
        "max_unsafe_auto_id_timestamp" : -1,
        "file_sizes" : { }
      },
      "translog" : {
        "operations" : 0,
        "size_in_bytes" : 110,
        "uncommitted_operations" : 0,
        "uncommitted_size_in_bytes" : 110,
        "earliest_last_modified_age" : 2462773
      },
      "request_cache" : {
        "memory_size_in_bytes" : 3552,
        "evictions" : 0,
        "hit_count" : 542,
        "miss_count" : 28
      },
      "recovery" : {
        "current_as_source" : 0,
        "current_as_target" : 0,
        "throttle_time_in_millis" : 0
      },
      "bulk" : {
        "total_operations" : 15630,
        "total_time_in_millis" : 68568,
        "total_size_in_bytes" : 27082584,
        "avg_time_in_millis" : 8,
        "avg_size_in_bytes" : 3743
      },
      "dense_vector" : {
        "value_count" : 0
      }
    },
    "total" : {
      "docs" : {
        "count" : 31332,
        "deleted" : 468,
        "total_size_in_bytes" : 129553984
      },
      "shard_stats" : {
        "total_count" : 2
      },
      "store" : {
        "size_in_bytes" : 129555933,
        "total_data_set_size_in_bytes" : 129555933,
        "reserved_in_bytes" : 0
      },
      "indexing" : {
        "index_total" : 11617,
        "index_time_in_millis" : 66515,
        "index_current" : 0,
        "index_failed" : 0,
        "delete_total" : 4013,
        "delete_time_in_millis" : 1005,
        "delete_current" : 0,
        "noop_update_total" : 0,
        "is_throttled" : false,
        "throttle_time_in_millis" : 0,
        "write_load" : 1.0231181063895236E-4
      },
      "get" : {
        "total" : 0,
        "time_in_millis" : 0,
        "exists_total" : 0,
        "exists_time_in_millis" : 0,
        "missing_total" : 0,
        "missing_time_in_millis" : 0,
        "current" : 0
      },
      "search" : {
        "open_contexts" : 0,
        "query_total" : 1244,
        "query_time_in_millis" : 21451,
        "query_current" : 0,
        "fetch_total" : 165,
        "fetch_time_in_millis" : 112,
        "fetch_current" : 0,
        "scroll_total" : 0,
        "scroll_time_in_millis" : 0,
        "scroll_current" : 0,
        "suggest_total" : 0,
        "suggest_time_in_millis" : 0,
        "suggest_current" : 0
      },
      "merges" : {
        "current" : 0,
        "current_docs" : 0,
        "current_size_in_bytes" : 0,
        "total" : 42,
        "total_time_in_millis" : 46289,
        "total_docs" : 43305,
        "total_size_in_bytes" : 247974581,
        "total_stopped_time_in_millis" : 0,
        "total_throttled_time_in_millis" : 0,
        "total_auto_throttle_in_bytes" : 41943040
      },
      "refresh" : {
        "total" : 714,
        "total_time_in_millis" : 6041,
        "external_total" : 250,
        "external_total_time_in_millis" : 2614,
        "listeners" : 0
      },
      "flush" : {
        "total" : 448,
        "periodic" : 448,
        "total_time_in_millis" : 44667,
        "total_time_excluding_waiting_on_lock_in_millis" : 45081
      },
      "warmer" : {
        "current" : 0,
        "total" : 248,
        "total_time_in_millis" : 44
      },
      "query_cache" : {
        "memory_size_in_bytes" : 0,
        "total_count" : 0,
        "hit_count" : 0,
        "miss_count" : 0,
        "cache_size" : 0,
        "cache_count" : 0,
        "evictions" : 0
      },
      "fielddata" : {
        "memory_size_in_bytes" : 0,
        "evictions" : 0,
        "global_ordinals" : {
          "build_time_in_millis" : 0
        }
      },
      "completion" : {
        "size_in_bytes" : 0
      },
      "segments" : {
        "count" : 13,
        "memory_in_bytes" : 0,
        "terms_memory_in_bytes" : 0,
        "stored_fields_memory_in_bytes" : 0,
        "term_vectors_memory_in_bytes" : 0,
        "norms_memory_in_bytes" : 0,
        "points_memory_in_bytes" : 0,
        "doc_values_memory_in_bytes" : 0,
        "index_writer_memory_in_bytes" : 0,
        "version_map_memory_in_bytes" : 0,
        "fixed_bit_set_memory_in_bytes" : 0,
        "max_unsafe_auto_id_timestamp" : -1,
        "file_sizes" : { }
      },
      "translog" : {
        "operations" : 0,
        "size_in_bytes" : 110,
        "uncommitted_operations" : 0,
        "uncommitted_size_in_bytes" : 110,
        "earliest_last_modified_age" : 2462773
      },
      "request_cache" : {
        "memory_size_in_bytes" : 3552,
        "evictions" : 0,
        "hit_count" : 542,
        "miss_count" : 28
      },
      "recovery" : {
        "current_as_source" : 0,
        "current_as_target" : 0,
        "throttle_time_in_millis" : 0
      },
      "bulk" : {
        "total_operations" : 15630,
        "total_time_in_millis" : 68568,
        "total_size_in_bytes" : 27082584,
        "avg_time_in_millis" : 8,
        "avg_size_in_bytes" : 3743
      },
      "dense_vector" : {
        "value_count" : 0
      }
    }
  }
}

Connection #0 to host elasticsearch.example.com left intact
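
Because the response is long, the handful of figures that bear on this report can be pulled out programmatically. The following is a rough sketch under stated assumptions: `summarize_stats` is a hypothetical helper, and it only reads keys that appear in the response above (`docs.count`, `indexing.index_failed`, `search.query_total`, `search.fetch_total`, `store.size_in_bytes`).

```python
import json
from typing import Any, Dict


def summarize_stats(stats: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce an index _stats response to a few headline figures.

    Hypothetical helper: reads only the `_all.total` section.
    """
    total = stats["_all"]["total"]
    docs = total["docs"]["count"]
    size = total["store"]["size_in_bytes"]
    return {
        "docs": docs,
        "index_failed": total["indexing"]["index_failed"],
        "query_total": total["search"]["query_total"],
        "fetch_total": total["search"]["fetch_total"],
        # Guard against an empty index to avoid dividing by zero.
        "avg_store_bytes_per_doc": size / docs if docs else 0.0,
    }


# Trimmed copy of the response above, keeping only the keys the helper reads.
raw = """
{
  "_all": {
    "total": {
      "docs": {"count": 31332, "deleted": 468},
      "store": {"size_in_bytes": 129555933},
      "indexing": {"index_total": 11617, "index_failed": 0},
      "search": {"query_total": 1244, "fetch_total": 165}
    }
  }
}
"""

summary = summarize_stats(json.loads(raw))
for key, value in summary.items():
    print(f"{key}: {value}")
```

With the figures from this report, the summary shows 31332 documents and zero failed index operations, which is consistent with documents being written without their text rather than being rejected by Elasticsearch.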

idressos commented 3 months ago

@Matthias-Ab