For the integration test with ElasticSearch it would be great to know when data was indexed

linagora / twake-drive

The open-source alternative to Google Drive.

GNU Affero General Public License v3.0

For the integration test with ElasticSearch it would be great to know when data was indexed #64

Open shepilov opened 1 year ago

shepilov commented 1 year ago

One way to do this is to query ElasticSearch explicitly. For that, we first need to add an ingest pipeline to Elastic that sets this date on the doc as it is indexed: https://www.elastic.co/guide/en/elasticsearch/reference/7.10/accessing-data-in-pipelines.html#accessing-ingest-metadata Then you can check your doc by id.
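A minimal sketch of the pipeline idea, assuming a `set` processor that copies `_ingest.timestamp` into the document. The pipeline id `add-indexed-at` and the field name `indexed_at` are hypothetical and not taken from the Twake Drive code base. Note that `_ingest.timestamp` records when the ingest pipeline processed the document, and that the pipeline only runs if it is referenced at index time (for example with the `pipeline` request parameter or the `index.default_pipeline` setting).

```python
# Hypothetical names, chosen for illustration only.
PIPELINE_ID = "add-indexed-at"
TIMESTAMP_FIELD = "indexed_at"


def build_pipeline_body(field: str = TIMESTAMP_FIELD) -> dict:
    """Body for PUT /_ingest/pipeline/<PIPELINE_ID>.

    The set processor copies the ingest metadata timestamp into the document.
    """
    return {
        "description": "Record when the document went through ingest",
        "processors": [
            {"set": {"field": field, "value": "{{_ingest.timestamp}}"}}
        ],
    }


def indexed_at(get_response: dict, field: str = TIMESTAMP_FIELD):
    """Extract the timestamp from a GET /<index>/_doc/<id> response.

    Returns None when the document is missing or the field was never set.
    """
    if not get_response.get("found"):
        return None
    return get_response.get("_source", {}).get(field)
```

In a test, the response of the get-by-id call would be passed to `indexed_at`, and the test would proceed once it returns a value.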

Another way is to add a listener on the Elastic bulk operations and set a timeout according to https://www.elastic.co/guide/en/elasticsearch/reference/current/near-real-time.html
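The listener idea could look roughly like the sketch below: a small tracker is notified when a bulk operation completes and reports how long a test should still wait. The class name, the `on_bulk_done` hook and the safety margin are assumptions made for illustration. The one-second value is the default `index.refresh_interval`; the result is an estimate, not a guarantee that the data is searchable.

```python
import time

# Default index.refresh_interval in Elasticsearch is one second.
DEFAULT_REFRESH_INTERVAL_S = 1.0


class BulkRefreshTracker:
    """Hypothetical listener that remembers when the last bulk call finished."""

    def __init__(self, refresh_interval_s=DEFAULT_REFRESH_INTERVAL_S,
                 margin_s=0.5, clock=time.monotonic):
        self._refresh_interval_s = refresh_interval_s
        self._margin_s = margin_s          # extra slack on top of the refresh interval
        self._clock = clock                # injectable so tests can fake time
        self._last_bulk_done = None

    def on_bulk_done(self) -> None:
        """Call this from the bulk-operation callback."""
        self._last_bulk_done = self._clock()

    def seconds_until_searchable(self) -> float:
        """Estimated remaining wait before the last bulk is visible to search."""
        if self._last_bulk_done is None:
            return 0.0
        ready_at = self._last_bulk_done + self._refresh_interval_s + self._margin_s
        return max(0.0, ready_at - self._clock())
```

A test would call `time.sleep(tracker.seconds_until_searchable())` before running its search assertions.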

RomaricMourgues commented 1 year ago

Isn’t it simpler to add a delay of 2 seconds after the bulk operation? When the data is small, like in e2e tests, 2 s is way more than enough to make sure the data is indexed and ready for retrieval.

chibenwa commented 1 year ago

In Apache James we flush the index after indexing, cf. https://github.com/apache/james-project/blob/a0fbd09941ac18ec6c22d739ecbcce8d83fea6d6/backends-common/opensearch/src/test/java/org/apache/james/backends/opensearch/DockerOpenSearch.java#L332

BTW we NEED to align the search engine on OpenSearch.
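As a rough illustration of the flush approach, the sketch below only builds the HTTP requests and does not send them. The base URL and index name are placeholders. `POST /<index>/_flush` and `POST /<index>/_refresh` exist in both Elasticsearch and OpenSearch; the refresh request is not mentioned in this thread and is included only for comparison, since refresh is the call documented to make recent operations visible to search.

```python
from urllib.parse import quote
from urllib.request import Request


def _index_url(base_url: str, index: str, suffix: str) -> str:
    """Join the cluster base URL, a URL-escaped index name and an API suffix."""
    return f"{base_url.rstrip('/')}/{quote(index, safe='')}/{suffix}"


def flush_request(base_url: str, index: str) -> Request:
    """POST /<index>/_flush, waiting for any flush that is already running."""
    return Request(_index_url(base_url, index, "_flush?wait_if_ongoing=true"),
                   method="POST")


def refresh_request(base_url: str, index: str) -> Request:
    """POST /<index>/_refresh, for comparison with the flush call."""
    return Request(_index_url(base_url, index, "_refresh"), method="POST")
```

Either request can be sent with `urllib.request.urlopen(req)` right after the bulk operation and before the test's search assertions.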

shepilov commented 1 year ago

> Isn’t it simpler to add a delay of 2 seconds after the bulk operation? When the data is small, like in e2e tests, 2 s is way more than enough to make sure the data is indexed and ready for retrieval.

@RomaricMourgues Yes, that is what I have done for now, but it's not stable: one run in 10 fails.
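One common way to make a fixed delay less flaky is to poll with a deadline instead of sleeping once. The helper below is a sketch of that pattern, not code from the project; `check` stands for whatever readiness test is used (for example a search that must return the expected document), and the default timeout and interval are arbitrary.

```python
import time


def wait_until(check, timeout_s=10.0, interval_s=0.1,
               clock=time.monotonic, sleep=time.sleep) -> bool:
    """Call check() repeatedly until it returns a truthy value or time runs out.

    Returns True as soon as check() succeeds, False once the deadline has passed.
    check() is always called at least once.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Fast runs return as soon as the data is ready, and slow runs get the full timeout rather than failing after exactly 2 seconds.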

shepilov commented 1 year ago

> In Apache James we flush the index after indexing, cf. https://github.com/apache/james-project/blob/a0fbd09941ac18ec6c22d739ecbcce8d83fea6d6/backends-common/opensearch/src/test/java/org/apache/james/backends/opensearch/DockerOpenSearch.java#L332
>
> BTW we NEED to align the search engine on OpenSearch.

Thanks @chibenwa, do you have the Helm script for OpenSearch? And is the flush API method synchronous?

chibenwa commented 1 year ago

> Thanks @chibenwa, do you have the Helm script for OpenSearch?

Ops chose to deploy DBs outside of K8s using old-fashioned Ansible deployments. Persistent volumes are a somewhat complex topic... And of course we have an Ansible deployment available.

Cc @ducnm0711

> And is the flush API method synchronous?

Yes

ducnm0711 commented 1 year ago

We can spin up the OpenSearch Helm chart in the tdrive-dev cluster if you just need to test the API response. Is OpenSearch v2.7.0 OK, or do we have another specific version requirement?

shepilov commented 1 year ago

> We can spin up the OpenSearch Helm chart in the tdrive-dev cluster if you just need to test the API response. Is OpenSearch v2.7.0 OK, or do we have another specific version requirement?

Yes, it should be fine.