hubmapconsortium / search-api

HuBMAP search service and associated pieces to create an index
https://search.api.hubmapconsortium.org
MIT License

Verify index results using separate search indices #776

Closed (yuanzhou closed 5 months ago)

yuanzhou commented 6 months ago

Once we finish the card https://github.com/hubmapconsortium/search-api/issues/756, we'll need to test and verify that the new entity-api endpoint GET /documents/<id> gets all the necessary fields into Elasticsearch via the index procedure, and that switching to this new endpoint does not cause any issues with the portal-ui rendering.

Create two pairs of indices in the AWS OpenSearch DEV-TEST cluster, kept separate from the current DEV/TEST indices.

Configure the local search-api instance against the current DEV entity-api instance to fetch the data via /documents/<id> and add documents to the above indices via the individual reindex call PUT /reindex/<id>. Then compare the results between the current DEV indices and these new indices.
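For the comparison step, a minimal sketch of a field-level diff between the same document taken from a current DEV index and from one of the new indices might look like the following. It assumes the two documents have already been retrieved as plain Python dicts (how they are fetched is left out here), and the helper names `flatten` and `compare_docs` are hypothetical, not part of search-api.

```python
from typing import Any, Dict, List


def flatten(doc: Any, prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts/lists into a {dotted.path: leaf value} mapping."""
    flat: Dict[str, Any] = {}
    if isinstance(doc, dict) and doc:
        for key, value in doc.items():
            path = f"{prefix}.{key}" if prefix else key
            flat.update(flatten(value, path))
    elif isinstance(doc, list) and doc:
        for i, value in enumerate(doc):
            flat.update(flatten(value, f"{prefix}[{i}]"))
    else:
        # Leaf value (empty dicts and lists are kept as leaves so that an
        # empty field is not confused with a missing one).
        flat[prefix] = doc
    return flat


def compare_docs(current: dict, candidate: dict) -> Dict[str, List[str]]:
    """Report field paths that are missing, extra, or changed in `candidate`.

    `current` stands for the document from an existing DEV index and
    `candidate` for the same document from one of the new indices.
    """
    cur, new = flatten(current), flatten(candidate)
    return {
        "missing_in_new": sorted(cur.keys() - new.keys()),
        "extra_in_new": sorted(new.keys() - cur.keys()),
        "changed": sorted(k for k in cur.keys() & new.keys() if cur[k] != new[k]),
    }
```

Calling `compare_docs(old_doc, new_doc)` for a handful of ids gives a quick spot check: an empty `missing_in_new` list suggests the /documents/<id> path is not dropping fields for that entity. List elements are compared by position, so a reordered array will show up as changed.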

Once the local verification is complete and everything works as expected, Zhou will deploy the search-api onto DEV and trigger a full reindex. Then we'll ask the Harvard team to confirm.

kburke commented 6 months ago

I think this QDSL query presents an interesting opportunity to compare the two indices. But for me, it currently returns {"Message":"Your request: '/_transform/_preview' is not allowed."}. I will need to investigate whether that is yet another Elasticsearch/OpenSearch difference, an AWS constraint, or something else before we can go this way. In the meantime, I will spot check and do things another way to get started...

yuanzhou commented 6 months ago

It's not supported by the AWS managed version: https://docs.aws.amazon.com/opensearch-service/latest/developerguide/supported-operations.html#version_7_10