hubmapconsortium / search-api

HuBMAP search service and associated pieces to create an index
https://search.api.hubmapconsortium.org
MIT License

New reindex-all script feature with dedicated logging #781

Closed: yuanzhou closed this issue 1 week ago

yuanzhou commented 5 months ago

The PUT /reindex-all endpoint was added in the early phase of search-api. Now that we have much more data, a REST API call is no longer a suitable way to run a reindex that may take a few hours to complete. The lack of dedicated logging is another major reason to deprecate it.

Add a new reindex-all scripting feature that writes dedicated logs to a separate file. The scripting feature should also handle the following use cases:

For both cases, we'll want a statistics feature that tracks the uuids of the entities that failed.

Eventually we'll be able to deprecate the PUT /reindex-all endpoint.
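A minimal sketch of the two requirements above (a dedicated log file and tracking of failed entity uuids), not the actual script. The names `build_file_logger`, `reindex_all`, and the `reindex_one` callable are hypothetical placeholders for whatever the real script uses to index a single entity.

```python
import logging
import tempfile
from pathlib import Path


def build_file_logger(log_path):
    """Return a logger that writes only to log_path, separate from the API log."""
    logger = logging.getLogger("reindex_all_script")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep these records out of the root/application log
    # Drop handlers left over from an earlier call so records are not duplicated.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()
    handler = logging.FileHandler(log_path)
    handler.setFormatter(logging.Formatter(
        "[%(asctime)s] %(levelname)s in %(module)s:%(lineno)d: %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S"))
    logger.addHandler(handler)
    return logger


def reindex_all(uuids, reindex_one, logger):
    """Reindex every uuid and return the uuids that failed instead of aborting."""
    failed = []
    for uuid in uuids:
        try:
            reindex_one(uuid)  # hypothetical per-entity indexing call
        except Exception:
            logger.exception("Failed to reindex %s", uuid)
            failed.append(uuid)
    logger.info("Reindexed %d of %d entities; %d failed",
                len(uuids) - len(failed), len(uuids), len(failed))
    return failed


if __name__ == "__main__":
    def fake_reindex(uuid):
        # Stand-in for the real indexing call; fails for one made-up uuid.
        if uuid == "bad-uuid":
            raise ValueError("simulated failure")

    log_file = Path(tempfile.mkdtemp()) / "reindex_all.log"
    failed_uuids = reindex_all(["uuid-1", "bad-uuid", "uuid-2"],
                               fake_reindex, build_file_logger(log_file))
    print(f"Failed uuids: {failed_uuids}; log written to {log_file}")
```

Collecting failures in a list rather than stopping at the first exception lets a long run finish and leaves a single list of uuids to retry afterwards.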

yuanzhou commented 1 month ago

8/1/2024: @kburke proposed this blue/green deployment procedure, which would accommodate all the requirements with dedicated logging. It would also minimize the downtime and the incomplete data state that the current "Full Reindex" causes.

yuanzhou commented 1 month ago

8/15/2024: branched off @kburke's branch, mounted the scripts into the container, and tested on DEV with the create command.

[2024-08-15 22:06:16] INFO in hubmap_translator:286: ############# Finished executing translate_full() at 22:06:16. #############
[2024-08-15 22:06:16] INFO in hubmap_translator:290: ############# Executing translate_full() took 07:24:56. #############
[2024-08-15 22:06:16] INFO in fresh_indices:231: ############# Full index via script complete at 22:06:16 #############
[2024-08-15 22:06:16] INFO in fresh_indices:234: ############# Full index via script took 07:24:57. #############

I created a new Sample, HBM985.PKTG.328, on DEV to verify the "catch up" output, but the command generated an error:

Command to execute: 'catch-up'
Output will be in ./exec_info

[2024-08-16 12:03:20] INFO in fresh_indices:55: logger initialized with effective logging level 20.
[2024-08-16 12:03:20] INFO in fresh_indices:59: The fill strategy to be executed is create_fill.
[2024-08-16 12:03:20] INFO in fresh_indices:363: ESWriter initialized with URL https://search-hubmap-dev-test-hfnqv4ylo5ywvc42vwnyptbup4.us-east-1.es.amazonaws.com
[2024-08-16 12:03:20] INFO in fresh_indices:365: ESManager initialized with URL https://search-hubmap-dev-test-hfnqv4ylo5ywvc42vwnyptbup4.us-east-1.es.amazonaws.com
[2024-08-16 12:03:20] DEBUG in connectionpool:1021: Starting new HTTPS connection (1): search-hubmap-dev-test-hfnqv4ylo5ywvc42vwnyptbup4.us-east-1.es.amazonaws.com:443
[2024-08-16 12:03:20] DEBUG in connectionpool:474: https://search-hubmap-dev-test-hfnqv4ylo5ywvc42vwnyptbup4.us-east-1.es.amazonaws.com:443 "HEAD /fill20240815_fresh_index_hm_dev_public_entities HTTP/1.1" 200 0
[2024-08-16 12:03:20] DEBUG in connectionpool:1021: Starting new HTTPS connection (1): search-hubmap-dev-test-hfnqv4ylo5ywvc42vwnyptbup4.us-east-1.es.amazonaws.com:443
[2024-08-16 12:03:20] DEBUG in connectionpool:474: https://search-hubmap-dev-test-hfnqv4ylo5ywvc42vwnyptbup4.us-east-1.es.amazonaws.com:443 "HEAD /hm_dev_public_entities HTTP/1.1" 200 0
[2024-08-16 12:03:20] DEBUG in connectionpool:1021: Starting new HTTPS connection (1): search-hubmap-dev-test-hfnqv4ylo5ywvc42vwnyptbup4.us-east-1.es.amazonaws.com:443
[2024-08-16 12:03:20] DEBUG in connectionpool:474: https://search-hubmap-dev-test-hfnqv4ylo5ywvc42vwnyptbup4.us-east-1.es.amazonaws.com:443 "HEAD /fill20240815_fresh_index_hm_dev_consortium_entities HTTP/1.1" 200 0
[2024-08-16 12:03:20] DEBUG in connectionpool:1021: Starting new HTTPS connection (1): search-hubmap-dev-test-hfnqv4ylo5ywvc42vwnyptbup4.us-east-1.es.amazonaws.com:443
[2024-08-16 12:03:20] DEBUG in connectionpool:474: https://search-hubmap-dev-test-hfnqv4ylo5ywvc42vwnyptbup4.us-east-1.es.amazonaws.com:443 "HEAD /hm_dev_consortium_entities HTTP/1.1" 200 0
[2024-08-16 12:03:20] DEBUG in connectionpool:1021: Starting new HTTPS connection (1): search-hubmap-dev-test-hfnqv4ylo5ywvc42vwnyptbup4.us-east-1.es.amazonaws.com:443
[2024-08-16 12:03:20] DEBUG in connectionpool:474: https://search-hubmap-dev-test-hfnqv4ylo5ywvc42vwnyptbup4.us-east-1.es.amazonaws.com:443 "HEAD /fill20240815_fresh_index_hm_dev_public_portal HTTP/1.1" 200 0
[2024-08-16 12:03:20] DEBUG in connectionpool:1021: Starting new HTTPS connection (1): search-hubmap-dev-test-hfnqv4ylo5ywvc42vwnyptbup4.us-east-1.es.amazonaws.com:443
[2024-08-16 12:03:20] DEBUG in connectionpool:474: https://search-hubmap-dev-test-hfnqv4ylo5ywvc42vwnyptbup4.us-east-1.es.amazonaws.com:443 "HEAD /hm_dev_public_portal HTTP/1.1" 200 0
[2024-08-16 12:03:20] DEBUG in connectionpool:1021: Starting new HTTPS connection (1): search-hubmap-dev-test-hfnqv4ylo5ywvc42vwnyptbup4.us-east-1.es.amazonaws.com:443
[2024-08-16 12:03:20] DEBUG in connectionpool:474: https://search-hubmap-dev-test-hfnqv4ylo5ywvc42vwnyptbup4.us-east-1.es.amazonaws.com:443 "HEAD /fill20240815_fresh_index_hm_dev_consortium_portal HTTP/1.1" 200 0
[2024-08-16 12:03:20] DEBUG in connectionpool:1021: Starting new HTTPS connection (1): search-hubmap-dev-test-hfnqv4ylo5ywvc42vwnyptbup4.us-east-1.es.amazonaws.com:443
[2024-08-16 12:03:20] DEBUG in connectionpool:474: https://search-hubmap-dev-test-hfnqv4ylo5ywvc42vwnyptbup4.us-east-1.es.amazonaws.com:443 "HEAD /hm_dev_consortium_portal HTTP/1.1" 200 0
Not ready to catch-up activity which took place in the Production indices during or after creation of the new indices
But if we will be once we:
*** reindex these uuids for documents which failed while creating the new indices:
Traceback (most recent call last):
  File "/usr/src/app/scripts/fresh_indices/fresh_indices.py", line 392, in <module>
    catch_up_new_index()
  File "/usr/src/app/scripts/fresh_indices/fresh_indices.py", line 318, in catch_up_new_index
    print(f"****** {op_data['translator_failed_entity_ids']}", file=sys.stderr)
KeyError: 'translator_failed_entity_ids'
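The traceback shows `catch_up_new_index()` reading `op_data['translator_failed_entity_ids']` directly, and the op_data file posted in this comment has no such key. One possible guard is sketched below. It assumes that a missing key means the create step recorded no failures, which is not confirmed here; it could also mean the create step never wrote the key at all. The function name `report_failed_ids` is hypothetical.

```python
import sys


def report_failed_ids(op_data):
    """Print and return the failed entity ids, tolerating a missing key."""
    # Assumption: an absent key is treated the same as an empty list.
    failed_ids = op_data.get("translator_failed_entity_ids", [])
    if failed_ids:
        print(f"****** {failed_ids}", file=sys.stderr)
    else:
        print("No failed entity ids were recorded by the create step.",
              file=sys.stderr)
    return failed_ids


if __name__ == "__main__":
    # Mirrors the shape of the posted op_data file, which lacks the key.
    report_failed_ids({"index": {}, "file_time_prefix": "20240815"})
```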

The actual contents of the output file exec_info/op_data_20240815.json:

{
  "index": {
    "hm_dev_public_entities": {
      "destination": "fill20240815_fresh_index_hm_dev_public_entities",
      "max": {
        "last_modified_timestamp": 1723123892224.0,
        "created_timestamp": 1713809391616.0
      },
      "initial_doc_count": 4619
    },
    "hm_dev_consortium_entities": {
      "destination": "fill20240815_fresh_index_hm_dev_consortium_entities",
      "max": {
        "last_modified_timestamp": 1723645165568.0,
        "created_timestamp": 1723645165568.0
      },
      "initial_doc_count": 53505
    },
    "hm_dev_public_portal": {
      "destination": "fill20240815_fresh_index_hm_dev_public_portal",
      "max": {
        "last_modified_timestamp": 1723123892224.0,
        "created_timestamp": 1713809391616.0
      },
      "initial_doc_count": 4618
    },
    "hm_dev_consortium_portal": {
      "destination": "fill20240815_fresh_index_hm_dev_consortium_portal",
      "max": {
        "last_modified_timestamp": 1723645165568.0,
        "created_timestamp": 1723645165568.0
      },
      "initial_doc_count": 53285
    }
  },
  "file_time_prefix": "20240815"
}
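To make the file above easier to read, here is a small sketch that lists each source index with its destination index and converts the `last_modified_timestamp` value to a UTC datetime. It assumes the timestamps are epoch milliseconds, which their magnitude suggests but which is not stated in this thread. The function name `summarize_op_data` is hypothetical, and the sample embeds only one entry copied from the file.

```python
import json
from datetime import datetime, timezone

# One entry copied from exec_info/op_data_20240815.json as posted above.
SAMPLE = """
{
  "index": {
    "hm_dev_consortium_entities": {
      "destination": "fill20240815_fresh_index_hm_dev_consortium_entities",
      "max": {
        "last_modified_timestamp": 1723645165568.0,
        "created_timestamp": 1723645165568.0
      },
      "initial_doc_count": 53505
    }
  },
  "file_time_prefix": "20240815"
}
"""


def summarize_op_data(op_data):
    """Return one summary dict per source index in the op_data structure."""
    rows = []
    for source, info in op_data["index"].items():
        # Assumption: the value is milliseconds since the Unix epoch.
        last_modified_ms = info["max"]["last_modified_timestamp"]
        rows.append({
            "source": source,
            "destination": info["destination"],
            "initial_doc_count": info["initial_doc_count"],
            "last_modified_utc": datetime.fromtimestamp(
                last_modified_ms / 1000, tz=timezone.utc),
        })
    return rows


if __name__ == "__main__":
    for row in summarize_op_data(json.loads(SAMPLE)):
        print(f"{row['source']} -> {row['destination']}: "
              f"{row['initial_doc_count']} docs, "
              f"last modified {row['last_modified_utc'].isoformat()}")
```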