nasa / opera-sds-pcm

Observational Products for End-Users from Remote Sensing Analysis (OPERA)
Apache License 2.0

[Bug]: SCIFLO PGE errors with no such index [grq]', grq, index_or_alias when it's first run on a fresh cluster #953

Open philipjyoon opened 1 month ago

philipjyoon commented 1 month ago

Checked for duplicates

Yes - I've already checked

Describe the bug

After SCIFLO runs, postprocess_evaluator.py expects at least one index named "grq.. something" to exist. On a fresh cluster no such index exists, so the first SCIFLO run always fails.

The second run succeeds because the first failure creates a grq_xxx_triage_job index in ES, so the condition is satisfied by the time the second run checks it.

However, there is no reason for the system to require that a grq* index already exist before it can store a product. If none exists, it could create the index first and then store the product instead of failing.

This error happened on DSWx-S1 product generation but I suspect it applies to most other products.

Exception Type: <class 'RuntimeError'>
Exception Value: Post processor failure: NotFoundError(404, 'index_not_found_exception', 'no such index [grq]', grq, index_or_alias)
Traceback (most recent call last):
  File "/home/ops/verdi/ops/chimera/chimera/postprocess_evaluator.py", line 115, in process
    cls_object.run(
  File "/home/ops/verdi/ops/chimera/chimera/postprocess_functions.py", line 46, in run
    self._job_result.update(getattr(self, func)())
  File "/home/ops/verdi/ops/opera-pcm/opera_chimera/postprocess_functions.py", line 28, in update_product_accountability
    self.accountability.set_products(self._job_result)
  File "/home/ops/verdi/ops/opera-pcm/opera_chimera/accountability.py", line 235, in set_products
    self.update_product_met_json(job_result=job_results)
  File "/home/ops/verdi/ops/opera-pcm/opera_chimera/accountability.py", line 209, in update_product_met_json
    old_accountability = self.flatten_and_merge_accountability()
  File "/home/ops/verdi/ops/opera-pcm/opera_chimera/accountability.py", line 138, in flatten_and_merge_accountability
    entries = self.get_entries()
  File "/home/ops/verdi/ops/opera-pcm/opera_chimera/accountability.py", line 131, in get_entries
    results = grq_es.query(body={
  File "/home/ops/verdi/ops/hysds_commons-1.0.16/hysds_commons/elasticsearch_utils.py", line 217, in query
    data = self.es.search(**kwargs)
  File "/home/ops/verdi/lib/python3.9/site-packages/elasticsearch/client/utils.py", line 168, in _wrapped
    return func(*args, params=params, headers=headers, **kwargs)
  File "/home/ops/verdi/lib/python3.9/site-packages/elasticsearch/client/__init__.py", line 1670, in search
    return self.transport.perform_request(
  File "/home/ops/verdi/lib/python3.9/site-packages/elasticsearch/transport.py", line 415, in perform_request
    raise e
  File "/home/ops/verdi/lib/python3.9/site-packages/elasticsearch/transport.py", line 381, in perform_request
    status, headers_response, data = connection.perform_request(
  File "/home/ops/verdi/lib/python3.9/site-packages/elasticsearch/connection/http_urllib3.py", line 277, in perform_request
    self._raise_error(response.status, raw_data)
  File "/home/ops/verdi/lib/python3.9/site-packages/elasticsearch/connection/base.py", line 330, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(
elasticsearch.exceptions.NotFoundError: NotFoundError(404, 'index_not_found_exception', 'no such index [grq]', grq, index_or_alias)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ops/verdi/ops/sciflo-1.3.6/sciflo/grid/workUnit.py", line 130, in run
    result = self._run()
  File "/home/ops/verdi/ops/sciflo-1.3.6/sciflo/grid/workUnit.py", line 172, in _run
    return func(*funcArgs)
  File "/tmp/scifloWork-ops/sciflowuid-56344721-08009600-9472152024-05abd61610f02a48372505971649982c/post_processor.py", line 41, in post_process
    output_context = post_processor.process()
  File "/home/ops/verdi/ops/chimera/chimera/postprocess_evaluator.py", line 126, in process
    raise RuntimeError("Post processor failure: {}".format(e))
RuntimeError: Post processor failure: NotFoundError(404, 'index_not_found_exception', 'no such index [grq]', grq, index_or_alias)
