FREVA-CLINT / freva

The Free Evaluation System Framework (FreVa)

Solr connection issue on `freva` docker image #209

Open MoSHad91 opened 1 week ago

When I ran the latest version of the Freva Docker image, ingesting the dummy data into Solr failed.

$ docker run ghcr.io/freva-clint/freva:2406.0.1@sha256:0a056cb05a9153f95bca5250387665c79e30ac2d2fa7e69327fab2407a87faaa

Logs/Output:

neither jattach nor jstack in /opt/java/openjdk could be found, so no thread dumps are possible. Continuing.
Java 17 detected. Enabled workaround for SOLR-16463
Waiting up to 180 seconds to see Solr running on port 8983 [|]  
Started Solr server on port 8983 (pid=179). Happy searching!

Traceback (most recent call last):
  File "/opt/evaluation_system/ingest_dummy_data.py", line 17, in <module>
    SolrCore.load_fs(inp_data, abort_on_errors=True)
  File "/opt/evaluation_system/lib/python3.12/site-packages/evaluation_system/model/solr_core.py", line 344, in load_fs
    core_latest._del_file_pattern(input_dir)
  File "/opt/evaluation_system/lib/python3.12/site-packages/evaluation_system/model/solr_core.py", line 271, in _del_file_pattern
    self.delete(f"{prefix}:\\{file_pattern}")
  File "/opt/evaluation_system/lib/python3.12/site-packages/evaluation_system/model/solr_core.py", line 235, in delete
    self.post(dict(delete=dict(query=query)), auto_list=False)
  File "/opt/evaluation_system/lib/python3.12/site-packages/evaluation_system/model/solr_core.py", line 97, in post
    return urllib.request.urlopen(req).read()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/evaluation_system/lib/python3.12/urllib/request.py", line 215, in urlopen
    return opener.open(url, data, timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/evaluation_system/lib/python3.12/urllib/request.py", line 521, in open
    response = meth(req, response)
               ^^^^^^^^^^^^^^^^^^^
  File "/opt/evaluation_system/lib/python3.12/urllib/request.py", line 630, in http_response
    response = self.parent.error(
               ^^^^^^^^^^^^^^^^^^
  File "/opt/evaluation_system/lib/python3.12/urllib/request.py", line 559, in error
    return self._call_chain(*args)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/evaluation_system/lib/python3.12/urllib/request.py", line 492, in _call_chain
    result = func(*args)
             ^^^^^^^^^^^
  File "/opt/evaluation_system/lib/python3.12/urllib/request.py", line 639, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 503: Service Unavailable

Root cause:

The issue appears to originate from the following lines in the docker-entrypoint script: https://github.com/FREVA-CLINT/freva/blob/main/.docker/docker-entrypoint.sh#L26-L27

After these lines run, the connection to Solr is lost and the dummy-data ingestion fails with the 503 error shown in the traceback. However, when I expose the Solr port, I can verify that Solr itself is up and running.

Solution:

A potential solution could be to implement a Solr connection validator in the Dockerfile that runs before the dummy data is ingested into Solr. This would catch connectivity issues earlier in the pipeline, before the image is produced.
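
A minimal sketch of such a validator, in Python since the ingest script is Python. It only illustrates the idea and is not the project's implementation: the function name `wait_for_solr`, its parameters, and the URL in the usage note are all assumptions. It polls a URL until Solr answers with HTTP 200 and gives up after a timeout, so a temporary 503 like the one in the traceback is retried instead of aborting the ingestion.

```python
import time
import urllib.error
import urllib.request


def wait_for_solr(url, timeout=180.0, interval=2.0, fetch=None,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll ``url`` until it answers with HTTP 200.

    Hypothetical helper, not part of evaluation_system. Returns the number
    of attempts that were needed and raises TimeoutError if no attempt
    succeeded within ``timeout`` seconds. ``fetch``, ``sleep`` and ``clock``
    are injectable so the retry logic can be tested without a server.
    """
    if fetch is None:
        def fetch(target):
            # urlopen raises HTTPError for responses such as 503.
            with urllib.request.urlopen(target, timeout=5) as response:
                return response.status

    deadline = clock() + timeout
    attempts = 0
    last_problem = None
    while True:
        attempts += 1
        try:
            status = fetch(url)
            if status == 200:
                return attempts
            last_problem = f"unexpected HTTP status {status}"
        except OSError as error:
            # URLError and HTTPError are both subclasses of OSError.
            last_problem = error
        if clock() >= deadline:
            raise TimeoutError(
                f"Solr did not become ready at {url}: {last_problem}"
            )
        sleep(interval)
```

It could be called in ingest_dummy_data.py right before `SolrCore.load_fs(...)`, for example as `wait_for_solr("http://localhost:8983/solr/admin/info/system")`. The port is taken from the log above; the host and the choice of endpoint are assumptions, and a ping request against the specific core may be the better check if the 503 comes from a core that is still loading.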