Closed equinor-ruaj closed 9 months ago
Test: run the same Drogon_Design case 2 times with Komodo bleeding and 2 times with Komodo stable and check consistency. Expected result: all equal.
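A minimal sketch of what "all equal" could mean in practice, assuming each upload can be reduced to a set of object names (for example relative file paths). The function `compare_runs` and the run and object names in the usage example are hypothetical and are not part of fmu-sumo.

```python
def compare_runs(runs):
    """Given {run name: set of object names}, return {run name: names missing from that run}.

    A name counts as missing from a run if at least one other run has it.
    """
    all_names = set().union(*runs.values())
    return {name: sorted(all_names - objs) for name, objs in runs.items()}


# Hypothetical example data: two runs of the same case.
example = {
    "stable_01": {"maps/a.gri", "polygons/b.csv"},
    "stable_02": {"maps/a.gri"},
}
missing = compare_runs(example)
consistent = all(not names for names in missing.values())
```

If `consistent` is false, `missing` shows which run lacks which objects, which corresponds to the 'yellow' circles described in this issue.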
**SUMMARY:** There is an issue in STABLE that loses some uploads due to an httpx 'timed out' error. This issue is not found in BLEEDING.

Bleeding had one other error that should be looked into:

WARNING:msal_extensions.cache_lock:Process 30540 failed to create lock file

WARNING:py.warnings:/prog/res/komodo/bleeding-py38-rhel7/root/lib/python3.8/site-packages/fmu/sumo/uploader/scripts/sumo_upload.py:119: UserWarning: Problem related to Sumo upload: [Errno 11] Resource temporarily unavailable

**END SUMMARY**
Komodo STABLE:
- fmu-sumo 0.5.1
- fmu-sumo-sim2sumo 0.0.0
- sumo-wrapper-python 0.4.0

Komodo BLEEDING:
- fmu-sumo 1.0.1
- fmu-sumo-sim2sumo 0.1.2.dev1+g3d81a7d
- fmu-sumo-uploader 1.0.2.dev2+g0c326e2
- sumo-wrapper-python 1.0.2.dev3+g928ba7c
STABLE -> PROD: Uploads differ. Around 20 'yellow' circles (missing objects) in both uploads. Each of these has the following in its log files in /scratch:

real-3: sumo_upload.py:119: UserWarning: Problem related to Sumo upload: timed out

real-7: sumo_upload.py:119: UserWarning: Problem related to Sumo upload: timed out
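Finding these warnings by hand across many realizations is tedious, so here is a rough sketch of how the log files could be scanned and tallied per realization. It matches only the warning text quoted above. The log file layout (`realization-*/**/*.log`) and the function name `tally_upload_warnings` are assumptions, so adjust the glob to wherever the logs actually end up.

```python
import re
from collections import Counter
from pathlib import Path

# Warning text as quoted in this issue; everything after the colon is the reason.
WARNING_RE = re.compile(r"Problem related to Sumo upload: (?P<reason>.+)")
# Realization number taken from a path component such as realization-3.
REAL_RE = re.compile(r"realization-(\d+)")


def tally_upload_warnings(case_root, log_glob="realization-*/**/*.log"):
    """Return {realization number: Counter of warning reasons} for one case directory."""
    case_root = Path(case_root)
    result = {}
    for log_file in sorted(case_root.glob(log_glob)):
        # Search only the path below case_root so the root name cannot interfere.
        match = REAL_RE.search(str(log_file.relative_to(case_root)))
        if match is None:
            continue
        real = int(match.group(1))
        text = log_file.read_text(errors="replace")
        reasons = [m.group("reason").strip() for m in WARNING_RE.finditer(text)]
        if reasons:
            result.setdefault(real, Counter()).update(reasons)
    return result
```

Calling `tally_upload_warnings` on a case directory under /scratch would then give one counter per affected realization, which makes it easy to see whether 'timed out' is the only failure reason in a run.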
BLEEDING -> DEV: The first 2 uploads had no 'yellow' circles at all, and the exact same number of objects. The third upload had 2 'yellow' circles:

real-29: Oct 26 13:09: Metadata: [502] Bad Gateway: RADIX DEV DEPLOYMENT AT THIS TIME.

real-132: Oct 26 13:35: WARNING:msal_extensions.cache_lock:Process 30540 failed to create lock file

WARNING:py.warnings:/prog/res/komodo/bleeding-py38-rhel7/root/lib/python3.8/site-packages/fmu/sumo/uploader/scripts/sumo_upload.py:119: UserWarning: Problem related to Sumo upload: [Errno 11] Resource temporarily unavailable
4th upload: no yellow, bleeding_skip_sim2sumo_04
BLEEDING -> PROD: /scratch/fmu/rowh/bleeding_skip_sim2sumo_06 (1800) 1 run: No yellow
STABLE -> DEV: /scratch/fmu/rowh/stable_skip_sim2sumo_07/realization-
MODIFIED STABLE -> PROD to get stacktrace of 'timed out' error: /scratch/fmu/rowh/stable_skip_sim2sumo_08/realization-
The above exception was the direct cause of the following exception:
    Traceback (most recent call last):
      File "/prog/res/komodo/2023.10.01-py38-rhel7/root/lib/python3.8/site-packages/httpx/_transports/default.py", line 60, in map_httpcore_exceptions
        yield
      File "/prog/res/komodo/2023.10.01-py38-rhel7/root/lib/python3.8/site-packages/httpx/_transports/default.py", line 218, in handle_request
        resp = self._pool.handle_request(req)
      File "/prog/res/komodo/2023.10.01-py38-rhel7/root/lib/python3.8/site-packages/httpcore/_sync/connection_pool.py", line 262, in handle_request
        raise exc
      File "/prog/res/komodo/2023.10.01-py38-rhel7/root/lib/python3.8/site-packages/httpcore/_sync/connection_pool.py", line 245, in handle_request
        response = connection.handle_request(request)
      File "/prog/res/komodo/2023.10.01-py38-rhel7/root/lib/python3.8/site-packages/httpcore/_sync/http_proxy.py", line 271, in handle_request
        connect_response = self._connection.handle_request(
      File "/prog/res/komodo/2023.10.01-py38-rhel7/root/lib/python3.8/site-packages/httpcore/_sync/connection.py", line 96, in handle_request
        return self._connection.handle_request(request)
      File "/prog/res/komodo/2023.10.01-py38-rhel7/root/lib/python3.8/site-packages/httpcore/_sync/http11.py", line 121, in handle_request
        raise exc
      File "/prog/res/komodo/2023.10.01-py38-rhel7/root/lib/python3.8/site-packages/httpcore/_sync/http11.py", line 99, in handle_request
        ) = self._receive_response_headers(**kwargs)
      File "/prog/res/komodo/2023.10.01-py38-rhel7/root/lib/python3.8/site-packages/httpcore/_sync/http11.py", line 164, in _receive_response_headers
        event = self._receive_event(timeout=timeout)
      File "/prog/res/komodo/2023.10.01-py38-rhel7/root/lib/python3.8/site-packages/httpcore/_sync/http11.py", line 200, in _receive_event
        data = self._network_stream.read(
      File "/prog/res/komodo/2023.10.01-py38-rhel7/root/lib/python3.8/site-packages/httpcore/_backends/sync.py", line 28, in read
        return self._sock.recv(max_bytes)
      File "/opt/rh/rh-python38/root/usr/lib64/python3.8/contextlib.py", line 131, in __exit__
        self.gen.throw(type, value, traceback)
      File "/prog/res/komodo/2023.10.01-py38-rhel7/root/lib/python3.8/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
        raise to_exc(exc) from exc
    httpcore.ReadTimeout: timed out
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/private/rowh/thEvnStable/root/bin/sumo_upload", line 8, in
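The traceback shows the request failing with a read timeout while waiting for response headers. One common way to make an upload more tolerant of such transient failures is to retry with exponential backoff. The sketch below is only an illustration of that idea using the standard library. It is not how fmu-sumo-uploader or sumo-wrapper-python handle this in either Komodo release, and `call_with_retries` with all its parameters is a hypothetical helper.

```python
import time


def call_with_retries(func, retryable=(TimeoutError,), attempts=3,
                      base_delay=1.0, sleep=time.sleep):
    """Call func(); on a retryable exception wait and try again.

    The wait doubles after each failed attempt (base_delay, 2*base_delay, ...).
    The last failure is re-raised once `attempts` calls have been made.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retryable:
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

With httpx, the `retryable` tuple could presumably be `(httpx.ReadTimeout,)`, since that is the exception httpx raises when it maps `httpcore.ReadTimeout`. Whether retrying an upload is safe depends on the request being idempotent on the Sumo side, which this issue does not establish.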
bleeding_mon_a (DEV), no deployments in Radix dev today: 3 realizations with 'yellow' dots:
real-3: #97: Oct 30 14:14 Metadata: [404] Not Found: /scratch/fmu/rowh/bleeding_mon_a/realization-3/iter-0/share/results/polygons/topvolon--gl_faultlines_extract_postprocess.csv

But both this file and its companion yml file exist on disk. Is the 404 perhaps a response from sumo-core?
real-37: #96: Oct 30 14:17 Metadata: [502] Bad Gateway: Filepath: /scratch/fmu/rowh/bleeding_mon_a/realization-37/iter-0/share/results/maps/therys--phit_average.gri
real-88: #100: Oct 30 14:20 Metadata: [502] Bad Gateway: /scratch/fmu/rowh/bleeding_mon_a/realization-88/iter-0/share/observations/seismic/seismic--relai_depth--20200701_20190701.segy
Sample case with differences across iterations: link