equinor / ert

ERT - Ensemble based Reservoir Tool - is designed for running ensembles of dynamical models such as reservoir models, in order to do sensitivity analysis and data assimilation. ERT supports data assimilation using the Ensemble Smoother (ES), Ensemble Smoother with Multiple Data Assimilation (ES-MDA) and Iterative Ensemble Smoother (IES).
https://ert.readthedocs.io/en/latest/
GNU General Public License v3.0

ecl_run.py does not always find error messages #8650

Open eivindjahren opened 2 weeks ago

eivindjahren commented 2 weeks ago

ecl_run.py will sometimes fail with the following stderr message:

job ECLIPSE100 failed with: 'Process exited with status code 1'
stderr file: '/.../realization-15/pred/ECLIPSE100.stderr.129',
its contents:
Traceback (most recent call last):
  File "/.../ert/resources/forward-models/res/script/ecl100.py", line 9, in <module>
    run(config, [arg for arg in sys.argv[1:] if len(arg) > 0])
  File "/.../ert/resources/forward-models/res/script/ecl_run.py", line 537, in run
    run.runEclipse(eclrun_config=eclrun_config)
  File "/.../ert/resources/forward-models/res/script/ecl_run.py", line 412, in runEclipse
    raise err from None
  File "/../ert/resources/forward-models/res/script/ecl_run.py", line 390, in runEclipse
    self.assertECLEND()
  File "/.../ert/resources/forward-models/res/script/ecl_run.py", line 459, in assertECLEND
    raise RuntimeError(
RuntimeError: Eclipse simulation failed with:1 errors:

The message reports 1 error, but self.parseErrors() finds none, so no error text follows. The parsing is likely incorrect, but it is unclear in which way.

berland commented 19 hours ago

This requires access to the PRT files of the failed Eclipse runs.

If result.errors and len(error_list) usually match, we can log the PRT filename whenever they do not. We would then need to get access to that file afterwards.

berland commented 19 hours ago

Or possibly, whenever error_list is empty, we could log the last, say, 2000 bytes of the PRT file.
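
A minimal sketch of reading the end of a file without loading all of it, in case it helps the discussion. The helper name read_tail and its parameters are hypothetical and not part of ecl_run.py; the lenient UTF-8 decoding is also an assumption about how the PRT content should be shown in a log.

```python
import os


def read_tail(path, max_bytes=2000):
    """Return up to the last max_bytes bytes of the file at path as text.

    Hypothetical helper, not taken from ecl_run.py.
    """
    with open(path, "rb") as f:
        # Find the file size, then jump back at most max_bytes from the end.
        f.seek(0, os.SEEK_END)
        size = f.tell()
        f.seek(max(0, size - max_bytes))
        data = f.read()
    # Assumption: undecodable bytes are replaced rather than raising,
    # since the tail may start in the middle of a multi-byte character.
    return data.decode("utf-8", errors="replace")
```

Seeking from the end keeps the cost independent of the file size, which matters if PRT files are large.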

eivindjahren commented 12 minutes ago

I think both approaches are valid @berland. When len(error_list) < result.errors, we could print the name of the PRT file and its last characters.