pyiron / pyiron_atomistics

pyiron_atomistics - an integrated development environment (IDE) for atomistic simulation in computational materials science.
https://pyiron-atomistics.readthedocs.io
BSD 3-Clause "New" or "Revised" License

"finished" lammps jobs show "aborted" status in latest kernel, but not in old kernel #1376

Closed. lfzhu-phys closed this issue 4 months ago

lfzhu-phys commented 4 months ago

The jobs crash at the last stage, while collecting the data. When I run the same jobs in the old kernel, the issue does not occur. Thanks in advance for looking into it.

The error.out looks like the following:

----------------------------------------------------------------------------------------------
2024-04-09 10:48:46,590 - pyiron_log - INFO - run job: ham_langevin_3_01 id: 21848603, status: collect
Traceback (most recent call last):
  File "/cmmc/ptmp/pyironhb/mambaforge/envs/pyiron_latest/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/cmmc/ptmp/pyironhb/mambaforge/envs/pyiron_latest/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/cmmc/ptmp/pyironhb/mambaforge/envs/pyiron_latest/lib/python3.10/site-packages/pyiron_base/cli/__main__.py", line 3, in <module>
    main()
.................
File "/cmmc/ptmp/pyironhb/mambaforge/envs/pyiron_latest/lib/python3.10/site-packages/pyiron_base/storage/hdfio.py", line 231, in __getitem__
    raise ValueError(
ValueError: Unknown item: mlip_inp /cmmc/u/lfzhu/TOR-TILD/TaVCrW/LDA/solid/liquid_approach/step3/Integration_P_over_V_solid_2676K/ham_langevin_3_01.h5 /ham_langevin_3_01/input
-----------------------------------------------------------------------------------------------
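For triage, the final `ValueError` line carries the three pieces of information that matter: the missing item, the HDF5 file, and the group inside that file. The following is a minimal sketch for pulling them out of an error.out, assuming the message keeps the space-separated layout shown above and that the paths contain no spaces. The names `UnknownItem` and `parse_unknown_item` are hypothetical helpers, not part of pyiron.

```python
import re
from typing import NamedTuple, Optional


class UnknownItem(NamedTuple):
    """Fields of an 'Unknown item' error (hypothetical helper type)."""
    item: str      # name of the missing HDF5 entry
    h5_file: str   # path of the job's HDF5 file
    h5_group: str  # group inside the file that was searched


# Assumed layout: "ValueError: Unknown item: <item> <file> <group>"
_PATTERN = re.compile(
    r"^ValueError: Unknown item: (\S+) (\S+) (\S+)\s*$", re.MULTILINE
)


def parse_unknown_item(log_text: str) -> Optional[UnknownItem]:
    """Return the first 'Unknown item' error in the log, or None if absent."""
    match = _PATTERN.search(log_text)
    if match is None:
        return None
    return UnknownItem(*match.groups())


# Example with the message from the traceback above.
sample = (
    "    raise ValueError(\n"
    "ValueError: Unknown item: mlip_inp "
    "/cmmc/u/lfzhu/TOR-TILD/TaVCrW/LDA/solid/liquid_approach/step3/"
    "Integration_P_over_V_solid_2676K/ham_langevin_3_01.h5 "
    "/ham_langevin_3_01/input\n"
)
parsed = parse_unknown_item(sample)
```

Here the missing item is `mlip_inp` in the job's `input` group, which points at the mlip-related input rather than at the LAMMPS run itself.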
pmrv commented 4 months ago

Ah, yes, this is a known problem. It is already fixed in the latest version of pyiron_potentialfit, here, but I haven't gotten around to updating everything yet, sorry about that. I'll update you here once the fix is published on conda and installed on the cluster.

Also, the mlip jobs were moved from contrib to pyiron_potentialfit, so you'll have to update your older jobs as detailed here.

lfzhu-phys commented 4 months ago

@pmrv Thanks a lot. I will look into the details and update the jobs :)

pmrv commented 4 months ago

@lfzhu-phys The pyiron/latest kernel is updated on the cluster now. You will have to rerun the failed calculations, though. :|
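Rerunning can be scripted rather than done job by job. The following is a rough sketch, not a tested recipe: it assumes the project object offers `iter_jobs(...)` with filtering on the `status` column and that jobs accept `run(delete_existing_job=True)`, as in pyiron's `Project` and job classes. The function name `rerun_aborted_jobs` is hypothetical. Note that `delete_existing_job=True` discards the old output, so check the selection before running it on real data.

```python
def rerun_aborted_jobs(project):
    """Rerun every job in `project` whose status is 'aborted'.

    `project` is assumed to behave like a pyiron Project:
    iter_jobs(status=...) yields job objects, and each job has
    a `job_name` attribute and a run(delete_existing_job=...) method.
    Returns the names of the jobs that were resubmitted.
    """
    # Collect first, so the job table is not modified while iterating.
    aborted = list(project.iter_jobs(status="aborted", progress=False))
    rerun_names = []
    for job in aborted:
        # Deletes the stored results and runs the job from scratch.
        job.run(delete_existing_job=True)
        rerun_names.append(job.job_name)
    return rerun_names


# Usage (assumption: pyiron_atomistics is installed; the path is a placeholder):
# from pyiron_atomistics import Project
# pr = Project("path/to/project")
# print(rerun_aborted_jobs(pr))
```

Collecting the jobs into a list before rerunning keeps the selection fixed, at the cost of loading all aborted jobs into memory at once.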

lfzhu-phys commented 4 months ago

Great, I just did a test run and it works now. Thanks a lot.