ai4d-iasc / trixie

Scripts and documentation about trixie hpc

Ansys lumerical after upgrade #78

Closed: nrcfieldsa closed this issue 2 years ago

nrcfieldsa commented 2 years ago

The Ansys Lumerical software package hits an MPI error after the cluster upgrade when using the CC CVMFS StdEnv/2020 environment, with missing dependencies. The researcher reports that the job did not continue or complete.

nrcfieldsa commented 2 years ago

The Intel MPI shared libraries are not found at runtime by the cpi-impi-lcl test of lumerical/2021-r2 (v212). However, the main job seems to work once the test is removed.

/opt/lumerical/v212/mpitest/cpi-impi-lcl: error while loading shared libraries: libmpi.so.12: cannot open shared object file: No such file or directory
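
One way to confirm which shared libraries a binary cannot resolve in the currently loaded environment is to inspect the output of `ldd`. The following is a minimal sketch, not the procedure used on trixie: the helper names `missing_shared_libs` and `check_binary` are hypothetical, and it assumes a Linux node where `ldd` is on the PATH and prints unresolved entries in its usual `name => not found` form.

```python
import subprocess
from typing import List


def missing_shared_libs(ldd_output: str) -> List[str]:
    """Return the library names that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Unresolved entries look like: "<tab>libfoo.so.1 => not found"
        if "=> not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def check_binary(path: str) -> List[str]:
    """Run ldd on a binary and list its unresolved shared libraries."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=False
    )
    return missing_shared_libs(result.stdout)


if __name__ == "__main__":
    # Path taken from the error message in this issue.
    for lib in check_binary("/opt/lumerical/v212/mpitest/cpi-impi-lcl"):
        print("unresolved:", lib)
```

Running this once under each set of loaded modules would show whether the MPI runtime is resolvable in that environment.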

It was possible to run Ansys Lumerical with the older StdEnv/2018.3, gcc/7.3.0 and Intel MPI. A batch job (run_optim-v212_new.sh) completed successfully, with both the adjoint and forward solves working, converging, and exiting with status 0.

nrcfieldsa commented 2 years ago

The current version of Ansys Lumerical is 2022r1.3; it could be installed on request to retry with the latest Intel MPI library version. However, the job can run to completion without a stack trace by using the older module load lines.

Running the job with the latest environment, StdEnv/2020, on its own does not allow the job to complete; it fails with a Python error:

Traceback (most recent call last):
  File "lumoptGrating3D.py", line 86, in <module>
    opt = runAdjoint(gcScript, parOpt)
[..]
Traceback (most recent call last):
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx512/Core/scipy-stack/2022a/lib/python3.8/site-packages/matplotlib/animation.py", line 444, in __del__
AttributeError: 'snapshots' object has no attribute '_tmpdir'

nrcfieldsa commented 2 years ago

A new job file has been prepared, and an email will be sent to the reporter of this issue.