Open behnle opened 4 months ago
Totally forgot to mention: Docker is version 25.0.3 on Rocky 9.3:
[root@host nomad]# docker info
Client: Docker Engine - Community
Version: 25.0.3
Context: default
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: v0.12.1
Path: /usr/libexec/docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: v2.24.5
Path: /usr/libexec/docker/cli-plugins/docker-compose
Server:
Containers: 12
Running: 7
Paused: 0
Stopped: 5
Images: 14
Server Version: 25.0.3
Storage Driver: overlay2
Backing Filesystem: xfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: ae07eda36dd25f8a1b98dfbf587313b99c0190bb
runc version: v1.1.12-0-g51d5e94
init version: de40ad0
Security Options:
seccomp
Profile: builtin
cgroupns
Kernel Version: 5.14.0-362.18.1.el9_3.x86_64
Operating System: Rocky Linux 9.3 (Blue Onyx)
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 15.56GiB
Name: u-030-s007
ID: 17341a6c-20f5-4a76-a0fd-8cf7ecddaf09
Docker Root Dir: /dockerdata/volumes
Debug Mode: false
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Hi @behnle, thanks for reporting. Can you confirm you are talking about the "Electronic structure code input and output files"
example upload? I have just run this upload in our beta deployment, and the entry with the mainfile Cu2Se/2/aims.out
took significantly longer to parse than the others. When I eventually returned to that upload, the parsing had succeeded and I could visualize the overview cards. @ladinesa, any idea what could be going on here?
I will take a closer look, but I suspect the parsing timed out. However, the error seems to be inconsistent with a timed-out entry. It could also be that the archive is larger than permitted, which would cause trouble with the archive reader.
Exactly, that's the one. Let me know if you need additional information for tracking down the problem.
Processing seemed to work fine; there were no errors in the processing log. The problem occurred when I went to the overview page of the sample, and it happens immediately.
There are only two parser warnings
"root":{
"event":string"Energy not reported for an calculation that is part of a geometry optimization"
"proc":string"Entry"
"process":string"process_entry"
"process_worker_id":string"vO0M_ZzSS-KIdrnfYzbUdw"
"parser":string"parsers/fhi-aims"
"normalizer":string"SimulationWorkflowNormalizer"
"step":string"SimulationWorkflowNormalizer"
"logger":string"nomad.processing"
"timestamp":string"2024-02-07 09:18.51"
"level":string"WARNING"
}
but no errors. In case the issue is related to a timeout, which one would it be and where can I adjust it?
You can modify the settings by specifying them in the nomad.yaml file. You can have a look at the docs here. For a complete list of config keys, I suggest you look at the code under nomad/config/models.py. For example, you can adjust services.api_timeout or celery.timeout.
I had already skimmed the list of config options and had set services:api_timeout to 6000 seconds:
services:
  # api_host: 'localhost'
  api_host: <redacted>
  api_port: 443
  api_base_path: '/nomad-oasis'
  api_timeout: 6000
  https: True
  https_upload: True
  admin_user_id: <redacted> # TODO replace
  # aitoolkit_enabled: True
  console_log_level: 10
  upload_limit: 100000
Did not change any celery settings, though.
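As a side note on the dotted names used above: services.api_timeout and celery.timeout refer to nested keys in nomad.yaml (a top-level section, then a key inside it). The following is a minimal sketch of that mapping. The `config` dict and the `lookup` helper are hypothetical illustrations, not part of NOMAD; the values are copied from the posted config with redacted entries left out.

```python
from typing import Any, Optional

# Hand-written mirror of the posted `services` block (redacted values omitted).
# This is an illustration only; NOMAD itself reads these values from nomad.yaml.
config = {
    "services": {
        "api_port": 443,
        "api_base_path": "/nomad-oasis",
        "api_timeout": 6000,
        "upload_limit": 100000,
    },
}


def lookup(cfg: dict, dotted_key: str) -> Optional[Any]:
    """Return the value at a dotted key such as 'services.api_timeout'.

    Returns None when any part of the path is missing, i.e. the key is
    not set in this file and the application default would apply.
    """
    node: Any = cfg
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


for key in ("services.api_timeout", "celery.timeout"):
    value = lookup(config, key)
    print(key, "=", value if value is not None else "not set in this file")
```

With the config as posted, only the services section is present, so the celery timeout falls back to whatever default the installed version uses.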
Can you please send me the image path? I am having trouble finding it.
Might be related to this: https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1861
Ah yes, I had completely forgotten about this. Thanks, Lauri.
@ladinesa What do you mean by "image path"? The docker image?
[root@host nomad]# docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
gitlab-registry.mpcdf.mpg.de/nomad-lab/nomad-fair latest dae1849135eb 6 weeks ago 1.81GB
...
@lauri-codes Yes, might be related to my issue. Would it help to pull a new docker image if available?
@behnle: Sorry, I missed your comment. You can try updating the nomad-fair:latest docker image. If the problem persists, we will need to release a new image with the fix.
@lauri-codes Thanks for the heads-up. I recently pulled the "latest" image:
[root@host nomad]# docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
nginx <none> e4720093a3c1 6 weeks ago 187MB
nginx latest 92b11f67642b 6 weeks ago 187MB
gitlab-registry.mpcdf.mpg.de/nomad-lab/nomad-fair latest 279c097945fe 7 weeks ago 1.88GB
gitlab-registry.mpcdf.mpg.de/nomad-lab/nomad-fair <none> dae1849135eb 3 months ago 1.81GB
python latest e7177b0afd0e 3 months ago 1.02GB
gitlab-registry.mpcdf.mpg.de/nomad-lab/nomad-remote-tools-hub/jupyterlab latest f1b5e187ee1e 4 months ago 6.39GB
gitlab-registry.mpcdf.mpg.de/nomad-lab/nomad-remote-tools-hub/jupyterlab prod f1b5e187ee1e 4 months ago 6.39GB
gitlab-registry.mpcdf.mpg.de/nomad-lab/nomad-remote-tools-hub/nexus-webtop latest 548857bf45d9 4 months ago 7.43GB
gitlab-registry.mpcdf.mpg.de/nomad-lab/nomad-remote-tools-hub/apmtools-webtop latest 125e01c59a73 5 months ago 5.29GB
gitlab-registry.mpcdf.mpg.de/nomad-lab/nomad-remote-tools-hub/webtop latest 603c690b7911 5 months ago 1.65GB
gitlab-registry.mpcdf.mpg.de/nomad-lab/nomad-remote-tools-hub/ellips-jupyter latest 4e3e12da664c 5 months ago 6.22GB
gitlab-registry.mpcdf.mpg.de/nomad-lab/nomad-remote-tools-hub/xps-jupyter latest 5bf19c880ab6 5 months ago 5.65GB
nginx <none> a8758716bb6a 5 months ago 187MB
jupyter/datascience-notebook latest f78a42f3bc9a 5 months ago 5.92GB
gitlab-registry.mpcdf.mpg.de/nomad-lab/nomad-fair v1.2.1 cc8dd7c53b3c 6 months ago 1.67GB
rabbitmq 3.11.5 3ddcc140fe5c 15 months ago 228MB
mongo 5.0.6 532c84506200 24 months ago 699MB
docker.elastic.co/elasticsearch/elasticsearch 7.17.1 515ab4fba870 2 years ago 618MB
With this release, every other attempt on the original sample data succeeds, but some reprocessing runs fail with
"errors":string"process failed due to worker lost: Worker exited prematurely: signal 7 (SIGBUS) Job: 19."
"event":string"process failed"
"proc":string"Entry"
"process":string"process_entry"
"process_worker_id":string"N92Rn87uS9usK6o6O4e9eA"
"parser":string"parsers/exciting"
"logger":string"nomad.processing"
"exception":string"Traceback (most recent call last): File "/usr/local/lib/python3.9/site-packages/billiard/pool.py", line 1265, in mark_as_worker_lost raise WorkerLostError( billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 7 (SIGBUS) Job: 19."
"timestamp":string"2024-03-28 14:12.39"
"level":string"ERROR"
}
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/billiard/pool.py", line 1265, in mark_as_worker_lost
raise WorkerLostError(
billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 7 (SIGBUS) Job: 19.
and the docker-compose log contains error messages like this one:
nomad_oasis_worker | 2024-03-28T13:12:39.988471515Z ERROR nomad.processing 2024-03-28T13:12:39 detected WorkerLostError
nomad_oasis_worker | 2024-03-28T13:12:39.988485299Z - exception: Traceback (most recent call last):
nomad_oasis_worker | 2024-03-28T13:12:39.988487333Z File "/usr/local/lib/python3.9/site-packages/billiard/pool.py", line 1265, in mark_as_worker_lost
nomad_oasis_worker | 2024-03-28T13:12:39.988489440Z raise WorkerLostError(
nomad_oasis_worker | 2024-03-28T13:12:39.988491171Z billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 7 (SIGBUS) Job: 19.
nomad_oasis_worker | 2024-03-28T13:12:39.988492930Z - exception_hash: PyYAluNDeMcTSlMqcpc84b96E4SI
nomad_oasis_worker | 2024-03-28T13:12:39.988494543Z - nomad.commit:
nomad_oasis_worker | 2024-03-28T13:12:39.988505434Z - nomad.deployment: oasis
nomad_oasis_worker | 2024-03-28T13:12:39.988511090Z - nomad.service: unknown nomad service
nomad_oasis_worker | 2024-03-28T13:12:39.988513031Z - nomad.version: 1.2.2.dev357+g15b7cd2e1
nomad_oasis_worker | 2024-03-28T13:12:39.993245423Z ERROR nomad.processing 2024-03-28T13:12:39 process failed
nomad_oasis_worker | 2024-03-28T13:12:39.993288548Z - exception: Traceback (most recent call last):
nomad_oasis_worker | 2024-03-28T13:12:39.993291858Z File "/usr/local/lib/python3.9/site-packages/billiard/pool.py", line 1265, in mark_as_worker_lost
nomad_oasis_worker | 2024-03-28T13:12:39.993294068Z raise WorkerLostError(
nomad_oasis_worker | 2024-03-28T13:12:39.993295886Z billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 7 (SIGBUS) Job: 19.
nomad_oasis_worker | 2024-03-28T13:12:39.993297906Z - exception_hash: PyYAluNDeMcTSlMqcpc84b96E4SI
nomad_oasis_worker | 2024-03-28T13:12:39.993300711Z - nomad.commit:
nomad_oasis_worker | 2024-03-28T13:12:39.993302447Z - nomad.deployment: oasis
nomad_oasis_worker | 2024-03-28T13:12:39.993304235Z - nomad.entry_id: pwcCETIYQy1JlgLP5S_E9GHJT7-z
nomad_oasis_worker | 2024-03-28T13:12:39.993312033Z - nomad.mainfile: Sn2Se/1/INFO_GS.OUT
nomad_oasis_worker | 2024-03-28T13:12:39.993332598Z - nomad.processing.errors: process failed due to worker lost: Worker exited prematurely: signal 7 (SIGBUS) Job: 19.
nomad_oasis_worker | 2024-03-28T13:12:39.993344413Z - nomad.processing.logger: nomad.processing
nomad_oasis_worker | 2024-03-28T13:12:39.993347787Z - nomad.processing.parser: parsers/exciting
nomad_oasis_worker | 2024-03-28T13:12:39.993351339Z - nomad.processing.proc: Entry
nomad_oasis_worker | 2024-03-28T13:12:39.993354840Z - nomad.processing.process: process_entry
nomad_oasis_worker | 2024-03-28T13:12:39.993363288Z - nomad.processing.process_status: RUNNING
nomad_oasis_worker | 2024-03-28T13:12:39.993367814Z - nomad.processing.process_worker_id: N92Rn87uS9usK6o6O4e9eA
nomad_oasis_worker | 2024-03-28T13:12:39.993371495Z - nomad.service: unknown nomad service
nomad_oasis_worker | 2024-03-28T13:12:39.993374973Z - nomad.upload_id: nivYbhQbRRKyDaAQ1Yor3g
nomad_oasis_worker | 2024-03-28T13:12:39.993380021Z - nomad.version: 1.2.2.dev357+g15b7cd2e1
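For triage, the structured fields in worker log lines like the ones above can be pulled out with a short script. This is a minimal sketch, assuming the line layout shown in this excerpt (container name, `|`, timestamp, `- key: value`). The names `parse_fields` and `extract_signal` are hypothetical helpers, not part of NOMAD or docker-compose.

```python
import re

# A few lines copied verbatim from the docker-compose log above.
LOG_EXCERPT = """\
nomad_oasis_worker | 2024-03-28T13:12:39.993304235Z - nomad.entry_id: pwcCETIYQy1JlgLP5S_E9GHJT7-z
nomad_oasis_worker | 2024-03-28T13:12:39.993312033Z - nomad.mainfile: Sn2Se/1/INFO_GS.OUT
nomad_oasis_worker | 2024-03-28T13:12:39.993332598Z - nomad.processing.errors: process failed due to worker lost: Worker exited prematurely: signal 7 (SIGBUS) Job: 19.
nomad_oasis_worker | 2024-03-28T13:12:39.993347787Z - nomad.processing.parser: parsers/exciting
"""

# "<container> | <timestamp> - <key>: <value>"
FIELD = re.compile(r"^\S+\s+\|\s+\S+\s+- (?P<key>[\w.]+): (?P<value>.*)$")
# "signal 7 (SIGBUS)" inside the worker-lost message
SIGNAL = re.compile(r"signal (\d+) \((\w+)\)")


def parse_fields(text):
    """Collect 'key: value' pairs from log lines; non-matching lines are skipped."""
    fields = {}
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            fields[match.group("key")] = match.group("value").strip()
    return fields


def extract_signal(error_message):
    """Return (signal number, signal name) from a worker-lost message, or None."""
    match = SIGNAL.search(error_message)
    return (int(match.group(1)), match.group(2)) if match else None


fields = parse_fields(LOG_EXCERPT)
print(fields["nomad.mainfile"], fields["nomad.processing.parser"])
print(extract_signal(fields["nomad.processing.errors"]))
```

Grouping failed entries by mainfile, parser, and signal this way makes it easier to see whether the crashes are tied to one parser or occur across several.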
In the journal of the server, I found the following potentially related error message:
Mar 28 14:12:38 u-030-s007 systemd-coredump[280150]: [🡕] Process 274915 (python) of user 1000 dumped core.
Module /usr/local/lib/python3.9/site-packages/quippy_ase.libs/libopenblasp-r0-dcce3d0b.3.20.so without build-id.
Module /usr/local/lib/python3.9/site-packages/quippy_ase.libs/libopenblasp-r0-dcce3d0b.3.20.so
Module /usr/local/lib/python3.9/site-packages/quippy/_quippy.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/quippy/_quippy.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/MDAnalysis.libs/libgomp-a34b3233.so.1.0.0 without build-id.
Module /usr/local/lib/python3.9/site-packages/MDAnalysis.libs/libgomp-a34b3233.so.1.0.0
Module /usr/local/lib/python3.9/site-packages/scipy/stats/mvn.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/stats/mvn.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy/stats/statlib.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/stats/statlib.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy/special/cython_special.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/special/cython_special.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy/interpolate/dfitpack.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/interpolate/dfitpack.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy/interpolate/_fitpack.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/interpolate/_fitpack.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy/integrate/lsoda.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/integrate/lsoda.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy/integrate/_dop.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/integrate/_dop.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy/integrate/vode.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/integrate/vode.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy/integrate/_quadpack.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/integrate/_quadpack.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy/integrate/_odepack.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/integrate/_odepack.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy/linalg/_interpolative.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/linalg/_interpolative.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy/optimize/__nnls.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/optimize/__nnls.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy/optimize/_minpack.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/optimize/_minpack.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy/optimize/_slsqp.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/optimize/_slsqp.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy/optimize/_cobyla.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/optimize/_cobyla.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy/optimize/_lbfgsb.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/optimize/_lbfgsb.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy/optimize/_trlib/_trlib.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/optimize/_trlib/_trlib.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy/sparse/linalg/eigen/arpack/_arpack.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/sparse/linalg/eigen/arpack/_arpack.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy/sparse/linalg/dsolve/_superlu.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/sparse/linalg/dsolve/_superlu.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy/sparse/linalg/isolve/_iterative.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/sparse/linalg/isolve/_iterative.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy/optimize/minpack2.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/optimize/minpack2.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy/spatial/qhull.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/spatial/qhull.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/h5py/_selector.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py/_selector.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/h5py/h5l.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py/h5l.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/h5py/h5o.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py/h5o.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/h5py/h5pl.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py/h5pl.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/h5py/h5fd.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py/h5fd.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/h5py/h5i.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py/h5i.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/h5py/h5g.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py/h5g.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/h5py/h5f.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py/h5f.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/h5py/h5ds.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py/h5ds.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/h5py/h5d.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py/h5d.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/h5py/_proxy.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py/_proxy.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/h5py/h5a.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py/h5a.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/h5py/h5z.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py/h5z.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/h5py/h5ac.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py/h5ac.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/h5py/utils.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py/utils.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/h5py/h5s.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py/h5s.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/cluster/_hierarchical_fast.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/cluster/_hierarchical_fast.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/utils/_fast_dict.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/utils/_fast_dict.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/h5py/h5p.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py/h5p.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/cluster/_k_means_elkan.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/cluster/_k_means_elkan.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/cluster/_k_means_lloyd.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/cluster/_k_means_lloyd.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/cluster/_k_means_minibatch.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/cluster/_k_means_minibatch.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/cluster/_k_means_common.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/cluster/_k_means_common.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/h5py/h5t.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py/h5t.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/tree/_utils.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/tree/_utils.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/tree/_tree.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/tree/_tree.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/tree/_splitter.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/tree/_splitter.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/neighbors/_quad_tree.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/neighbors/_quad_tree.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/h5py/h5r.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py/h5r.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/tree/_criterion.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/tree/_criterion.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/_isotonic.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/_isotonic.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/svm/_libsvm_sparse.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/svm/_libsvm_sparse.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/svm/_libsvm.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/svm/_libsvm.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/h5py/_conv.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py/_conv.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/linear_model/_sgd_fast.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/linear_model/_sgd_fast.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/linear_model/_cd_fast.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/linear_model/_cd_fast.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/utils/_cython_blas.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/utils/_cython_blas.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/utils/arrayfuncs.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/utils/arrayfuncs.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/h5py/_objects.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py/_objects.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/neighbors/_kd_tree.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/neighbors/_kd_tree.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/neighbors/_ball_tree.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/neighbors/_ball_tree.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/metrics/_pairwise_fast.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/metrics/_pairwise_fast.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/h5py/defs.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py/defs.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/metrics/_dist_metrics.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/metrics/_dist_metrics.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/preprocessing/_csr_polynomial_expansion.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/preprocessing/_csr_polynomial_expansion.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/utils/sparsefuncs_fast.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/utils/sparsefuncs_fast.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/h5py/h5.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py/h5.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/h5py.libs/libaec-9c9e97eb.so.0.0.10 without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py.libs/libaec-9c9e97eb.so.0.0.10
Module /usr/local/lib/python3.9/site-packages/h5py.libs/libsz-090daab4.so.2.0.1 without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py.libs/libsz-090daab4.so.2.0.1
Module /usr/local/lib/python3.9/site-packages/sklearn/utils/_readonly_array_wrapper.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/utils/_readonly_array_wrapper.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/h5py.libs/libhdf5_hl-84bfe2a0.so.200.0.1 without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py.libs/libhdf5_hl-84bfe2a0.so.200.0.1
Module /usr/local/lib/python3.9/site-packages/h5py.libs/libhdf5-346dbfc8.so.200.1.0 without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py.libs/libhdf5-346dbfc8.so.200.1.0
Module /usr/local/lib/python3.9/site-packages/h5py/_errors.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/h5py/_errors.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/manifold/_barnes_hut_tsne.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/manifold/_barnes_hut_tsne.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/netCDF4.libs/libz-a147dcb0.so.1.2.3 without build-id.
Module /usr/local/lib/python3.9/site-packages/netCDF4.libs/libz-a147dcb0.so.1.2.3
Module /usr/local/lib/python3.9/site-packages/sklearn/svm/_liblinear.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/svm/_liblinear.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/netCDF4.libs/libcurl-33f5ac06.so.4.6.0 without build-id.
Module /usr/local/lib/python3.9/site-packages/netCDF4.libs/libcurl-33f5ac06.so.4.6.0
Module /usr/local/lib/python3.9/site-packages/netCDF4.libs/libaec-f0d4887b.so.0.0.10 without build-id.
Module /usr/local/lib/python3.9/site-packages/netCDF4.libs/libaec-f0d4887b.so.0.0.10
Module /usr/local/lib/python3.9/site-packages/sklearn/utils/_weight_vector.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/utils/_weight_vector.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/netCDF4.libs/libhdf5-5d1f23d4.so.103.1.0 without build-id.
Module /usr/local/lib/python3.9/site-packages/netCDF4.libs/libhdf5-5d1f23d4.so.103.1.0
Module /usr/local/lib/python3.9/site-packages/scipy/special/_ellip_harm_2.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/special/_ellip_harm_2.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/linear_model/_sag_fast.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/linear_model/_sag_fast.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/decomposition/_cdnmf_fast.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/decomposition/_cdnmf_fast.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy/linalg/cython_lapack.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/linalg/cython_lapack.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy/linalg/cython_blas.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/linalg/cython_blas.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/utils/_logistic_sigmoid.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/utils/_logistic_sigmoid.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy/linalg/_flinalg.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/linalg/_flinalg.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/utils/_seq_dataset.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/utils/_seq_dataset.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy/linalg/_flapack.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/linalg/_flapack.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/cluster/_dbscan_inner.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/cluster/_dbscan_inner.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy/linalg/_fblas.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/linalg/_fblas.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/utils/_random.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/utils/_random.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy/special/specfun.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/special/specfun.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/metrics/cluster/_expected_mutual_info_fast.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/metrics/cluster/_expected_mutual_info_fast.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/netCDF4.libs/libsz-53d02de5.so.2.0.1 without build-id.
Module /usr/local/lib/python3.9/site-packages/netCDF4.libs/libsz-53d02de5.so.2.0.1
Module /usr/local/lib/python3.9/site-packages/sklearn/manifold/_utils.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/manifold/_utils.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/utils/murmurhash.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/utils/murmurhash.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/netCDF4.libs/libhdf5_hl-14f94ac1.so.100.1.2 without build-id.
Module /usr/local/lib/python3.9/site-packages/netCDF4.libs/libhdf5_hl-14f94ac1.so.100.1.2
Module /usr/local/lib/python3.9/site-packages/netCDF4.libs/libnetcdf-2ecdc039.so.15.0.0 without build-id.
Module /usr/local/lib/python3.9/site-packages/netCDF4.libs/libnetcdf-2ecdc039.so.15.0.0
Module /usr/local/lib/python3.9/site-packages/netCDF4/_netCDF4.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/netCDF4/_netCDF4.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/scipy.libs/libgfortran-ed201abd.so.3.0.0 without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy.libs/libgfortran-ed201abd.so.3.0.0
Module /usr/local/lib/python3.9/site-packages/scipy.libs/libopenblasp-r0-085ca80a.3.9.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy.libs/libopenblasp-r0-085ca80a.3.9.so
Module /usr/local/lib/python3.9/site-packages/scipy/special/_ufuncs.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/scipy/special/_ufuncs.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/decomposition/_online_lda_fast.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/decomposition/_online_lda_fast.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/neighbors/_partition_nodes.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/neighbors/_partition_nodes.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/utils/_typedefs.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/utils/_typedefs.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/MDAnalysis/lib/c_distances_openmp.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/MDAnalysis/lib/c_distances_openmp.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/utils/_openmp_helpers.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/utils/_openmp_helpers.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/sklearn/__check_build/_check_build.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/sklearn/__check_build/_check_build.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/numpy/linalg/lapack_lite.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/numpy.libs/libquadmath-96973f99.so.0.0.0 without build-id.
Module /usr/local/lib/python3.9/site-packages/numpy.libs/libquadmath-96973f99.so.0.0.0
Module /usr/local/lib/python3.9/site-packages/numpy/linalg/_umath_linalg.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/numpy/linalg/_umath_linalg.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/numpy.libs/libgfortran-040039e1.so.5.0.0 without build-id.
Module /usr/local/lib/python3.9/site-packages/numpy.libs/libgfortran-040039e1.so.5.0.0
Module /usr/local/lib/python3.9/site-packages/numpy.libs/libopenblas64_p-r0-2f7c42d4.3.18.so without build-id.
Module /usr/local/lib/python3.9/site-packages/numpy.libs/libopenblas64_p-r0-2f7c42d4.3.18.so
Module /usr/local/lib/python3.9/site-packages/numpy/core/_multiarray_umath.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/numpy/core/_multiarray_umath.cpython-39-x86_64-linux-gnu.so
Module /usr/local/lib/python3.9/site-packages/Pillow.libs/libXau-154567c4.so.6.0.0 without build-id.
Module /usr/local/lib/python3.9/site-packages/Pillow.libs/libXau-154567c4.so.6.0.0
Module /usr/local/lib/python3.9/site-packages/Pillow.libs/liblzma-160b9c62.so.5.4.0 without build-id.
Module /usr/local/lib/python3.9/site-packages/Pillow.libs/liblzma-160b9c62.so.5.4.0
Module /usr/local/lib/python3.9/site-packages/Pillow.libs/libxcb-3e83370d.so.1.1.0 without build-id.
Module /usr/local/lib/python3.9/site-packages/Pillow.libs/libxcb-3e83370d.so.1.1.0
Module /usr/local/lib/python3.9/site-packages/Pillow.libs/libtiff-b9364ff1.so.6.0.0 without build-id.
Module /usr/local/lib/python3.9/site-packages/Pillow.libs/libtiff-b9364ff1.so.6.0.0
Module /usr/local/lib/python3.9/site-packages/Pillow.libs/libopenjp2-78c47f58.so.2.5.0 without build-id.
Module /usr/local/lib/python3.9/site-packages/Pillow.libs/libopenjp2-78c47f58.so.2.5.0
Module /usr/local/lib/python3.9/site-packages/Pillow.libs/libjpeg-16b2c4cf.so.62.3.0 without build-id.
Module /usr/local/lib/python3.9/site-packages/Pillow.libs/libjpeg-16b2c4cf.so.62.3.0
Module /usr/local/lib/python3.9/site-packages/PIL/_imaging.cpython-39-x86_64-linux-gnu.so without build-id.
Module /usr/local/lib/python3.9/site-packages/PIL/_imaging.cpython-39-x86_64-linux-gnu.so
Stack trace of thread 26:
#0 0x00007f074c1fc88c n/a (/usr/local/lib/libpython3.9.so.1.0 + 0x1af88c)
#1 0x00007f06d73c9000 n/a (n/a + 0x0)
ELF object binary architecture: AMD x86-64
Mar 28 14:12:38 u-030-s007 [280209]: Could not parse number of program headers from core file: invalid `Elf' handle
Mar 28 14:12:38 u-030-s007 [280209]: Could not parse number of program headers from core file: invalid `Elf' handle
(uid 1000 is the nomad user)
I have no clue what is going on. The server has 16 GiB RAM so IMHO an OOM event is rather unlikely (but not impossible).
This affects the sample files Cu2Se/2/aims.out and Sn2Se/1/INFO_GS.OUT. All other files from this sample bundle work.
Edit: NOMAD version is now 1.2.2.dev357+g15b7cd2e1
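One way to test the OOM hypothesis, sketched here under assumptions: the host uses cgroup v2 (per the docker info above), where each cgroup exposes a memory.events file with an oom_kill counter. The default path /sys/fs/cgroup and the helper names below are illustrative, not part of NOMAD; the path would have to point at the cgroup of the worker container.

```python
from pathlib import Path


def parse_memory_events(text: str) -> dict[str, int]:
    """Parse the 'key value' lines of a cgroup v2 memory.events file."""
    events = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip anything that is not a plain "name integer" pair.
        if len(parts) == 2 and parts[1].isdigit():
            events[parts[0]] = int(parts[1])
    return events


def oom_kill_count(cgroup_dir: str = "/sys/fs/cgroup") -> int:
    """Return the oom_kill counter for the given cgroup directory.

    The default directory is an assumption; inside a container it usually
    maps to the container's own cgroup.
    """
    text = Path(cgroup_dir, "memory.events").read_text()
    return parse_memory_events(text).get("oom_kill", 0)
```

A non-zero oom_kill value after reprocessing the failing entry would point towards memory exhaustion; a value of zero would suggest looking elsewhere.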
@behnle: I will try to reproduce the problem and see why the parser is struggling with this example.
I can confirm that at least one particular main file seems to use a very large amount of RAM, ultimately causing the process to be killed. Here is the zip file: int_hse.zip. The problematic file is output_1.
We need to check what is causing the memory usage to blow up in the FHI-aims parser for this file. In general some calculations are very big and will need a lot of RAM to be processed, but this does not look like one to me. @ndaelman-hu, @JosePizarro3 : Could you investigate this a bit?
It's likely this basis set tier checker. I'll see about slimming it down.
The issue is FHIAimsOutParser: trying to get its data causes a memory leak.
The native tier files are still big (~1 GB in RAM), but not the culprit.
Am investigating further.
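For anyone who wants to narrow this down locally, here is a minimal sketch of measuring peak Python-level memory around a single call with the standard-library tracemalloc module. The function fake_parse is a hypothetical stand-in, not the NOMAD FHI-aims parser; the real parser call would have to be substituted. Note that tracemalloc only sees allocations made through Python's allocator, so memory held by native libraries may be undercounted.

```python
import tracemalloc


def peak_memory_mib(func, *args, **kwargs):
    """Run func and return (result, peak traced allocation in MiB)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak / (1024 * 1024)


def fake_parse(n_lines):
    # Hypothetical stand-in for a parser call; replace with the real one.
    return [f"line {i}" for i in range(n_lines)]


if __name__ == "__main__":
    parsed, peak = peak_memory_mib(fake_parse, 100_000)
    print(f"parsed {len(parsed)} lines, peak {peak:.1f} MiB")
```

Wrapping individual parser stages this way and comparing the peaks should show which stage accounts for the growth.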
When trying to visualize the CuSe FHI-aims GeometryOptimization simulation sample, NOMAD crashes with a Python error:
In the GUI, this triggers an internal server error:
Unexpected error: "[object Object] (500)". Please try again and let us know, if this error keeps happening.
No cell or workflow graph is shown. NOMAD version is 1.2.2.dev295+g2e611aff1.