RECETOX / galaxytools

Set of Galaxy tool wrappers developed at RECETOX
MIT License

matchms metadata match memory allocation #425

Closed: hechth closed this issue 4 months ago

hechth commented 11 months ago
Traceback (most recent call last):
  File "/mnt/volume/shared/ces-nya/nfs4/home/umsa/job_working_directory_object/088/88329/configs/tmpcj8be9cj", line 21, in <module>
    layer = similarity.matrix(
            ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/matchms/similarity/MetadataMatch.py", line 153, in matrix
    scores = np.zeros((len(entries_ref), len(entries_query)))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 70.1 GiB for an array with shape (209427, 44936) and data type float64

It looks like matchms allocates a dense float64 array of shape (len(entries_ref), len(entries_query)), which comes to 70.1 GiB here. This should not be needed with the sparse computation.
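
For reference, the 70.1 GiB in the traceback follows directly from the array shape, since a float64 takes 8 bytes per element. A quick check (the helper name is illustrative and not part of matchms):

```python
def dense_float64_gib(n_rows: int, n_cols: int) -> float:
    """Memory in GiB needed for a dense float64 array of the given shape."""
    bytes_per_element = 8  # size of one float64
    return n_rows * n_cols * bytes_per_element / 2**30


# Shape taken from the error message above.
print(f"{dense_float64_gib(209427, 44936):.1f} GiB")  # prints "70.1 GiB"
```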

We need to check whether the Galaxy tool can be written so that this dense allocation is avoided.
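
One possible direction, sketched under assumptions: for an exact-match comparison on a metadata field, only the matching (reference, query) index pairs need to be stored. The function below is hypothetical; it is not the matchms API and not necessarily what any eventual fix does. It groups query indices by value, so memory scales with the number of matches rather than with len(entries_ref) * len(entries_query).

```python
from collections import defaultdict
from typing import Hashable, List, Sequence, Tuple


def sparse_equal_matches(
    entries_ref: Sequence[Hashable],
    entries_query: Sequence[Hashable],
) -> List[Tuple[int, int]]:
    """Return (ref_index, query_index) pairs whose metadata values are equal.

    Hypothetical helper for illustration only; no dense score matrix is built.
    """
    # Map each distinct query value to the indices where it occurs.
    query_index = defaultdict(list)
    for j, value in enumerate(entries_query):
        query_index[value].append(j)

    # Look up each reference value; only actual matches are recorded.
    matches = []
    for i, value in enumerate(entries_ref):
        for j in query_index.get(value, ()):
            matches.append((i, j))
    return matches


# Small usage example with made-up metadata values.
refs = ["a", "b", "c", "a"]
queries = ["a", "c", "d"]
print(sparse_equal_matches(refs, queries))  # [(0, 0), (2, 1), (3, 0)]
```

The resulting index pairs could then be turned into whatever sparse representation the downstream scoring step expects.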

hechth commented 11 months ago

xref https://github.com/matchms/matchms/issues/544

hechth commented 6 months ago

https://github.com/matchms/matchms/blob/1ba681524c2adff5e41e5df8399955a2e71164dc/matchms/similarity/MetadataMatch.py#L152-L160

zargham-ahmad commented 4 months ago

@hechth this can be closed; it should be fixed by the PR https://github.com/RECETOX/galaxytools/pull/544