biosustain / pyOpenMS_UmetaFlow

Untargeted metabolomics workflow for data processing and analysis written in Jupyter notebooks (Python)
Apache License 2.0

Notebook 4 GNPS export memory error #2

Open andrewjkwok opened 1 year ago

andrewjkwok commented 1 year ago

Hi,

Thank you for these detailed notebooks and workflow.

I managed to get to notebook 4 with my own dataset, but in step 3 (MS/MS clustering), when I try to run this line:

spectra_clustering.store(Consensus_file, [s.encode() for s in mzML_files], String(out_file))

I run into what seems like an error with memory:

---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
Cell In[9], line 1
----> 1 spectra_clustering.store(Consensus_file, [s.encode() for s in mzML_files], String(out_file))

File pyopenms/_pyopenms_1.pyx:3028, in pyopenms._pyopenms_1.GNPSMGFFile.store()

MemoryError: std::bad_alloc

However, I am quite sure my machine is not running out of memory. I'm happy to provide more information, but I'm not sure what would be useful. I'm very puzzled as to why this is happening.

pcolaianni commented 1 year ago

I only just noticed this issue in my GitHub inbox.

In case you still need help with this, I would recommend looking at the implementation of the class that is throwing the exception: https://github.com/OpenMS/OpenMS/blame/develop/src/openms/source/FORMAT/GNPSMGFFile.cpp

Which part do you think might make your code fail? Line 315 is from June 2023. Depending on what version of OpenMS you were running, you might have to look at another file revision.
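
To work out which file revision applies, it may help to check which pyOpenMS version is actually installed. The following is only a minimal sketch using the Python standard library; it assumes the package was installed as a distribution named `pyopenms`, and the helper names `installed_version` and `environment_report` are made up for illustration. It does not diagnose the `std::bad_alloc` itself; it just collects details that could be quoted in this thread.

```python
from importlib import metadata
import platform
import sys


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=("pyopenms",)):
    """Collect version details worth quoting in a bug report.

    The default package name "pyopenms" is an assumption about how the
    library was installed; pass other distribution names if needed.
    """
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    for name in packages:
        report[name] = installed_version(name)
    return report


if __name__ == "__main__":
    # Print one "key: value" line per entry so it can be pasted into the issue.
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

If the reported `pyopenms` entry is `None`, the package was probably installed under a different distribution name or in another environment than the one running the notebook.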