compomics / ms2rescore

Modular and user-friendly platform for AI-assisted rescoring of peptide identifications
https://ms2rescore.readthedocs.io
Apache License 2.0

MemoryError with input file containing multiple million PSMs (DIA data) #146

Open Pichler98 opened 2 months ago

Pichler98 commented 2 months ago

Hello,

I am using the MS2Rescore 3.0.0 GUI (the same error occurs with the CLI) on DIA data processed with MS Amanda (the resulting CSV contains two new rows, which should not be a problem), so I get a lot of PSMs. When running MS2Rescore, I get this MemoryError after a while: [screenshot: ms2_rescore_error]. Here is the output in the GUI when the error happens: [screenshot: ms2_rescore_progress]

The system running the program already has 80 GB of RAM available, so the error is surprising, but I suspect the large amount of data is the problem. Can you help with this error? Or would it be better to split the data into chunks and process them individually? I am not sure whether that would change the quality of the result. (This is my first time using MS2Rescore, so any help is appreciated.)
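
A minimal sketch of the chunking idea mentioned above, using only the Python standard library. The function name `split_psm_table`, the chunk file naming, and the tab delimiter are assumptions for illustration and not part of MS2Rescore or MS Amanda; set the delimiter to match the actual file. The sketch splits purely by row count, so PSMs belonging to the same spectrum may land in different chunks, and it does not address whether rescoring chunks separately gives results equivalent to a single run.

```python
import csv
from pathlib import Path


def split_psm_table(src, out_dir, rows_per_chunk, delimiter="\t"):
    """Stream `src` row by row and write chunks that each repeat the header.

    Hypothetical helper: only one row is held in memory at a time.
    Returns the list of chunk paths in the order they were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunk_paths = []
    out_file = None
    writer = None
    with open(src, newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        header = next(reader, None)
        if header is None:
            return chunk_paths  # empty input, nothing to split
        for index, row in enumerate(reader):
            # Start a new chunk file every `rows_per_chunk` data rows.
            if index % rows_per_chunk == 0:
                if out_file is not None:
                    out_file.close()
                path = out_dir / f"chunk_{len(chunk_paths):04d}.csv"
                out_file = open(path, "w", newline="")
                writer = csv.writer(out_file, delimiter=delimiter)
                writer.writerow(header)
                chunk_paths.append(path)
            writer.writerow(row)
    if out_file is not None:
        out_file.close()
    return chunk_paths
```

Each chunk could then be passed to MS2Rescore as a separate input file, for example `split_psm_table("psms.csv", "chunks", 1_000_000)`, where the file name and chunk size are placeholders.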

The used data and the configuration file can be found here (~4GB): https://drive.google.com/file/d/1rUX8kuWl-NaEGxr9-3KY-Uzpd6PBMMgn/view?usp=sharing

Thanks, Severin