ISA-tools / mzml2isa

Parser to extract meta information from mzML files and convert the relevant information to an ISA-Tab structure
GNU General Public License v3.0

High memory consumption #45

Open sneumann opened 3 years ago

sneumann commented 3 years ago

Hi, I have a set of ~90 mzML files totalling ~40 GB from an Orbitrap Elite, converted via msconvert 3.0.11110. If I run mzml2isa on them, the resident memory of the process according to top climbs to 0.015 TB(!), i.e. about 15 GB, before it is eventually OOM-killed. This is on mzml2isa 1.0.3. I have only just started to debug and will report back.

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 3167 sneumann  20   0 15,555g 0,015t  19088 S  92,4 23,7  23:41.80 mzml2isa

If there are any known memory caveats, I'd be grateful for a brief hint. Yours, Steffen

sneumann commented 3 years ago

OK, it works for a single file. I am trying on a different machine now and will report back. Steffen

sneumann commented 8 months ago

Hi, just coming back to this: in a different study we have a whopping 100 × 10 GB = 1 TB of mzML, so memory efficiency is becoming an important issue again, even on a machine with 256 GB RAM. The iteration over files happens here: https://github.com/ISA-tools/mzml2isa/blob/a946af32ce632438adf2ddf6d077847d909352e6/mzml2isa/parsing.py#L129 so I am wondering whether objects are not being freed somewhere. I am not a Python expert, so I don't know how releasing memory and garbage collection work. Yours, Steffen
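
On the question of how release and garbage collection work: in CPython an object is freed as soon as the last reference to it disappears, and `gc.collect()` is only needed to reclaim reference cycles. Memory therefore grows across files only if something keeps a reference to each parsed document, for example a list that accumulates them. The sketch below illustrates a per-file loop that keeps only a small summary per file. It is an illustration under assumptions, not the mzml2isa code: `extract_metadata`, `process_files` and `peak_memory` are hypothetical names, and the `bytearray` merely simulates a large parsed document. Whether the linked loop in parsing.py actually retains references is not established here.

```python
import gc
import tracemalloc


def extract_metadata(path):
    """Hypothetical stand-in for a per-file parser (NOT the mzml2isa API)."""
    big = bytearray(5_000_000)  # simulates a large parsed document
    # Return only a small summary; 'big' loses its last reference on return.
    return {"path": path, "size": len(big)}


def process_files(paths):
    """Yield one small summary per file, never holding two documents at once."""
    for path in paths:
        summary = extract_metadata(path)
        yield summary
        # Only needed if the parser builds reference cycles; harmless otherwise.
        gc.collect()


def peak_memory(paths):
    """Run the loop and report the peak traced allocation in bytes."""
    tracemalloc.start()
    summaries = list(process_files(paths))
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return summaries, peak


if __name__ == "__main__":
    results, peak = peak_memory(["a.mzML", "b.mzML", "c.mzML"])
    print(len(results), "files, peak bytes:", peak)
```

If the peak reported by `tracemalloc` stays near the size of one document regardless of the number of files, nothing is being retained between iterations. A peak that grows with the file count points to a container or cache holding on to earlier documents.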