ISA-tools / mzml2isa

Parser to get meta information from mzML files and parse relevant information to an ISA-Tab structure
GNU General Public License v3.0

Memory leak? #44

Closed. sneumann closed this issue 3 years ago

sneumann commented 3 years ago

Hi, I have a set of ~90 mzML files totalling ~40GB from an Orbitrap Elite, converted via msconvert 3.0.11110. If I run mzml2isa on them, the process memory according to top goes to 0.015TB(!) before the process is eventually OOM-killed. This is on mzml2isa 1.0.3. I have only started to debug and will report back.

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 3167 sneumann  20   0 15,555g 0,015t  19088 S  92,4 23,7  23:41.80 mzml2isa

If there are any known memory caveats, I'd be happy about a brief hint. Yours, Steffen

Tomnl commented 3 years ago

Hi Steffen,

Could you send the output of pip freeze in the environment you are using?

Tomnl commented 3 years ago

@althonos - Do you remember if we are storing the binary data when we parse?

I remember there were some updates that improved memory usage (not including the unmerged branch from 2017)

sneumann commented 3 years ago

Hi, in the end I ran mzml2isa successfully. The machine where things didn't go through was:

sneumann@wecker:~$ pip3 freeze
acme==0.31.0
appdirs==1.4.3
apturl==0.5.2
asn1crypto==0.22.0
Babel==1.3
backcall==0.1.0
beautifulsoup4==4.4.1
bleach==3.1.0
blinker==1.3
Brlapi==0.6.4
cached-property==1.5.1
certbot==0.31.0
certifi==2017.4.17
chardet==3.0.4
checkbox-support==0.22
cliff==1.15.0
cmd2==0.6.8
command-not-found==0.3
ConfigArgParse==0.11.0
configobj==5.0.6
cryptography==1.9
cycler==0.10.0
debtcollector==1.3.0
decorator==4.4.0
defer==1.0.6
defusedxml==0.5.0
docutils==0.12
entrypoints==0.3
et-xmlfile==1.0.1
feedparser==5.1.3
fs==2.4.11
funcsigs==0.4
future==0.15.2
guacamole==0.9.2
html5lib==0.999
httplib2==0.9.1
idna==2.5
ipykernel==5.1.0
ipyrmd==0.4.3
ipython==7.4.0
ipython-genutils==0.2.0
ipywidgets==7.4.2
iso8601==0.1.11
jdcal==1.4.1
jedi==0.13.3
Jinja2==2.8
josepy==1.1.0
jsonpatch==1.10
jsonpointer==1.9
jsonschema==2.5.1
jupyter==1.0.0
jupyter-client==5.2.4
jupyter-console==6.0.0
jupyter-core==4.4.0
keyring==7.3
keystoneauth1==2.4.1
kiwisolver==1.1.0
language-selector==0.1
louis==2.6.4
lxml==3.5.0
Mako==1.0.3
MarkupSafe==0.23
matplotlib==3.0.3
mistune==0.8.4
mock==1.3.0
monotonic==0.6
msgpack-python==0.4.6
mzml2isa==1.0.3
nbconvert==5.4.1
nbformat==4.4.0
ndg-httpsclient==0.4.2
netaddr==0.7.18
netifaces==0.10.4
notebook==5.7.8
numpy==1.18.5
oauthlib==1.0.3
onboard==1.2.0
openpyxl==2.6.4
openstacksdk==0.8.1
os-client-config==1.16.0
oslo.config==3.9.0
oslo.i18n==3.5.0
oslo.serialization==2.4.0
oslo.utils==3.8.0
padme==1.1.1
pandocfilters==1.4.2
parsedatetime==2.4
parso==0.4.0
pbr==1.8.0
pexpect==4.0.1
pickleshare==0.7.5
Pillow==3.1.2
plainbox==0.25
positional==1.0.1
prettytable==0.7.2
prometheus-client==0.6.0
prompt-toolkit==2.0.9
pronto==0.12.2
ptyprocess==0.5
PubChemPy==1.0.4
pyasn1==0.1.9
pycrypto==2.6.1
pycups==1.9.73
pycurl==7.43.0
Pygments==2.1
pygobject==3.20.0
PyICU==1.9.2
PyJWT==1.3.0
pyOpenSSL==17.3.0
pyparsing==2.0.3
pyRFC3339==1.0
python-apt==1.1.0b1+ubuntu0.16.4.9
python-cinderclient==1.6.0
python-dateutil==2.8.0
python-debian==0.1.27
python-glanceclient==2.0.0
python-keystoneclient==2.3.1
python-neutronclient==4.1.1
python-novaclient==3.3.1
python-openstackclient==2.3.1
python-systemd==231
pytz==2019.3
pyxdg==0.25
PyYAML==3.11
pyzmq==18.0.1
qtconsole==4.4.3
reportlab==3.3.0
requests==2.18.1
requests-toolbelt==0.8.0
requestsexceptions==1.1.2
roman==2.0.0
SecretStorage==2.1.3
Send2Trash==1.5.0
sessioninstaller==0.0.0
simplejson==3.8.1
six==1.14.0
ssh-import-id==5.5
stevedore==1.12.0
system-service==0.3
terminado==0.8.2
testpath==0.4.2
tornado==6.0.2
traitlets==4.3.2
typing==3.7.4.1
ubuntu-drivers-common==0.0.0
ufw==0.35
unattended-upgrades==0.1
unicodecsv==0.14.1
unity-scope-calculator==0.1
unity-scope-chromiumbookmarks==0.1
unity-scope-colourlovers==0.1
unity-scope-devhelp==0.1
unity-scope-firefoxbookmarks==0.1
unity-scope-gdrive==0.7
unity-scope-manpages==0.1
unity-scope-openclipart==0.1
unity-scope-texdoc==0.1
unity-scope-tomboy==0.1
unity-scope-virtualbox==0.1
unity-scope-yelp==0.1
unity-scope-zotero==0.1
urllib3==1.21.1
usb-creator==0.3.0
vboxapi==1.0
warlock==1.1.0
wcwidth==0.1.7
webencodings==0.5.1
widgetsnbextension==3.4.2
wrapt==1.8.0
xdiagnose==3.8.4.1
xkit==0.0.0
XlsxWriter==0.7.3
zope.component==4.3.0
zope.event==4.2.0
zope.hookable==4.0.4
zope.interface==4.3.2

and in the end it worked on:

sneumann@msbi-koch:~$ pip3 freeze 
appdirs==1.4.3
apturl==0.5.2
beautifulsoup4==4.4.1
blinker==1.3
Brlapi==0.6.4
cached-property==1.5.1
chardet==3.0.4
checkbox-support==0.22
command-not-found==0.3
cryptography==1.2.3
defer==1.0.6
devscripts===2.16.2ubuntu3
et-xmlfile==1.0.1
feedparser==5.1.3
fs==2.4.11
guacamole==0.9.2
html5lib==0.999
httplib2==0.9.1
idna==2.0
jdcal==1.4.1
Jinja2==2.8
language-selector==0.1
louis==2.6.4
lxml==3.5.0
Magic-file-extensions==0.2
Mako==1.0.3
MarkupSafe==0.23
mzml2isa==1.0.3
oauthlib==1.0.3
onboard==1.2.0
openpyxl==2.6.4
padme==1.1.1
pexpect==4.0.1
Pillow==3.1.2
plainbox==0.25
pronto==0.12.2
ptyprocess==0.5
pyasn1==0.1.9
pycups==1.9.73
pycurl==7.43.0
pygobject==3.20.0
PyJWT==1.3.0
pyparsing==2.0.3
python-apt==1.1.0b1+ubuntu0.16.4.8
python-debian==0.1.27
python-systemd==231
pytz==2019.3
pyxdg==0.25
reportlab==3.3.0
requests==2.9.1
sessioninstaller==0.0.0
six==1.14.0
ssh-import-id==5.5
system-service==0.3
typing==3.7.4.1
ubuntu-drivers-common==0.0.0
ufw==0.35
unattended-upgrades==0.1
unity-scope-calculator==0.1
unity-scope-chromiumbookmarks==0.1
unity-scope-colourlovers==0.1
unity-scope-devhelp==0.1
unity-scope-firefoxbookmarks==0.1
unity-scope-gdrive==0.7
unity-scope-manpages==0.1
unity-scope-openclipart==0.1
unity-scope-texdoc==0.1
unity-scope-tomboy==0.1
unity-scope-virtualbox==0.1
unity-scope-yelp==0.1
unity-scope-zotero==0.1
urllib3==1.13.1
usb-creator==0.3.0
vboxapi==1.0
virtualenv==15.0.1
xdiagnose==3.8.4.1
xkit==0.0.0
XlsxWriter==0.7.3

So I am happy; let's just keep this in case someone else reports something similar. Yours, Steffen

althonos commented 3 years ago

> @althonos - Do you remember if we are storing the binary data when we parse?
>
> I remember there were some updates that improved memory usage (not including the unmerged branch from 2017)

As long as we build an element tree with lxml.etree.parse, we are bound to have the binary data stored in memory at some point, since there is currently no way to ask the parser to discard some elements. (I can work on that, though; I think I now know how to handle it properly.)

(The lines you refer to are actually there to extract metadata about the type of binary array: raw ion mobility array, wavelength array, etc.)
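
For anyone hitting the same problem, here is a minimal sketch of the general idea of discarding the binary payloads while still reading the array metadata. It is not how mzml2isa is implemented. It uses the standard library's xml.etree.ElementTree.iterparse (lxml.etree.iterparse offers a similar interface) instead of building the full tree. The function names collect_array_types and local_name and the tiny SAMPLE document are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def collect_array_types(source):
    """Stream an mzML-like document and return the set of cvParam names
    found under binaryDataArray elements, dropping the encoded payloads."""
    array_types = set()
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = local_name(elem.tag)
        if name == "binary":
            # The base64 payload is the bulk of the file; drop it right away.
            elem.text = None
        elif name == "binaryDataArray":
            for child in elem:
                if local_name(child.tag) == "cvParam":
                    array_types.add(child.get("name"))
        elif name == "spectrum":
            # Free the whole spectrum subtree once it has been inspected.
            elem.clear()
    return array_types


# Hypothetical, heavily trimmed stand-in for a real mzML file.
SAMPLE = b"""<mzML xmlns="http://psi.hupo.org/ms/mzml">
  <run><spectrumList>
    <spectrum index="0">
      <binaryDataArrayList>
        <binaryDataArray>
          <cvParam name="m/z array"/>
          <binary>AAAA</binary>
        </binaryDataArray>
        <binaryDataArray>
          <cvParam name="intensity array"/>
          <binary>BBBB</binary>
        </binaryDataArray>
      </binaryDataArrayList>
    </spectrum>
  </spectrumList></run>
</mzML>"""

print(sorted(collect_array_types(io.BytesIO(SAMPLE))))
```

Each payload is still read into memory once, at the moment its end event fires, but it is released immediately, so peak memory should scale with the largest single spectrum rather than with the whole file. The emptied spectrum elements stay attached to the root, which is a small cost compared to the payloads.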