Open lomereiter opened 7 years ago
This seems a lot cleaner than what we hacked through at first. I'll review it as soon as I can.
Thanks @lomereiter, this is a great contribution.
No unit tests yet, but it passes the Travis and AppVeyor builds with no problem.
Hey @althonos, do you think we should merge this now? Or should we wait until we have the unit test functionality?
Well, since this passes the integration against MetaboLights, I'm positive about merging (I'm not sure how long it will take to set up unit tests, and the feat-tests may be far behind master).
Maybe, because of the increased processing time, we should still keep both methods and let the user choose (much like `lxml.etree.iterparse` accepts a `huge_tree` parameter).
Yeah, I think you are right @althonos. Keeping both methods seems like the best idea, as memory consumption might not be a problem for some users.
This PR provides an alternative solution to #13: each scan is parsed once, all necessary information is extracted from it, then the node is freed. On a large imzML file this brought peak memory consumption down from 2.5 GB to 90 MB, although the processing time increased from 6 s to 13 s.
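
For readers unfamiliar with the pattern, here is a minimal sketch of the "parse once, extract, free the node" idea. It is not the code from this PR. It assumes the standard library's `xml.etree.ElementTree.iterparse` and a simplified document without XML namespaces (real imzML tags are namespaced). The function name `iter_scans` and the extracted fields are hypothetical, chosen only for illustration.

```python
import io
import xml.etree.ElementTree as ET


def iter_scans(source, tag="spectrum"):
    """Yield a small dict per scan element, clearing each element after use.

    Hypothetical helper: the tag name and extracted fields are assumptions.
    """
    # Only "end" events are requested, so each element is complete when seen.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != tag:
            continue
        # Extract everything needed while the subtree is still in memory.
        info = {
            "id": elem.get("id"),
            "n_params": len(elem.findall("cvParam")),
        }
        # Drop the element's children, attributes and text so the subtree
        # can be garbage collected; an empty shell stays in the parent.
        elem.clear()
        yield info


# Tiny stand-in for an imzML file (no namespaces, for brevity).
SAMPLE = b"""<mzML><run><spectrumList>
<spectrum id="scan=1"><cvParam name="a"/><cvParam name="b"/></spectrum>
<spectrum id="scan=2"><cvParam name="a"/></spectrum>
</spectrumList></run></mzML>"""

scans = list(iter_scans(io.BytesIO(SAMPLE)))
```

Because each scan is cleared right after its information is extracted, memory use stays roughly proportional to one scan rather than to the whole file. The extra per-element work is one plausible reason for the longer processing time, though the actual cause in this PR was not measured here.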