pymzml / pymzML

pymzML - an interface between Python and mzML Mass spectrometry Files
https://pymzml.readthedocs.io/en/latest/
MIT License

Iterator reinitializes itself at the end of iteration, leading to pointless reindexing. #299

Closed: arabidopsis closed 1 year ago

arabidopsis commented 2 years ago

In the run.py:Reader.__next__ method, the 'END' event triggers a re-opening of the file, which leads to another reindexing of a possibly very large file (if it contains no index).

Since we have already run through the file this is quite likely pointless.

If I want to run through the file again I'll re-open it myself.

If you want to be smarter about this, then just a seek(0) on the underlying file pointer would be better. Possibly a reset method on the underlying interface(s).
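To make the seek(0) suggestion concrete, here is a minimal sketch of a reader that builds its index once and only rewinds the existing file pointer when iteration ends. RewindingReader, _build_index, reset and index_builds are hypothetical names for illustration, not pymzML's actual Reader or interfaces, and the line-offset "index" is just a stand-in for the expensive scan of an unindexed mzML file.

```python
import io


class RewindingReader:
    """Hypothetical line-based reader; NOT pymzML's Reader."""

    def __init__(self, fileobj):
        self._fh = fileobj
        self.index_builds = 0  # counts how often the expensive scan runs
        self._offsets = self._build_index()

    def _build_index(self):
        # Stand-in for the costly indexing pass over an unindexed file:
        # record the offset at which every line starts.
        self.index_builds += 1
        offsets = []
        self._fh.seek(0)
        while True:
            pos = self._fh.tell()
            if not self._fh.readline():
                break
            offsets.append(pos)
        self._fh.seek(0)
        return offsets

    def reset(self):
        # Rewind the existing handle; the index is kept as-is.
        self._fh.seek(0)

    def __iter__(self):
        return self

    def __next__(self):
        line = self._fh.readline()
        if not line:
            # End of file: rewind instead of re-opening and reindexing.
            self.reset()
            raise StopIteration
        return line.rstrip("\n")


reader = RewindingReader(io.StringIO("spec1\nspec2\nspec3\n"))
first_pass = list(reader)
second_pass = list(reader)  # works again without a second index build
```

In this sketch a second pass over the reader yields the same items while index_builds stays at 1, which is the behaviour being asked for.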

Plus, the original underlying file pointer is not explicitly closed before the new one is created. (It will be, eventually, with garbage collection, but ... bad form.)
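If re-opening is kept, the old handle can be closed explicitly before it is replaced. The following is a minimal sketch under that assumption; ReopeningReader and its reopen method are hypothetical names and do not reflect how pymzML manages its file objects.

```python
import os
import tempfile


class ReopeningReader:
    """Hypothetical reader that re-opens its file; NOT pymzML's Reader."""

    def __init__(self, path):
        self._path = path
        self._fh = open(path, "r", encoding="utf-8")

    def reopen(self):
        # Close the old handle explicitly rather than leaving it
        # to garbage collection, then open a fresh one.
        old = self._fh
        old.close()
        self._fh = open(self._path, "r", encoding="utf-8")
        return old  # returned only so the demo can inspect it

    def close(self):
        self._fh.close()

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    with open(path, "w", encoding="utf-8") as out:
        out.write("spec1\n")

    with ReopeningReader(path) as reader:
        old_handle = reader.reopen()
        old_closed = old_handle.closed   # old handle was closed explicitly
        reread = reader._fh.readline()   # new handle starts at the beginning
    new_closed = reader._fh.closed       # context manager closed the new one
```

Supporting the with statement also means the final handle is released deterministically when the caller is done.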

MKoesters commented 2 years ago

Hi @arabidopsis,

Thanks for reporting this. The reinitialization was implemented due to a user request after some internal discussion. But you are right: reindexing the whole thing after iterating does not make sense, and setting the pointer to the beginning is indeed smarter. I'll change this as soon as I have time and also fix the problem with the open pointer!

Best, Manuel

MKoesters commented 1 year ago

Should be solved with #307. We still reset the iterator, but no reindexing is performed after the reset. If you think your issue was not addressed as it should be, feel free to open it again.

Best, Manuel