Tarskin / LaCyTools

A high-throughput data extraction package for LC-MS data.
Apache License 2.0

HD5 Support #2

Closed Tarskin closed 2 years ago

Tarskin commented 8 years ago

Integrate HDF5 support into LaCyTools.

genadijrazdorov commented 8 years ago

Tested only on a small set of samples.

Tarskin commented 8 years ago

  1. Selecting 'Batch convert to pytables' while a PyTables file is already selected raises an ugly exception in the console.
  2. Selecting only conversion in the batchProcess window and then choosing 'run batch process' does not actually convert the files (this would be a useful feature for anyone interested only in storing the data more efficiently).
  3. Perhaps we should remove the PyTables file button and have the program look for the pytables.h5 file in the batch directory? I think this would improve the graphical appeal of the feature.

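Points 1 and 3 could be handled with a small check along these lines. This is only a sketch, not the LaCyTools implementation: the function names `is_hdf5` and `find_pytables_file` are hypothetical, and the only things taken from the list above are the `pytables.h5` file name and the idea of looking in the batch directory. The check relies on the 8-byte HDF5 format signature at the start of the file and deliberately ignores files that carry a user block before the signature.

```python
from pathlib import Path
from typing import Optional

# The 8-byte signature that opens an HDF5 file (when no user block is present).
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def is_hdf5(path: Path) -> bool:
    """Return True if the file starts with the HDF5 signature.

    Hypothetical helper: only offset 0 is checked, so HDF5 files with a
    user block in front of the superblock are reported as False.
    """
    try:
        with open(path, "rb") as handle:
            return handle.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE
    except OSError:
        # Missing or unreadable files are simply treated as "not HDF5".
        return False


def find_pytables_file(batch_dir: Path, name: str = "pytables.h5") -> Optional[Path]:
    """Return the PyTables file in the batch directory, or None if absent.

    Hypothetical helper: the default file name comes from the suggestion
    above; everything else is an assumption made for illustration.
    """
    candidate = Path(batch_dir) / name
    if candidate.is_file() and is_hdf5(candidate):
        return candidate
    return None
```

With something like this, the batch conversion could skip any selected file for which `is_hdf5` is true rather than raising an exception, and the GUI could call `find_pytables_file` on the batch directory instead of offering a separate file button.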

magnuspalmblad commented 8 years ago

Why are we (re)implementing this? Or more precisely: how is this different from mz5 from the Steen lab, which is already in ProteoWizard?

genadijrazdorov commented 8 years ago

mz5 replicates the XML formats using the HDF5 library. It does not use PyTables, which is more than just a Python wrapper for HDF5, and it does not use Blosc, an extremely fast, multi-threaded meta-compressor library. LaCyTools uses both libraries and does not simply replicate the XML layout, so it is fast and uses only 1/3 of the disk space.

magnuspalmblad commented 8 years ago

OK - I (genuinely) look forward to a side-by-side comparison! There are so many ways to put the data in mzML too (just looking at msconvert), which also have a huge effect on size. But I suspect you are aware of these!

Will you be at ASMS or IMSC? Otherwise, perhaps I will see you in Leiden sometime?

genadijrazdorov commented 8 years ago

A fair comparison is absolutely in our plans.

This is my last day in Leiden. I will probably not be at ASMS or IMSC this year.