evanyeyeye / rainbow

Read chromatography and mass spectrometry binary files.
GNU Lesser General Public License v3.0

Issue with Agilent Masshunter HRMS LZF decompression #27

Open shildebr12 opened 2 months ago

shildebr12 commented 2 months ago

I'm trying to use rainbow to extract HRMS spectra from an Agilent ESI Q-TOF, but the LZF decompression of the MSProfile.bin file fails. The data it fails on is here: https://drive.google.com/drive/folders/1EvGi1q8owphN-IBouC_7ptvF7xORJEGz?usp=sharing

The error is:

  File "C:\Users\hildebrs\Anaconda3\lib\site-packages\rainbow\agilent\masshunter.py", line 164, in parse_msdata
    decomp_bytes = lzf.decompress(comp_bytes, decomp_len)
ValueError: error in compressed data

I'm using python-lzf 0.2.6.

Thank you for rainbow, by the way. It's been incredibly useful for other purposes!

ekwan commented 2 months ago

Hi! HRMS support is experimental and is the next feature we're planning to tackle seriously. We will take a look at this!

shildebr12 commented 2 months ago

After looking into it some more, it seems that my MSProfile.bin files aren't compressed, which would explain why the decompression fails. I just need to figure out how to get the information I need.

In case it's useful, this data comes from an Agilent 6530 Accurate-Mass Q-TOF LC/MS.
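
For anyone who wants to check this on their own files, here is a minimal diagnostic sketch. It assumes only what the traceback shows: the decompressor is called as `decompress(comp_bytes, decomp_len)` and raises `ValueError` on invalid input. The helper name `probe_block` is hypothetical and not part of rainbow, and the `None` return on a size mismatch is my understanding of python-lzf rather than something confirmed in this thread. The decompressor is passed in as an argument, so the sketch does not import lzf itself.

```python
from typing import Callable, Optional, Tuple


def probe_block(
    comp_bytes: bytes,
    decomp_len: int,
    decompress: Callable[[bytes, int], Optional[bytes]],
) -> Tuple[bytes, bool]:
    """Try to decompress one block; report whether it looked compressed.

    Returns (payload, was_compressed). If decompression fails, the
    original bytes are returned unchanged with was_compressed=False.
    """
    try:
        out = decompress(comp_bytes, decomp_len)
    except ValueError:
        # Same failure mode as in the traceback: "error in compressed data".
        return comp_bytes, False
    # Assumption: the decompressor signals "does not fit" by returning None.
    if out is None or len(out) != decomp_len:
        return comp_bytes, False
    return out, True


# Hypothetical usage with python-lzf installed:
#   import lzf
#   payload, was_compressed = probe_block(block, expected_len, lzf.decompress)
```

This only tells you whether a block decompresses cleanly. It does not make the uncompressed layout parseable.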

jlw387 commented 2 months ago

Hello, and thank you for the follow up!

We've reproduced the error with the data you provided, and it appears to be a non-trivial problem: the assumptions rainbow currently makes about MSProfile.bin files do not hold for your instrument's data. Simply removing the decompression step and parsing the file results in a different error.

We will keep working on it on our end and will comment on this thread with any updates we have!

jlw387 commented 2 months ago

Hello again!

We're still working on handling this type of data in rainbow, but in the meantime we were able to parse and view it correctly with ProteoWizard. You may want to use that software until we can add support for your file structure.

In case you still need the parsed information and can't run ProteoWizard on your system, I have uploaded the data, converted to the mzML format, to this shared folder. This format can be viewed in the open-source, cross-platform software OpenMS.
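
If you would rather read the converted data from Python, here is a rough sketch that uses only the standard library. It assumes the mzML files store the m/z and intensity arrays as base64-encoded little-endian 32- or 64-bit floats, optionally zlib-compressed, and that each `binaryDataArray` carries its `cvParam` entries directly (parameters referenced through a `referenceableParamGroupRef` are ignored). The accession numbers come from the PSI-MS controlled vocabulary; the function names are hypothetical, and this is not how rainbow or ProteoWizard read the data.

```python
import base64
import struct
import zlib
import xml.etree.ElementTree as ET

MZML_NS = "http://psi.hupo.org/ms/mzml"
NS = {"mz": MZML_NS}

# PSI-MS controlled vocabulary accessions used by mzML binary arrays.
MZ_ARRAY = "MS:1000514"
INTENSITY_ARRAY = "MS:1000515"
FLOAT32 = "MS:1000521"
FLOAT64 = "MS:1000523"
ZLIB = "MS:1000574"


def decode_binary_array(array_elem):
    """Decode one <binaryDataArray> into (kind, list of floats)."""
    accessions = {cv.get("accession") for cv in array_elem.findall("mz:cvParam", NS)}
    text = array_elem.findtext("mz:binary", default="", namespaces=NS)
    raw = base64.b64decode(text)
    if ZLIB in accessions:
        raw = zlib.decompress(raw)
    if FLOAT64 in accessions:
        width, code = 8, "d"
    elif FLOAT32 in accessions:
        width, code = 4, "f"
    else:
        raise ValueError("unsupported binary data type in this sketch")
    count = len(raw) // width
    values = list(struct.unpack("<%d%s" % (count, code), raw[: count * width]))
    if MZ_ARRAY in accessions:
        kind = "mz"
    elif INTENSITY_ARRAY in accessions:
        kind = "intensity"
    else:
        kind = "other"
    return kind, values


def iter_spectra(source):
    """Yield (spectrum id, m/z values, intensities) for each spectrum.

    `source` is a file path or a binary file object. The whole document
    is loaded into memory, which may be slow for large profile data.
    """
    root = ET.parse(source).getroot()
    for spec in root.iter("{%s}spectrum" % MZML_NS):
        arrays = dict(
            decode_binary_array(a)
            for a in spec.iter("{%s}binaryDataArray" % MZML_NS)
        )
        yield spec.get("id"), arrays.get("mz", []), arrays.get("intensity", [])


# Hypothetical usage (the file name is a placeholder):
#   for spec_id, mz, intensity in iter_spectra("converted.mzML"):
#       print(spec_id, len(mz))
```

For large files, a dedicated mzML reader is likely a better choice than this sketch.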

Thank you again for bringing the issue to our attention and for your patience!