OpenChrom / openchrom

Visualization and Analysis of mass spectrometric and chromatographic data.
https://www.openchrom.net
Eclipse Public License 1.0

very large MSD .raw files are difficult to work with #437

Closed. AnnaClo closed this issue 6 months ago.

AnnaClo commented 8 months ago

Hi, I have a question about MSD .raw files obtained from a Thermo Fisher GC-MS. These .raw files are very large and difficult to work with. We tried running a (batch) method and also converting the .raw file to .ocb (Export Chromatogram > Open Chromatography Binary .ocb), but both processes take a very long time. Is there a way around this problem, e.g. another way to convert the files to a smaller format?

eselmeister commented 8 months ago

@AnnaClo Could you provide a few chromatograms for testing purposes?

AnnaClo commented 8 months ago

Yes, sure. Here is one injection, sent via WeTransfer because the file is too large to attach:

https://we.tl/t-Jn1PGpPzfp

eselmeister commented 8 months ago

Is this what you expect your chromatogram to look like?

Screenshot from 2023-11-29 11-34-18

I was able to pre-process it and reduce the size from 1.3 GB down to 7.4 MB. It can be done with the process methods.

Screenshot from 2023-11-29 11-42-14

Condensed Chromatogram: BAME_std1A 1 v1.zip

AnnaClo commented 8 months ago

Yes, that's how I expect the chromatogram to look. However, the v1 file (.ocb) currently shows up as an empty chromatogram for me. This has happened with other files too: after processing the raw file and saving the result as .ocb, the saved file turns out to be empty.

The problem we have is that the preprocessing itself takes very long (e.g. the baseline subtraction filter alone already takes a lot of time). Is there a preprocessing step that reduces the size of the file and that we can run before all the others?


eselmeister commented 8 months ago

Are you using the latest version of OpenChrom? Could you share the version number (Menu > Help > About)?

AnnaClo commented 8 months ago

Here it is: version 1.5.0

eselmeister commented 8 months ago

All right, we have updated the *.ocb format recently: https://github.com/eclipse/chemclipse/issues/1326

The *.ocb file I uploaded uses the latest format version, 1.5.0.1 (McLafferty v2). Could you try to open it with the latest OpenChrom release: https://openchrom.net/download

AnnaClo commented 8 months ago

OK, I did manage to open the .ocb with the latest version of OpenChrom. Once the .raw is converted to .ocb, the pre-processing steps take a reasonable time. What takes a lot of time is either converting the .raw to .ocb or running the pre-processing directly on the .raw file. Is there any step that can get around that?

eselmeister commented 8 months ago

At a certain point, the high-res data needs to be compressed to nominal mass. If your machine has enough RAM, you could set up a batch process and convert the data overnight.
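
To illustrate what compressing high-res data to nominal mass means, here is a minimal Python sketch. It is not OpenChrom's implementation: the function name `to_nominal_mass`, the sample m/z values, and the round-half-up rule are all assumptions made for this example. The idea is that every high-resolution m/z value in a scan is assigned to its nearest integer mass and the intensities falling on the same integer are summed, which is why the condensed file is so much smaller.

```python
import math
from collections import defaultdict


def to_nominal_mass(scan):
    """Condense one scan of (m/z, intensity) pairs to unit-mass resolution.

    Hypothetical helper for illustration only; the rounding rule
    (round half up) is an assumption, not OpenChrom's documented behavior.
    """
    binned = defaultdict(float)
    for mz, intensity in scan:
        nominal = math.floor(mz + 0.5)  # nearest integer m/z, halves round up
        binned[nominal] += intensity    # sum intensities sharing a nominal mass
    return sorted(binned.items())


# Made-up example scan: two high-res signals near m/z 73 collapse into one.
high_res_scan = [
    (73.0468, 1200.0),
    (73.0471, 800.0),
    (74.0502, 150.0),
    (147.0655, 500.0),
]

print(to_nominal_mass(high_res_scan))
```

Because this pass has to touch every data point in every scan of the high-res file, it is the expensive step; everything that runs afterwards works on far fewer points.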

AnnaClo commented 6 months ago

Ok, thank you for helping. Indeed, we managed it by letting the batch process run for several hours.

eselmeister commented 6 months ago

You're welcome.