gtluu / timsconvert

https://gtluu.github.io/timsconvert/
Apache License 2.0

parameter --imzml_mode continuous doesn't make a difference #63

Closed: animesh closed this issue 3 weeks ago

animesh commented 1 month ago

I have tried running with and without --imzml_mode continuous, but there seems to be no difference in the output:

ls -ltrh /cluster/work/users/ash022/240605_200ngHelaQC_DDAlong_Slot1-54_1_7643.mzML /cluster/work/users/ash022/mzML/240605_200ngHelaQC_DDAlong_Slot1-54_1_7643.mzML
-rw-rw-r-- 1 ash022 ash022 4.7G Jul 18 19:45 /cluster/work/users/ash022/mzML/240605_200ngHelaQC_DDAlong_Slot1-54_1_7643.mzML
-rw-rw-r-- 1 ash022 ash022 4.7G Jul 25 19:21 /cluster/work/users/ash022/240605_200ngHelaQC_DDAlong_Slot1-54_1_7643.mzML
diff /cluster/work/users/ash022/mzML/240605_200ngHelaQC_DDAlong_Slot1-54_1_7643.mzML /cluster/work/users/ash022/240605_200ngHelaQC_DDAlong_Slot1-54_1_7643.mzML
5c5
<       <cv fullName="PSI-MS" URI="http://purl.obolibrary.org/obo/ms/psi-ms.obo" version="4.1.160" id="PSI-MS"/>
---
>       <cv fullName="PSI-MS" URI="http://purl.obolibrary.org/obo/ms/psi-ms.obo" version="4.1.163" id="PSI-MS"/>
10d9
<         <cvParam cvRef="PSI-MS" accession="MS:1000579" name="MS1 spectrum" value=""/>
11a11
>         <cvParam cvRef="PSI-MS" accession="MS:1000579" name="MS1 spectrum" value=""/>
9507786c9507786
<   <fileChecksum>7e110bc7cb70c9917bd9b751bdcf252c2f154f70</fileChecksum>
---
>   <fileChecksum>fef85748fa23743578236b4a9796764b2bb8c10e</fileChecksum>

Is continuous the default mode?

timsconvert --chunk_size 10000000 --imzml_mode continuous --verbose --input /cluster/work/users/ash022/veronica/240605_200ngHelaQC_DDAlong_Slot1-54_1_7643.d
2024-07-25T17:15:10.964476:Initialize Bruker .dll file...
2024-07-25T17:15:11.554368:Loading input data...
2024-07-25T17:15:11.554655:Reading file: /cluster/work/users/ash022/veronica/240605_200ngHelaQC_DDAlong_Slot1-54_1_7643.d
2024-07-25T17:18:45.673998:input: /cluster/work/users/ash022/veronica/240605_200ngHelaQC_DDAlong_Slot1-54_1_7643.d
2024-07-25T17:18:45.674202:outdir: /cluster/work/users/ash022/veronica
2024-07-25T17:18:45.674281:outfile:
2024-07-25T17:18:45.674353:mode: centroid
2024-07-25T17:18:45.674434:compression: zlib
2024-07-25T17:18:45.674521:ms2_only: False
2024-07-25T17:18:45.674599:exclude_mobility: False
2024-07-25T17:18:45.674661:encoding: 64
2024-07-25T17:18:45.674722:barebones_metadata: False
2024-07-25T17:18:45.674783:profile_bins: 0
2024-07-25T17:18:45.674842:maldi_output_file: combined
2024-07-25T17:18:45.674901:maldi_plate_map:
2024-07-25T17:18:45.674960:imzml_mode: continuous
2024-07-25T17:18:45.675018:chunk_size: 10000000
2024-07-25T17:18:45.675078:verbose: True
2024-07-25T17:18:45.675136:version: 1.6.5
2024-07-25T17:18:45.675196:infile: /cluster/work/users/ash022/veronica/240605_200ngHelaQC_DDAlong_Slot1-54_1_7643.d
2024-07-25T17:18:45.675259:.tdf file detected...
2024-07-25T17:18:45.675335:Processing LC-TIMS-MS data...
2024-07-25T17:18:45.675404:Initializing mzML Writer...
2024-07-25T17:18:46.879515:Initializing controlled vocabularies...
2024-07-25T17:18:47.289197:Writing mzML metadata...
2024-07-25T17:18:47.299814:Writing data to .mzML file /cluster/work/users/ash022/veronica/240605_200ngHelaQC_DDAlong_Slot1-54_1_7643.mzML...
2024-07-25T17:18:47.300075:Calculating number of spectra...
2024-07-25T17:18:47.359149:Parsing and writing Frame 1...
2024-07-25T19:21:57.842307:Renaming mzML file...
2024-07-25T19:21:57.846395:Finished writing to .mzML file /cluster/work/users/ash022/veronica/240605_200ngHelaQC_DDAlong_Slot1-54_1_7643.mzML...

Is it used even if specified otherwise?

timsconvert --chunk_size 5000000000 --verbose /cluster/work/users/ash022/veronica/240605_200ngHelaQC_DDAlong_Slot1-54_1_7643.d
2024-07-18T18:01:02.147979:Initialize Bruker .dll file...
2024-07-18T18:01:03.562431:Loading input data...
2024-07-18T18:01:03.562633:Reading file: /cluster/work/users/ash022/veronica/240605_200ngHelaQC_DDAlong_Slot1-54_1_7643.d
2024-07-18T18:04:02.166074:input: /cluster/work/users/ash022/veronica/240605_200ngHelaQC_DDAlong_Slot1-54_1_7643.d
2024-07-18T18:04:02.166245:outdir: /cluster/work/users/ash022/veronica
2024-07-18T18:04:02.166319:outfile:
2024-07-18T18:04:02.166385:mode: centroid
2024-07-18T18:04:02.166449:compression: zlib
2024-07-18T18:04:02.166529:ms2_only: False
2024-07-18T18:04:02.166610:exclude_mobility: False
2024-07-18T18:04:02.166668:encoding: 64
2024-07-18T18:04:02.166723:barebones_metadata: False
2024-07-18T18:04:02.166779:profile_bins: 0
2024-07-18T18:04:02.166833:maldi_output_file: combined
2024-07-18T18:04:02.166887:maldi_plate_map:
2024-07-18T18:04:02.166941:imzml_mode: processed
2024-07-18T18:04:02.166994:chunk_size: 10000000
2024-07-18T18:04:02.167049:verbose: True
2024-07-18T18:04:02.167103:version: 1.6.5
2024-07-18T18:04:02.167157:infile: /cluster/work/users/ash022/veronica/240605_200ngHelaQC_DDAlong_Slot1-54_1_7643.d
2024-07-18T18:04:02.167215:.tdf file detected...
2024-07-18T18:04:02.167283:Processing LC-TIMS-MS data...
2024-07-18T18:04:02.167346:Initializing mzML Writer...
2024-07-18T18:04:03.407497:Initializing controlled vocabularies...
2024-07-18T18:04:03.834138:Writing mzML metadata...
2024-07-18T18:04:03.843271:Writing data to .mzML file /cluster/work/users/ash022/veronica/240605_200ngHelaQC_DDAlong_Slot1-54_1_7643.mzML...
2024-07-18T18:04:03.843458:Calculating number of spectra...
2024-07-18T18:04:03.893328:Parsing and writing Frame 1...
2024-07-18T19:45:06.929689:Renaming mzML file...
2024-07-18T19:45:06.931902:Finished writing to .mzML file /cluster/work/users/ash022/veronica/240605_200ngHelaQC_DDAlong_Slot1-54_1_7643.mzML...

Or does the mode get fixed based on the input?

gtluu commented 1 month ago

Hi @animesh, apologies if the documentation was unclear. Since you are working with LC-MS data, the data is converted to an mzML file. The --imzml_mode parameter is meant for mass spectrometry imaging datasets from the MALDI source on the timsTOF fleX, which are converted to imzML files instead of mzML. Therefore, for your datasets, it makes no difference what you set this parameter to. I will work on updating the documentation to explicitly mention this.
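
To illustrate the behaviour described above, here is a minimal sketch of the decision. It is not the actual TIMSCONVERT code: the function `plan_output` and its `is_imaging` argument are hypothetical names made up for this example. The default of "processed" is taken from the log of the run that did not pass --imzml_mode.

```python
def plan_output(is_imaging, imzml_mode="processed"):
    """Return (output format, imzML mode or None) for one dataset.

    Hypothetical helper for illustration only; not part of TIMSCONVERT.
    """
    if is_imaging:
        # Only imaging data ends up as imzML, so only here does the mode matter.
        if imzml_mode not in ("processed", "continuous"):
            raise ValueError(f"unknown imzML mode: {imzml_mode!r}")
        return "imzML", imzml_mode
    # LC-MS data is written as mzML; imzml_mode is never consulted.
    return "mzML", None


# The same LC-MS dataset with and without the flag gives the same plan.
print(plan_output(False))
print(plan_output(False, "continuous"))
print(plan_output(True, "continuous"))
```

Under this assumption, the two LC-MS runs in the logs above would differ only in the logged imzml_mode value, not in the mzML that gets written.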

animesh commented 1 month ago

My bad, I thought the "i" was for ion mobility 🤪 How does TIMSCONVERT deal with that, BTW?

gtluu commented 3 weeks ago

TIMSCONVERT handles ion mobility data by essentially adding a third binary data array for the 1/K0 values. In these mzML/imzML files, the 3 binary data arrays are derived from a long dataframe containing the m/z array, intensity array, and 1/K0 array that are pulled using the TDF-SDK.
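
As a rough illustration of the three-array layout, the sketch below splits a "long" table of peaks into three parallel arrays and encodes each the way mzML binary data arrays are commonly stored with the settings shown in the logs (64-bit floats, zlib compression, then base64). It uses only the Python standard library, not the TDF-SDK or TIMSCONVERT itself; the peak values are made up, and the dictionary key "1/K0 array" is a placeholder label, since the thread does not say which controlled-vocabulary name the file uses for that array.

```python
import base64
import struct
import zlib


def encode_array(values):
    """Pack floats as little-endian 64-bit, zlib-compress, then base64-encode."""
    raw = struct.pack(f"<{len(values)}d", *values)
    return base64.b64encode(zlib.compress(raw)).decode("ascii")


def decode_array(text):
    """Reverse encode_array: base64-decode, decompress, unpack 64-bit floats."""
    raw = zlib.decompress(base64.b64decode(text))
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))


# Made-up peaks in long form: one (m/z, intensity, 1/K0) row per peak.
rows = [
    (400.25, 1200.0, 0.85),
    (400.25, 300.0, 0.91),
    (785.5, 560.0, 1.10),
]

# Split the rows into three parallel columns of equal length.
mz, intensity, inv_k0 = (list(col) for col in zip(*rows))

# One spectrum carries three encoded arrays instead of the usual two.
spectrum = {
    "m/z array": encode_array(mz),
    "intensity array": encode_array(intensity),
    "1/K0 array": encode_array(inv_k0),  # placeholder label (assumption)
}

decoded = {name: decode_array(text) for name, text in spectrum.items()}
for name, values in decoded.items():
    print(name, values)
```

Because the three arrays stay the same length, index i in each one refers to the same peak, so a reader can rebuild the (m/z, intensity, 1/K0) rows by zipping the decoded arrays back together.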