adelabriere / SLAW

GNU General Public License v2.0

ms2_id and num_clustered_ms2 missing w/ CENTWAVE #5

Open stolltho opened 3 years ago

stolltho commented 3 years ago

Hi Alexis

The ms2_id and num_clustered_ms2 columns are missing from all CSV output files when CENTWAVE is used, but they are present when openMS is used.

Also, I don't understand how the ms2_id information is linked to fused_mgf. For example, when running your demo data, a feature has an ms2_id value of "295(e27.5)". What is 295, and where do I find it in the fused_mgf?

Thanks, Thomas

adelabriere commented 3 years ago

295 is the position of the corresponding MS-MS spectrum in the mgf, and (e27.5) is the energy. If there are multiple energies, they are split. I'll try to reproduce the centwave bug.
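To make that mapping concrete, here is a minimal Python sketch (not part of SLAW) that parses an ms2_id value and looks up the matching spectrum in an MGF file. The function names are hypothetical. Two points are assumptions, because the thread does not state them: that the position counts from 1, and that several entries can share one cell (the regex simply picks up every `position(eENERGY)` pair regardless of the separator).

```python
import re
from typing import List, Tuple

# Matches entries such as "295(e27.5)": an integer position, then "(e<energy>)".
_MS2_ID = re.compile(r"(\d+)\(e(\d+(?:\.\d+)?)\)")


def parse_ms2_id(value: str) -> List[Tuple[int, float]]:
    """Return every (position, energy) pair found in an ms2_id cell."""
    return [(int(pos), float(energy)) for pos, energy in _MS2_ID.findall(value)]


def read_mgf_blocks(text: str) -> List[List[str]]:
    """Split MGF text into spectra: one list of lines per BEGIN IONS/END IONS block."""
    blocks: List[List[str]] = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "BEGIN IONS":
            current = []
        elif line == "END IONS":
            if current is not None:
                blocks.append(current)
            current = None
        elif current is not None and line:
            current.append(line)
    return blocks


def spectrum_at(blocks: List[List[str]], position: int, first_index: int = 1) -> List[str]:
    """Look up a spectrum by position. first_index=1 is an ASSUMPTION; use 0 if
    the positions in your output turn out to be zero-based."""
    index = position - first_index
    if index < 0 or index >= len(blocks):
        raise IndexError(f"position {position} is outside the MGF ({len(blocks)} spectra)")
    return blocks[index]
```

For "295(e27.5)", `parse_ms2_id` yields position 295 and energy 27.5; passing 295 to `spectrum_at` with the blocks read from the fused MGF returns the header and peak lines of that spectrum, which can then be checked against the feature's m/z.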

adelabriere commented 3 years ago

I can't reproduce the bug, so it seems related to your mzML file or the setup. @stolltho, would you be kind enough to run slaw in debug mode (docker run --rm -e LOGGING=DEBUG -v your_input_path:/input -v your_output_path:/output slaw) and paste the output? Sending me an .mzML file and the parameters would also be useful.

stolltho commented 3 years ago

Hi Alexis

I can't reproduce it either. This time I ran a smaller data set (20 files vs. 350 previously) in debug mode (docker run --rm -e LOGGING=DEBUG -v C:\SLAW\AV_input:/input -v C:\SLAW\AV_output:/output adelabriere/slaw:latest). See the output below.

Cheers, Thomas

PS N:> docker run --rm -e LOGGING=DEBUG -v C:\SLAW\AV_input:/input -v C:\SLAW\AV_output:/output adelabriere/slaw:latest
2021-07-22|03:21:26|INFO: Total memory available: 100956 and 16 cores. The workflow will use 1257 Mb by core on 15 cores.
2021-07-22|03:21:26|INFO: Guessing polarity from file:G_1_1r.mzML
2021-07-22|03:21:30|INFO: Polarity detected: positive
2021-07-22|03:21:31|INFO: STEP: initialisation TOTAL_TIME:5.46s LAST_STEP:5.46s
2021-07-22|03:21:33|INFO: 12 peakpicking to do.
2021-07-22|03:22:57|DEBUG: Create profile matrix with method 'bin' and step 1 ... OK Detecting mass traces at 21.3114 ppm ... OK Detecting chromatographic peaks in 38472 regions of interest ... OK: 7134 found.
2021-07-22|03:23:02|DEBUG: Create profile matrix with method 'bin' and step 1 ... OK Detecting mass traces at 21.3114 ppm ... OK Detecting chromatographic peaks in 38819 regions of interest ... OK: 7259 found.
2021-07-22|03:23:03|DEBUG: Create profile matrix with method 'bin' and step 1 ... OK Detecting mass traces at 21.3114 ppm ... OK Detecting chromatographic peaks in 39335 regions of interest ... OK: 7198 found.
2021-07-22|03:23:32|DEBUG: Create profile matrix with method 'bin' and step 1 ... OK Detecting mass traces at 21.3114 ppm ... OK Detecting chromatographic peaks in 53499 regions of interest ... OK: 15280 found.
2021-07-22|03:23:45|DEBUG: Create profile matrix with method 'bin' and step 1 ... OK Detecting mass traces at 21.3114 ppm ... OK Detecting chromatographic peaks in 56848 regions of interest ... OK: 16576 found.
2021-07-22|03:23:48|DEBUG: Create profile matrix with method 'bin' and step 1 ... OK Detecting mass traces at 21.3114 ppm ... OK Detecting chromatographic peaks in 58662 regions of interest ... OK: 16939 found.
2021-07-22|03:23:56|DEBUG: Create profile matrix with method 'bin' and step 1 ... OK Detecting mass traces at 21.3114 ppm ... OK Detecting chromatographic peaks in 59607 regions of interest ... OK: 19053 found.
2021-07-22|03:24:00|DEBUG: Create profile matrix with method 'bin' and step 1 ... OK Detecting mass traces at 21.3114 ppm ... OK Detecting chromatographic peaks in 62762 regions of interest ... OK: 20624 found.
2021-07-22|03:24:04|DEBUG: Create profile matrix with method 'bin' and step 1 ... OK Detecting mass traces at 21.3114 ppm ... OK Detecting chromatographic peaks in 64009 regions of interest ... OK: 21443 found.
2021-07-22|03:24:07|DEBUG: Create profile matrix with method 'bin' and step 1 ... OK Detecting mass traces at 21.3114 ppm ... OK Detecting chromatographic peaks in 66839 regions of interest ... OK: 21861 found.
2021-07-22|03:24:07|DEBUG: Create profile matrix with method 'bin' and step 1 ... OK Detecting mass traces at 21.3114 ppm ... OK Detecting chromatographic peaks in 63875 regions of interest ... OK: 21507 found.
2021-07-22|03:24:10|DEBUG: Create profile matrix with method 'bin' and step 1 ... OK Detecting mass traces at 21.3114 ppm ... OK Detecting chromatographic peaks in 66289 regions of interest ... OK: 22283 found.
2021-07-22|03:24:44|DEBUG: Extracting all MS-MS spectra.
2021-07-22|03:24:44|DEBUG: Linking to: OpenSSL 1.1.1f 31 Mar 2020
MS-MS spectra extraction finished
2021-07-22|03:24:44|INFO: MS2 extraction finished
2021-07-22|03:24:44|INFO: STEP: peakpicking TOTAL_TIME:197.60s LAST_STEP:192.14s
2021-07-22|03:24:44|INFO: Aligning
2021-07-22|03:57:49|INFO: Filtering
2021-07-22|03:57:52|INFO: Extracting consensus MS-MS spectra
2021-07-22|03:59:01|DEBUG: Found 3526 features with associated MS-MS spectra
Warning messages:
1: In seq_ms2_idx[pos_dm[o_dm_idx[first_spec:last_spec]] - firstLine + : number of items to replace is not a multiple of replacement length
2: In seq_num_ms2[pos_dm[o_dm_idx[first_spec:last_spec]] - firstLine + : number of items to replace is not a multiple of replacement length
2021-07-22|03:59:01|INFO: Alignment finished
2021-07-22|03:59:01|INFO: STEP: alignment TOTAL_TIME:2255.13s LAST_STEP:2057.53s
|======================================================================| 100%

|======================================================================| 100%

|======================================================================| 100%

|======================================================================| 100%

|======================================================================| 100%
2021-07-22|04:00:29|INFO: Gap filling and isotopic pattern extraction finished.
2021-07-22|04:00:29|INFO: STEP: gap-filling TOTAL_TIME:2342.57s LAST_STEP:87.45s
|======================================================================| 100%

|======================================================================| 100%

|======================================================================| 100%

|======================================================================| 100%
2021-07-22|04:05:31|DEBUG: Processing batch 1 Variables:1-5001current cliques size is 10000
Processing batch 2 Variables:2501-7501current cliques size is 592
Processing batch 3 Variables:5001-8267current cliques size is 975
Annotating
Converting features.
Warning message: In sink(NULL) : no sink to remove
Building simplified data-matrix in 1 batch(es).
Building full data-matrix in 1 batch(es).
2021-07-22|04:05:31|INFO: Annotation finished
2021-07-22|04:05:31|INFO: STEP: annotation TOTAL_TIME:2645.31s LAST_STEP:302.73s
2021-07-22|04:05:31|INFO: Processing finished.

adelabriere commented 3 years ago

Thanks. Still, something is abnormal in this debug output: (Warning messages: 1: In seq_ms2_idx[pos_dm[o_dm_idx[first_spec:last_spec]] - firstLine + : number of items to replace is not a multiple of replacement length 2: In seq_num_ms2[pos_dm[o_dm_idx[first_spec:last_spec]] - firstLine + : number of items to replace is not a multiple of replacement length)

I'll investigate.
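While this is being investigated, anyone wanting to check whether a given run is affected could scan the output folder for CSV files that lack the two columns. The following is a rough Python sketch, not part of SLAW. The function names are hypothetical, and it assumes the output files are comma-separated with the column names in the first row.

```python
import csv
from pathlib import Path
from typing import Dict, List, Sequence

# Column names taken from the issue title.
EXPECTED = ("ms2_id", "num_clustered_ms2")


def missing_columns(csv_path, expected: Sequence[str] = EXPECTED) -> List[str]:
    """Return the expected column names absent from the file's header row."""
    with open(csv_path, newline="") as handle:
        header = next(csv.reader(handle), [])  # an empty file gives an empty header
    return [name for name in expected if name not in header]


def files_missing_columns(output_dir) -> Dict[str, List[str]]:
    """Map each CSV under output_dir (searched recursively) to its missing columns."""
    root = Path(output_dir)
    result: Dict[str, List[str]] = {}
    for path in sorted(root.rglob("*.csv")):
        missing = missing_columns(path)
        if missing:
            result[path.relative_to(root).as_posix()] = missing
    return result
```

Calling `files_missing_columns` on the folder mounted as /output returns an empty dictionary when every CSV carries both columns, and otherwise lists the affected files, which would be useful to include in a follow-up report.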