nf-core / metaboigniter

Pre-processing of mass spectrometry-based metabolomics data with quantification and identification based on MS1 and MS2 data.
https://nf-co.re/metaboigniter
MIT License

negative run error #80

Closed YoujiaMa closed 7 months ago

YoujiaMa commented 10 months ago

Description of the bug

I have tried many times and always get the same error message when running negative-mode data.

Command used and terminal output

Command error:
  Adding neutral: ---------- Adduct -----------------
  Charge: 0
  Amount: 1
  MassSingle: -18.0106
  Formula: H-2O-1
  log P: -2.99573

  Adding neutral: ---------- Adduct -----------------
  Charge: 0
  Amount: 1
  MassSingle: 46.0055
  Formula: C1H2O2
  log P: -0.693147

  MassExplainer table size: 4
  Error: Unexpected internal error (WARNING!!! implicit number of default adduct is negative!!! left:-1 right: -1
  )
  Generating Masses with threshold: -2.99573 ...
  done

Relevant files

No response

System information

No response

YoujiaMa commented 10 months ago

I set skip_adduct_detection=true and the Nextflow pipeline finished. However, my SIRIUS results contain a lot of positive adducts, which seems implausible for negative-mode data. I found that Linked_data_part1.mgf and Linked_data_part1.ms have different charge values in NFCORE_METABOIGNITER:METABOIGNITER:IDENTIFICATION:PYOPENMS_GENERATESEARCHPARAMS:

$ grep CHARGE Linked_data_part1.mgf | head
CHARGE=1-
CHARGE=1-
CHARGE=1-
CHARGE=1-
CHARGE=1-
CHARGE=1-
CHARGE=1-
CHARGE=1-
CHARGE=1-

$ grep charge Linked_data_part1.ms | head
charge 1
charge 1
charge 1
charge 1
charge 1
charge -1
charge 1
charge 1
charge 1
charge 1
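One way to quantify the mismatch shown above is to tally the charge signs in both files. The sketch below is a minimal, hypothetical helper (its function names are not part of the pipeline). It assumes that MGF charges look like `CHARGE=1-` and that the `.ms` file carries lines of the form `charge -1`, optionally prefixed with `>`; both formats are inferred from the grep output, not from the pipeline source.

```python
import re
from collections import Counter

# Assumed line formats (inferred from the grep output in this thread):
#   MGF:  CHARGE=1-   or  CHARGE=2+
#   .ms:  charge -1   (optionally written as ">charge -1")
MGF_CHARGE = re.compile(r"^CHARGE=(\d+)([+-]?)$")
MS_CHARGE = re.compile(r"^>?charge\s+(-?\d+)$")


def sign_counts_mgf(lines):
    """Count charge signs in MGF-style lines; other lines are ignored."""
    counts = Counter()
    for line in lines:
        m = MGF_CHARGE.match(line.strip())
        if m:
            # A missing sign suffix is counted as positive (assumption).
            counts["negative" if m.group(2) == "-" else "positive"] += 1
    return counts


def sign_counts_ms(lines):
    """Count charge signs in .ms-style lines; other lines are ignored."""
    counts = Counter()
    for line in lines:
        m = MS_CHARGE.match(line.strip())
        if m:
            # Zero and positive values are both counted as "positive" here.
            counts["negative" if int(m.group(1)) < 0 else "positive"] += 1
    return counts


if __name__ == "__main__":
    # Same values as the two grep outputs above.
    mgf = ["CHARGE=1-"] * 9
    ms = ["charge 1"] * 5 + ["charge -1"] + ["charge 1"] * 4
    print(dict(sign_counts_mgf(mgf)))
    print(dict(sign_counts_ms(ms)))
```

For a negative-mode run, any "positive" count in either file would point to the kind of sign inconsistency reported here.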

PayamEmami commented 10 months ago

Hello, I'm looking into this now. Is it possible for you to share the settings you used to run the pipeline?

YoujiaMa commented 10 months ago

The parameters are the same as full_test; only polarity is set to "negative":

identification = false
polarity = "negative"
ms2_collection_model = "separate"
run_sirius = true
sirius_split = true
mgf_splitmgf_pyopenms = 100
run_ms2query = true
requantification = true

PayamEmami commented 10 months ago

Thanks. I found the problem and am working on a fix. It should be in the dev branch very soon.

YoujiaMa commented 10 months ago

https://github.com/nf-core/metaboigniter/blob/master/bin/generate_ms_params.py#L406

I think there is a bug when charge == 1 and polarity == "negative": charge_f comes out as 1.
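The expected behaviour can be sketched as follows. This is a minimal illustration, not the code in generate_ms_params.py: the function name `signed_charge` is hypothetical, and it assumes that the charge should always carry the sign of the run polarity, with an unset charge (0) treated as 1.

```python
def signed_charge(charge: int, polarity: str) -> int:
    """Return a charge whose sign follows the acquisition polarity.

    Assumptions (not taken from the pipeline source): a charge of 0 means
    "unknown" and is treated as 1, and polarity is either "positive" or
    "negative".
    """
    if polarity not in ("positive", "negative"):
        raise ValueError(f"unexpected polarity: {polarity!r}")
    magnitude = 1 if charge == 0 else abs(charge)
    sign = 1 if polarity == "positive" else -1
    return sign * magnitude


if __name__ == "__main__":
    # The case from this report: charge 1 in a negative-mode run.
    print(signed_charge(1, "negative"))
```

Under these assumptions, a feature with charge 1 in a negative-mode run is written out with charge -1 instead of 1.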

PayamEmami commented 10 months ago

Yes, exactly. This must be replaced with

charge_f = (1 if polarity == "positive" else -1) * (1 if charge_f == 0 else abs(charge_f))

PayamEmami commented 10 months ago

Regarding

Command error:
  Adding neutral: ---------- Adduct -----------------
  Charge: 0
  Amount: 1
  MassSingle: -18.0106
  Formula: H-2O-1
  log P: -2.99573

  Adding neutral: ---------- Adduct -----------------
  Charge: 0
  Amount: 1
  MassSingle: 46.0055
  Formula: C1H2O2
  log P: -0.693147

  MassExplainer table size: 4
  Error: Unexpected internal error (WARNING!!! implicit number of default adduct is negative!!! left:-1 right: -1
  )
  Generating Masses with threshold: -2.99573 ...
  done

you can set the adducts_neg parameter to something like "H-1:-:0.8 H-3O-1:-:0.2". We will change the defaults later.
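For reference, each entry in such an adduct string appears to follow a `formula:charge:probability` layout, and the `log P` values in the error output are consistent with the natural logarithm of a probability (ln 0.05 ≈ -2.99573, ln 0.5 ≈ -0.693147). The sketch below is a hypothetical parser, not part of the pipeline or of OpenMS, that splits a string in this layout and reports the log probabilities. The function name and the field layout are assumptions made for illustration.

```python
import math
import re

# Assumed charge field: "0", or one or more "+" / "-" characters.
CHARGE_FIELD = re.compile(r"^(\++|-+|0)$")


def parse_adducts(spec: str):
    """Parse whitespace-separated 'formula:charge:probability' entries."""
    adducts = []
    for entry in spec.split():
        fields = entry.split(":")
        if len(fields) < 3:
            raise ValueError(f"expected formula:charge:probability, got {entry!r}")
        formula, charge, prob = fields[0], fields[1], float(fields[2])
        if not CHARGE_FIELD.match(charge):
            raise ValueError(f"unexpected charge field in {entry!r}")
        if not 0.0 < prob <= 1.0:
            raise ValueError(f"probability must be in (0, 1], got {prob}")
        adducts.append(
            {"formula": formula, "charge": charge, "prob": prob, "log_p": math.log(prob)}
        )
    return adducts


if __name__ == "__main__":
    for adduct in parse_adducts("H-1:-:0.8 H-3O-1:-:0.2"):
        print(adduct["formula"], adduct["charge"], round(adduct["log_p"], 5))
```

A check like this makes it easy to confirm, before launching a run, that every entry in a negative-mode adduct list actually carries a `-` charge.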

YoujiaMa commented 10 months ago

Thanks, I will try it.

YoujiaMa commented 9 months ago

Hi metaboigniter team,

I found that the output path name becomes very long when we run many samples (such as the test data):

ls results/alignment_mzml/
'[X2_Rep1, X3_Rep1, X6_Rep1, Pilot_MS_Control_2_peakpicked, Pilot_MS_Pool_2_peakpicked]'

https://github.com/nf-core/metaboigniter/blob/master/conf/modules.config#L175

PayamEmami commented 9 months ago

Thanks. It will be addressed.

YoujiaMa commented 8 months ago

Hi metaboigniter team, I tried 41 MS12 samples and ran into a new problem:

[14/854763] process > NFCORE_METABOIGNITER:METABOIGNITER:QUANTIFICATION:OPENMS_FEATUREFINDERMETABO (N29_neg_IS)          [100%] 41 of 41, cached: 41 ✔
[26/a88e18] process > NFCORE_METABOIGNITER:METABOIGNITER:QUANTIFICATION:OPENMS_MAPALIGNERPOSECLUSTERING (Multiple files) [100%] 1 of 1, cached: 1 ✔
[da/4f406d] process > NFCORE_METABOIGNITER:METABOIGNITER:QUANTIFICATION:OPENMS_MAPRTTRANSFORMER (G27_neg_IS)             [100%] 39 of 39, cached: 37 ✔

I found that OPENMS_MAPRTTRANSFORMER only ran on 39 samples. I resumed twice, and each time OPENMS_MAPRTTRANSFORMER restarted with a different number of samples:

[82/02db78] process > NFCORE_METABOIGNITER:METABOIGNITER:QUANTIFICATION:OPENMS_FEATUREFINDERMETABO (N69_neg_IS)          [100%] 41 of 41, cached: 41 ✔
[26/a88e18] process > NFCORE_METABOIGNITER:METABOIGNITER:QUANTIFICATION:OPENMS_MAPALIGNERPOSECLUSTERING (Multiple files) [100%] 1 of 1, cached: 1 ✔
[d2/4b2173] process > NFCORE_METABOIGNITER:METABOIGNITER:QUANTIFICATION:OPENMS_MAPRTTRANSFORMER (N60_neg_IS)             [100%] 35 of 35, cached: 35 ✔
[33/ee8d44] process > NFCORE_METABOIGNITER:METABOIGNITER:QUANTIFICATION:OPENMS_FEATUREFINDERMETABO (G38_neg_IS)          [100%] 41 of 41, cached: 41 ✔
[26/a88e18] process > NFCORE_METABOIGNITER:METABOIGNITER:QUANTIFICATION:OPENMS_MAPALIGNERPOSECLUSTERING (Multiple files) [100%] 1 of 1, cached: 1 ✔
[bd/cb9a38] process > NFCORE_METABOIGNITER:METABOIGNITER:QUANTIFICATION:OPENMS_MAPRTTRANSFORMER (N14_neg_IS)             [100%] 36 of 36, cached: 35 ✔

PayamEmami commented 8 months ago

Thanks a lot for finding this bug.

https://github.com/nf-core/metaboigniter/blob/2f8f077d38fcacd2caef9590dc557ddcc17c78c6/subworkflows/local/quantification.nf#L61-L72

These have to be replaced by

combined_data = quantified_features
    .map { meta, featurexml -> tuple(featurexml.baseName, meta) }
    .join(OPENMS_MAPALIGNERPOSECLUSTERING.out.featurexml.map { it[1] }.flatten()
        .map { featurexml -> tuple(featurexml.baseName, featurexml) })
    .join(OPENMS_MAPALIGNERPOSECLUSTERING.out.trafoxml.map { it[1] }.flatten()
        .map { trafoxml -> tuple(trafoxml.baseName, trafoxml) })
    .join(quantificaiton_data.map { meta, mzml -> tuple(mzml.baseName, mzml) })
    .map { it[1..4] }

I will have to check whether there are other places where the grouping should be replaced.
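The idea behind the replacement is to key every channel on the file base name and join on that key, so that files are matched by name and not by arrival order. It can be sketched in plain Python. This is an illustration only: the function names are hypothetical, and `PurePath.stem` stands in for Nextflow's `baseName`, on the assumption that both strip only the last extension.

```python
from pathlib import PurePath


def key_by_stem(paths):
    """Map file stem (name without its last extension) to the path."""
    return {PurePath(p).stem: p for p in paths}


def join_by_stem(meta_by_feature, aligned_features, trafos, mzmls):
    """Inner-join four collections on the file stem.

    meta_by_feature is a list of (meta, featurexml_path) pairs. Only stems
    present in all four inputs are returned, as
    (meta, aligned featureXML, trafoXML, mzML) tuples.
    """
    metas = {PurePath(f).stem: meta for meta, f in meta_by_feature}
    aligned = key_by_stem(aligned_features)
    trafo = key_by_stem(trafos)
    mzml = key_by_stem(mzmls)
    common = metas.keys() & aligned.keys() & trafo.keys() & mzml.keys()
    return [(metas[k], aligned[k], trafo[k], mzml[k]) for k in sorted(common)]


if __name__ == "__main__":
    rows = join_by_stem(
        [({"id": "A"}, "A.featureXML"), ({"id": "B"}, "B.featureXML")],
        ["out/B.featureXML", "out/A.featureXML"],  # deliberately out of order
        ["B.trafoXML", "A.trafoXML"],
        ["A.mzML", "B.mzML"],
    )
    for row in rows:
        print(row)
```

Because matching is done on the key, the order in which the aligner emits its outputs no longer affects which transformation is paired with which sample.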

YoujiaMa commented 8 months ago

In the NFCORE_METABOIGNITER:METABOIGNITER:IDENTIFICATION:PYOPENMS_MSMAPPING process, my negative-mode mzML sometimes cannot be mapped to linked_data. (I am not sure, but I suspect the charge in my mzML is 1 while the charge in linked_data is -1, so the mapping fails.) Can I add the line below to the code at https://github.com/nf-core/metaboigniter/blob/master/bin/extract_mapping.py#L42?

parameters.setValue("ignore_charge", "true")

YoujiaMa commented 8 months ago

I also found that the mz and rt values in my test link_data are swapped. Example:

<groupedElementList>
        <element map="0" id="1846111275860413" rt="540.393717669934404" mz="341.923878832888192" it="3.075653e04" charge="1"/>
        <element map="1" id="9952900658436292523" rt="548.849116174641267" mz="341.923931444177526" it="1.836938e05" charge="1"/>
        <element map="2" id="2972814306089124318" rt="540.956951552158557" mz="341.923878832888192" it="4.02439e05" charge="1"/>
</groupedElementList>
<PeptideIdentification identification_run_ref="PI_0" score_type="" higher_score_better="true" significance_threshold="0" MZ="540.393717669934" RT="341.923878832888" >
        <UserParam type="stringList" name="masstrace_intensity" value="[]"/>
        <UserParam type="stringList" name="masstrace_centroid_mz" value="[]"/>
        <UserParam type="floatList" name="isotope_calcdist_mz" value="[341.9241943359375]"/>
        <UserParam type="floatList" name="isotope_calcdist_int" value="[2613.7314453125]"/>
        <UserParam type="float" name="feature_quality" value="1.164749979972839"/>
        <UserParam type="float" name="normalized_intensity" value="8.045819655895059e-07"/>
        <UserParam type="int" name="map_index" value="0"/>
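A swap of this kind can be detected mechanically. The sketch below is a hypothetical check, not part of the pipeline: it parses a trimmed, well-formed stand-in for the excerpt above (values copied from it, with a wrapper element added only so that the snippet parses) and reports whether an identification's MZ/RT pair matches some element's rt/mz pair.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for the excerpt above. The <consensusElement> wrapper is
# added here only to make the snippet well-formed.
SNIPPET = """
<consensusElement>
  <groupedElementList>
    <element map="0" rt="540.393717669934404" mz="341.923878832888192"/>
    <element map="1" rt="548.849116174641267" mz="341.923931444177526"/>
  </groupedElementList>
  <PeptideIdentification MZ="540.393717669934" RT="341.923878832888"/>
</consensusElement>
"""


def looks_swapped(root, tol=1e-6):
    """True if an identification's MZ/RT equal some element's rt/mz."""
    elements = [(float(e.get("rt")), float(e.get("mz"))) for e in root.iter("element")]
    for pid in root.iter("PeptideIdentification"):
        id_mz, id_rt = float(pid.get("MZ")), float(pid.get("RT"))
        for rt, mz in elements:
            # Swapped means: the stored MZ is really an rt, and vice versa.
            if abs(id_mz - rt) < tol and abs(id_rt - mz) < tol:
                return True
    return False


if __name__ == "__main__":
    print(looks_swapped(ET.fromstring(SNIPPET)))
```

For the values quoted above, the check reports a swap, since MZ="540.39..." equals the rt of the first element and RT="341.92..." equals its mz.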

Another question: if I run MS12, will the raw mzML be used instead of the RT-corrected one?

Thanks for your reply.

PayamEmami commented 7 months ago

If you choose to perform the alignment, the adjusted RT will be used. However, if you are running MS12 in separate mode, the alignment is based on raw spectra; in paired mode, the alignment is based on features.

Thanks for reporting the inconsistencies. I have now fixed it.

PayamEmami commented 7 months ago

These should now be fixed. I'm closing the issue.