lazear / sage

Proteomics search & quantification so fast that it feels like magic
https://sage-docs.vercel.app
MIT License
209 stars 38 forks

Sage for iTRAQ data #142

Closed. bynjh007 closed this 1 month ago

bynjh007 commented 1 month ago

Thanks for this powerful tool,

Is this tool also available for iTRAQ labeling data? I am planning to use this tool to reanalyse my previous data (iTRAQ-global, phospho, glycoproteome data).

Best, Jin

lazear commented 1 month ago

Hi Jin,

iTRAQ should work, but you will have to manually list the reporter ions, like below. Also note that a tmt.tsv file will still be written, and the channel names will be tmt_1, tmt_2, ...

Please report any issues you find!

"quant": {
    "tmt": {
      "User": [
        114,
        115,
        116,
        117
      ]
    },
    "tmt_settings": {
      "level": 3,
      "sn": false
    }
}
bynjh007 commented 1 month ago

Thanks for your guidance! I have one more question. Is there a parameter for the mass tolerance of iTRAQ (or TMT) reporter ions?

lazear commented 1 month ago

At the moment it is hard-coded to 20 ppm. I am open to reifying it as a full user-modified parameter if needed!
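To make the 20 ppm figure concrete, here is a minimal sketch of how a symmetric ppm tolerance window is usually computed. The function names `ppm_window` and `within_ppm` are hypothetical and are not part of Sage; this only illustrates the arithmetic, not how Sage implements reporter ion matching.

```python
from typing import Tuple


def ppm_window(mz: float, ppm: float = 20.0) -> Tuple[float, float]:
    """Return the (low, high) m/z bounds of a symmetric ppm window around mz."""
    delta = mz * ppm / 1e6  # absolute half-width of the window in m/z units
    return mz - delta, mz + delta


def within_ppm(observed: float, target: float, ppm: float = 20.0) -> bool:
    """True if an observed peak m/z lies inside the ppm window around target."""
    low, high = ppm_window(target, ppm)
    return low <= observed <= high
```

For a reporter ion near m/z 126, a 20 ppm window is only about 0.0025 wide on each side, so the configured reporter m/z has to be accurate to a few decimal places for a peak to match.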

bynjh007 commented 1 month ago

I think this would be a very useful option!

Following your advice, I generated a JSON file to test one of my datasets (instrument: Q Exactive Orbitrap), but I keep hitting a runtime error:

thread '' has overflowed its stack
fatal runtime error: stack overflow

I noticed that setting max_len to 20 works, but setting it to 30 causes this error, and I need at least 40. Could you recommend how to solve this?

Thanks!

JSON file:

{
  "database": {
    "bucket_size": 16384,
    "enzyme": {
      "missed_cleavages": 2,
      "min_len": 7,
      "max_len": 50,
      "cleave_at": "KR",
      "restrict": "P",
      "c_terminal": true,
      "semi_enzymatic": true
    },
    "fragment_min_mz": 150.0,
    "fragment_max_mz": 1500.0,
    "peptide_min_mass": 500.0,
    "peptide_max_mass": 5000.0,
    "ion_kinds": ["b", "y"],
    "min_ion_index": 2,
    "max_variable_mods": 2,
    "static_mods": { "C": 57.0214, "^": 144.1021, "K": 144.1021 },
    "variable_mods": { "M": 15.9949, "N": 0.9840, "Q": 0.9840 },
    "decoytag": "rev",
    "generate_decoys": true,
    "fasta": "/home/reference/uniprot/human_contam_20240723.fasta"
  },
  "precursor_tol": { "ppm": [-10, 10] },
  "fragment_tol": { "ppm": [-10, 10] },
  "precursor_charge": [2, 4],
  "isotope_errors": [0, 0],
  "wide_window": false,
  "chimera": false,
  "report_psms": 5,
  "deisotope": true,
  "min_peaks": 15,
  "max_peaks": 150,
  "min_matched_peaks": 4,
  "max_fragment_charge": 1,
  "quant": {
    "tmt": {
      "User": [114, 115, 116, 117]
    },
    "tmt_settings": {
      "level": 2,
      "sn": false
    }
  },
  "mzml_paths": [
    "/home/bynjh007/eogc/global/N111T112.mzML"
  ]
}

lazear commented 1 month ago

Your configuration will likely require 1+ TB of RAM to successfully complete. I strongly recommend the following alterations:

  • Do not use semi-enzymatic search
  • Remove N/Q deamidation variable modifications
  • Widen your precursor tolerance to "da": [-3.5, 1.25]. This will encompass isotope errors (which you are not using) and deamidation.

bynjh007 commented 1 month ago

Thanks for the suggestion. Turning off semi-enzymatic search alone made it work!

However, in my tmt.tsv, most of the values in the tmt_1 to tmt_4 columns are 0 (ion_injection_time is also 0), which suggests that the iTRAQ reporter ions (114, 115, 116, 117) were not captured properly. I also changed fragment_min_mz to 100, but that did not solve the problem.

Could you help me solve this and obtain proper iTRAQ values?

Thanks!

lazear commented 1 month ago

If you're using the most recent release, fragment_min_mz should be automatically accounted for when using MS2-based reporter ion quant. The most likely oversight is trimming out the reporter ion peaks if they aren't in the top 150 by intensity. Can you try setting the max_peaks parameter to something like 500 or 1000 and see if that resolves the issue?
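The effect described above can be illustrated with a small sketch. It assumes that peak trimming simply keeps the N most intense peaks of a spectrum; the helper names and the spectrum are made up for illustration and do not reflect Sage's actual code.

```python
def keep_top_n(peaks, n):
    """Keep the n most intense (mz, intensity) peaks, returned in m/z order."""
    top = sorted(peaks, key=lambda p: p[1], reverse=True)[:n]
    return sorted(top, key=lambda p: p[0])


def count_reporter_region(peaks, max_mz=120.0):
    """Count peaks below max_mz, i.e. in the low-mass reporter ion region."""
    return sum(1 for mz, _ in peaks if mz < max_mz)


# Hypothetical spectrum: 4 weak reporter-region peaks plus 200 stronger fragment peaks.
reporters = [(114.11, 50.0), (115.11, 60.0), (116.11, 55.0), (117.11, 65.0)]
fragments = [(200.0 + i, 1000.0 + i) for i in range(200)]
spectrum = reporters + fragments

print(count_reporter_region(keep_top_n(spectrum, 150)))   # 0: reporters trimmed away
print(count_reporter_region(keep_top_n(spectrum, 1000)))  # 4: reporters retained
```

In this toy case the reporter peaks survive only when the peak limit exceeds the number of more intense fragment peaks, which is why raising max_peaks is a reasonable first thing to try.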

bynjh007 commented 1 month ago

Yes, I am using v0.14.7 (from bioconda). I increased max_peaks to 1000, and the number of non-zero iTRAQ values increased slightly, but the majority are still zero. Also, non-zero iTRAQ values were detected in only one of the four reporter ions (either 114, 115, 116, or 117), like this: [image]

Here is my JSON:

{
  "database": {
    "bucket_size": 8192,
    "enzyme": {
      "missed_cleavages": 2,
      "min_len": 7,
      "max_len": 50,
      "cleave_at": "KR",
      "restrict": "P",
      "c_terminal": true,
      "semi_enzymatic": false
    },
    "fragment_min_mz": 100.0,
    "fragment_max_mz": 1500.0,
    "peptide_min_mass": 500.0,
    "peptide_max_mass": 5000.0,
    "ion_kinds": ["b", "y"],
    "min_ion_index": 2,
    "max_variable_mods": 2,
    "static_mods": { "C": 57.0214, "^": 144.1021, "K": 144.1021 },
    "variable_mods": { "M": 15.9949, "N": 0.9840, "Q": 0.9840 },
    "decoytag": "rev",
    "generate_decoys": true,
    "fasta": "/home/reference/uniprot/human_contam_20240723.fasta"
  },
  "precursor_tol": { "ppm": [-10, 10] },
  "fragment_tol": { "ppm": [-10, 10] },
  "precursor_charge": [2, 4],
  "isotope_errors": [0, 0],
  "wide_window": false,
  "chimera": false,
  "report_psms": 5,
  "deisotope": true,
  "min_peaks": 15,
  "max_peaks": 1000,
  "min_matched_peaks": 4,
  "max_fragment_charge": 1,
  "quant": {
    "tmt": {
      "User": [114, 115, 116, 117]
    },
    "tmt_settings": {
      "level": 2,
      "sn": false
    }
  },
  "mzml_paths": [
    "/home/bynjh007/eogc/global/N111T112.mzML"
  ]
}

lazear commented 1 month ago

Ah, you'll need to actually find and use the real (e.g. high resolution) iTRAQ reporter ion m/z values - I just put dummy ones in there :)

bynjh007 commented 1 month ago

You mean the [114, 115, 116, 117] in quant:tmt:User? I had assumed the iTRAQ 4-plex reporter ion m/z values were simply 114-117.

lazear commented 1 month ago

Yes, fill in the reporter ion m/z values with more significant figures. For example, for TMT126 you would use 126.127726, not just 126.
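A quick way to catch this mistake before running a search is to scan the config for reporter m/z values that have no fractional part. The sketch below assumes the quant:tmt:User layout shown earlier in the thread; `nominal_reporter_mz` is a hypothetical helper, not something Sage provides.

```python
import json


def nominal_reporter_mz(config: dict) -> list:
    """Return user-supplied reporter m/z values that have no fractional part."""
    channels = config.get("quant", {}).get("tmt", {})
    if not isinstance(channels, dict):
        return []  # "tmt" is not a custom-list object, so there is nothing to check
    return [mz for mz in channels.get("User", []) if float(mz).is_integer()]


# Placeholder values like these are flagged as suspicious.
config = json.loads('{"quant": {"tmt": {"User": [114, 115, 116, 117]}}}')
print(nominal_reporter_mz(config))  # [114, 115, 116, 117]

# A high-resolution value, such as the TMT126 example above, passes.
config = json.loads('{"quant": {"tmt": {"User": [126.127726]}}}')
print(nominal_reporter_mz(config))  # []
```

For scale, a nominal 126 is roughly 1000 ppm away from 126.127726, far outside the hard-coded 20 ppm tolerance mentioned earlier in the thread, so a peak at the true reporter m/z would never match the nominal value.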