MeganEDuffy / FISH-546

repo for FISH 546: Bioinformatics for Environmental Sciences

Points on plan #3

Open sr320 opened 7 years ago

sr320 commented 7 years ago

My goal is to compare microbial metaproteomic depth profiles from both a traditional database searching strategy and de novo peptide sequencing. Ultimately I'll be comparing the numbers of peptides and proteins matched or sequenced (some quality control will have to come in here), and the resulting taxonomic and functional characterizations that each output leads to. My first steps are:

0) Obtain MS/MS spectra (I ran some samples last week and was running more today but just decided to shut down the instrument in case we lose power...). Should be all done by next week. Once I have raw data (.RAW Waters file directories), I need to peak pick in Progenesis (Waters software) and convert to .mgf and .mzXML with msconvert.
1) Download the search database (from an assembled metagenome) from the Rocap group server.
2) Search the .mzXML files with X!Tandem, maybe via the TPP, and run the .mgf files with the Novor CLI.
3) Figure out how to quality control the output, be it with PeptideProphet or something else.
4) Run peptide results through Unipept and MEGAN6 for taxonomic and functional characterizations.
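As a rough illustration of the conversion part of step 0, here is a minimal Python sketch that builds (but does not run) one msconvert command line per output format. The `--mzXML`, `--mgf`, and `-o` options are standard msconvert options. The function name `msconvert_commands`, the input file name, and the output directory are hypothetical placeholders, and what Progenesis actually hands off to msconvert is not specified in the plan above.

```python
from pathlib import Path


def msconvert_commands(input_path, out_dir):
    """Build (not run) msconvert command lines for one input.

    input_path and out_dir are placeholders; substitute whatever the
    peak-picking step actually produces.
    """
    src = str(Path(input_path))
    out = str(Path(out_dir))
    return [
        # mzXML output, intended for the database search (X!Tandem / TPP)
        ["msconvert", src, "--mzXML", "-o", out],
        # MGF output, intended for de novo sequencing (Novor)
        ["msconvert", src, "--mgf", "-o", out],
    ]


if __name__ == "__main__":
    # Print the commands so they can be reviewed or pasted into a notebook.
    for cmd in msconvert_commands("sample_01.raw", "converted"):
        print(" ".join(cmd))
```

Keeping the commands as lists makes it easy to pass them to `subprocess.run` later and to write the exact same strings into the repo's notes.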

My take is you know your steps and it is just a matter of carrying them out? Is there any particular aspect that concerns you?

My sense is that these are all GUIs and one challenge will be documenting workflow in a reproducible manner.
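One possible way to keep GUI-heavy steps reproducible is to write a small provenance record each time a step is done: which tool, which settings were chosen in the interface, and checksums of the input and output files. The sketch below is only an illustration under those assumptions. The function names (`sha256_of`, `log_step`), the record fields, and the JSON-lines log format are all hypothetical choices, not something any of the tools above require.

```python
import hashlib
import json
import time


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def log_step(log_path, tool, settings, inputs, outputs):
    """Append one JSON record describing a workflow step to log_path.

    settings is a free-form dict, e.g. the options picked in a GUI dialog.
    inputs and outputs are lists of file paths that already exist.
    """
    record = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "tool": tool,
        "settings": settings,
        "inputs": {str(p): sha256_of(p) for p in inputs},
        "outputs": {str(p): sha256_of(p) for p in outputs},
    }
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
    return record
```

A call after each GUI step, with the settings typed in by hand, leaves a plain-text log that can be committed to the repo alongside the notebook.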

MeganEDuffy commented 7 years ago

True, I wish there weren't so many GUIs. Some of them have CLIs, like the de novo peptide sequencing tool Novor and the conversion tool msconvert, and these will require checking the data formats afterward, which I'm excited to do correctly. I'm trying to figure out if I'll do this on my collaborator's clusters, which I'll have to learn how to do, although I don't think it will be difficult. Another concern that I'm trying to do some reading up on this week is using PeptideProphet or ProteinProphet to determine the quality of my spectral matches. And then how do I compare that to de novo outputs? These are some things I'm trying to figure out.
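On the comparison question, one simple starting point (a sketch, not a settled method) is to compare the two peptide lists as sets after treating isoleucine and leucine as the same residue, since they have identical mass and neither approach can tell them apart from mass alone. The function names and the example sequences below are made up for illustration, and this ignores scores entirely: any filtering by PeptideProphet probability or de novo score would have to happen before this step.

```python
def normalize(peptide):
    """Uppercase a peptide sequence and collapse I to L (isobaric residues)."""
    return peptide.strip().upper().replace("I", "L")


def compare_peptides(db_peptides, denovo_peptides):
    """Return shared and method-specific peptides after I/L normalization."""
    db = {normalize(p) for p in db_peptides if p.strip()}
    dn = {normalize(p) for p in denovo_peptides if p.strip()}
    return {
        "shared": sorted(db & dn),
        "db_only": sorted(db - dn),
        "denovo_only": sorted(dn - db),
    }


if __name__ == "__main__":
    # Made-up sequences, only to show the shape of the output.
    db_hits = ["LSSPATLNSR", "ELVISK"]
    denovo_hits = ["ISSPATINSR", "PEPTIDEK"]
    result = compare_peptides(db_hits, denovo_hits)
    for key, peptides in result.items():
        print(key, len(peptides), peptides)
```

Exact-match overlap is a strict criterion; de novo sequences often differ from database hits by a residue or two, so a tolerant comparison (for example, allowing a small edit distance) may be worth considering later.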