grace-ac closed 5 years ago
updated zip: 20180831-Cgseed
Don't get too excited, but I think I'll have time to work on this next week. I'm just going to run through making the Skyline document and see if I get something different from/better than (?) what you got. I'm assuming the updated zip file you posted above is from the Walnut workflow?
Okay!!
Both zip files are from the Walnut workflow, but the "updated" one is just the most recent attempt. I tried it out again last Friday and went through it step by step.
Here's a notebook post related to the 08-31 attempt: here
Which fasta file did you use in Walnut and in Skyline?
I've been using this fasta for everything: http://owl.fish.washington.edu/scaphapoda/grace/2015-oysterseed-project/2015-DIA/Cg_Giga_cont_prtc_AA.fasta
The above fasta came from Step 2c in the protocol (Steven did this for me).
$ cd Desktop/
srlab@swan MINGW64 ~/Desktop
$ cd grace/
srlab@swan MINGW64 ~/Desktop/grace
$ head Cg_Giga_cont_prtc_AA_digested_Mass400to6000.txt
Protein_Name Sequence Unique_ID Monoisotopic_Mass Predicted_NET Tryptic_Name
CHOYP_043R.5.5|m.64252 SPSEDPDAPIENILQTNSVYKPK 1 2541.2598016 0.3655 t2.1
CHOYP_043R.5.5|m.64252 SPSEDPDAPIENILQTNSVYKPKK 2 2669.3547598 0.3414 t2.2
CHOYP_043R.5.5|m.64252 SPSEDPDAPIENILQTNSVYKPKKEPTYDENVVVK 3 3942.973762 0.3449 t2.3
CHOYP_043R.5.5|m.64252 SPSEDPDAPIENILQTNSVYKPKKEPTYDENVVVKIISQDTPTILR 4 5180.676764 0.5144 t2.4
CHOYP_043R.5.5|m.64252 KEPTYDENVVVK 5 1419.7245246 0.2186 t3.2
CHOYP_043R.5.5|m.64252 KEPTYDENVVVKIISQDTPTILR 6 2657.4275266 0.4593 t3.3
CHOYP_043R.5.5|m.64252 KEPTYDENVVVKIISQDTPTILRVSFTVNR 7 3460.8564928 0.5658 t3.4
CHOYP_043R.5.5|m.64252 EPTYDENVVVK 8 1291.6295664 0.2301 t4.1
CHOYP_043R.5.5|m.64252 EPTYDENVVVKIISQDTPTILR 9 2529.3325684 0.4402 t4.2
srlab@swan MINGW64 ~/Desktop/grace
$ awk '{print $1,$2}' Cg_Giga_cont_prtc_AA_digested_Mass400to6000.txt | head
Protein_Name Sequence
CHOYP_043R.5.5|m.64252 SPSEDPDAPIENILQTNSVYKPK
CHOYP_043R.5.5|m.64252 SPSEDPDAPIENILQTNSVYKPKK
CHOYP_043R.5.5|m.64252 SPSEDPDAPIENILQTNSVYKPKKEPTYDENVVVK
CHOYP_043R.5.5|m.64252 SPSEDPDAPIENILQTNSVYKPKKEPTYDENVVVKIISQDTPTILR
CHOYP_043R.5.5|m.64252 KEPTYDENVVVK
CHOYP_043R.5.5|m.64252 KEPTYDENVVVKIISQDTPTILR
CHOYP_043R.5.5|m.64252 KEPTYDENVVVKIISQDTPTILRVSFTVNR
CHOYP_043R.5.5|m.64252 EPTYDENVVVK
CHOYP_043R.5.5|m.64252 EPTYDENVVVKIISQDTPTILR
srlab@swan MINGW64 ~/Desktop/grace
$ awk '{print $1,$2}' Cg_Giga_cont_prtc_AA_digested_Mass400to6000.txt \
> > Cg_Giga_cont_prtc_AA_M400-6000-2c.txt
And then I think we just converted it to a .fasta file.
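The thread doesn't record how the two-column file was turned into a .fasta, so here is only a rough sketch of one way it could be done. It assumes each peptide row becomes its own FASTA record, and it adds a made-up `_pepN` suffix to keep the headers distinct. The function name `two_col_to_fasta` is hypothetical as well. The sample rows are copied from the awk output above.

```shell
# Sketch only: the real conversion step is not shown in this thread.
# Reads "Protein_Name Sequence" rows on stdin, writes FASTA records on stdout.
# NR > 1 skips the header line; NF == 2 skips blank or malformed lines.
# The "_pepN" header suffix is an assumption, not part of the protocol.
two_col_to_fasta() {
    awk 'NR > 1 && NF == 2 { n++; printf(">%s_pep%d\n%s\n", $1, n, $2) }'
}

# Small demo using three rows from the awk output shown earlier.
printf '%s\n' \
    'Protein_Name Sequence' \
    'CHOYP_043R.5.5|m.64252 SPSEDPDAPIENILQTNSVYKPK' \
    'CHOYP_043R.5.5|m.64252 KEPTYDENVVVK' \
    'CHOYP_043R.5.5|m.64252 EPTYDENVVVK' |
    two_col_to_fasta
```

To try it on the real file, something like `two_col_to_fasta < Cg_Giga_cont_prtc_AA_M400-6000-2c.txt > peptides.fasta` should work, where `peptides.fasta` is just a placeholder output name.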
For the first time you tried this, that could have been an issue since I had originally used a different fasta for pecan. But since we are re-making the blib file now, it shouldn't matter. I've already found a difference in a setting between your Skyline document and the suggested settings in the MS1 extraction tutorial on the Skyline website. I'm going to go through the whole thing and see if I can find anything else.
Thank you Emma!
I haven't done a complete comparison of your Skyline document and the tutorial; instead I made a new one! Maybe not the most efficient thing to do, but it is done. I followed 2 tutorials: Data Independent Acquisition and iRT Retention Time Prediction. The latter was to use the PRTC peptides to do a better job of coordinating peak IDs across replicates. I think it did what it was supposed to do, but since this is the first time I have used iRT I'm going to run it by Nick to make sure. You can see what I did in my Evernote entry. The new Skyline document is here. I'll let you know what Nick says about my skills implementing iRT. If this is enough of an improvement to move forward with the entire dataset, great. If it's not, then a couple of ways to move forward would be to choose a protein-based or pathway-based analysis. This would narrow your target list down to targets of specific interest, which you could then manually curate to make sure your peaks are well chosen. That would allow a more accurate comparison of protein abundance than pure spectral counting.
Thank you so much! I'll look at this today and Thursday.
feels like @emmats has tackled this.
@emmats did Nick get back to you on your iRT implementation?
He said I did it correctly.
Notebook post detailing what I did in Walnut and changes I made to settings: 2018-07-27-Skyline-DIA.md
Nick from Skyline recommended trying the Advanced Peak Picking Model for Step 5b: Spot-checking Peptides. DIA instructions are not at all easy to follow - need help.
Latest Skyline document: http://owl.fish.washington.edu/scaphapoda/grace/2015-oysterseed-project/20180727-2015-Cgseed.sky.zip
Screenshots of some "peaks" in the document: