Closed sr320 closed 7 years ago
The manual for xinteract can be viewed by running xinteract w/o any commands:
/usr/tpp_install/tpp/bin/xinteract
PeptideProphet options [following the '-O']:
-OAp
- A [use accurate mass binning in PeptideProphet], p [run ProteinProphet afterwards]
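For reference, the flags in the xinteract call used in this thread can be spelled out one per line. This is only an annotated sketch: the helper name build_xinteract_cmd is made up for illustration and just prints the command, the install path is the one quoted above, and the notes on -N and -p are inferred from the log output in this thread (the "naming output file interact-..." line and the MINPROB=0.9 argument passed to PeptideProphetParser), not from the manual text.

```shell
# Annotated sketch of the xinteract call used in this thread.
# build_xinteract_cmd is a hypothetical helper; it only prints the command.
XINTERACT=/usr/tpp_install/tpp/bin/xinteract   # install path quoted in the thread

build_xinteract_cmd() {
  base=$1   # sample base name, e.g. 20161205_Sample_1
  # -dDECOY_ : decoy protein names begin with DECOY_
  # -N<name> : output name (the log shows interact-<name>.pep.xml)
  # -p0.9    : appears in the log as MINPROB=0.9 for PeptideProphetParser
  # -OAp     : A = accurate mass binning, p = run ProteinProphet afterwards
  echo "$XINTERACT -dDECOY_ -N$base $base.pep.xml -p0.9 -OAp"
}

build_xinteract_cmd 20161205_Sample_1
```

Printing the command first makes it easy to check the flags before actually running it.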
As Emma might not believe - I am not that incompetent :) I am looking for a layperson's explanation from Emma.
Sorry, hadn't noticed she was assigned to this.
What is a decoy database?
A decoy database is used to estimate the error rate of your search matches. In your Comet parameter file, you can select a decoy search (see below). This searches your peptide spectra against the reversed sequences of your database to see whether they match a "non-sequence".
decoy_search = 0 # 0=no (default), 1=concatenated search, 2=separate search
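To make the reversed-sequence idea concrete, here is a toy sketch that builds one decoy entry by reversing a sequence and prefixing the accession with DECOY_. It is an illustration only: make_decoy, prot1 and PEPTIDEK are made-up names, and Comet builds its decoys internally during the search, possibly with different details, so nothing like this needs to be run by hand.

```shell
# Toy illustration of a reversed-sequence decoy entry.
# make_decoy is a hypothetical helper, not part of Comet or TPP.
make_decoy() {
  # $1 = accession, $2 = amino-acid sequence on a single line
  printf '>DECOY_%s\n' "$1"
  # reverse the sequence character by character
  printf '%s\n' "$2" | awk '{ s = ""; for (i = length($0); i > 0; i--) s = s substr($0, i, 1); print s }'
}

make_decoy prot1 PEPTIDEK
# prints:
# >DECOY_prot1
# KEDITPEP
```

A spectrum that matches such an entry is matching a sequence that should not exist in the sample, which is what lets the error rate be estimated.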
In the command below I get WARNING: No decoys with label DECOY_ were found in this dataset. reverting to fully unsupervised method. Should I be getting this warning?
It sounds like the searches were run without a decoy database. It is usually better to use a decoy search. However, I have run searches without a decoy followed by TPP and I don't remember seeing that warning. But that doesn't mean it didn't happen :)
2b) What does -dDECOY_ do?
See above.
What does -OAp mean?
The p runs ProteinProphet after PeptideProphet (and, per the manual text above, the A turns on accurate mass binning in PeptideProphet).
I get 80,000 lines of WARNING. Is this normal?
I don't think so.
So I ran the same file, pulled from Steven's directory on Emu (20161205_Sample_1.raw),
and ran it through the ReAdW, Comet, and xInteract steps, with the only difference being the reference database used (Steven's = database.CgCont.fa, mine = Uniprot gigas proteome).
Steven's reference results from xInteract:
Uniprot reference results from xInteract:
The database.CgCont looks like:
While the Uniprot database looks like:
There's a notebook for this here but it doesn't have any output, as the 80k+ errors make it too large.
Thanks - The 80k WARNINGS are related to the FASTA header.
see https://github.com/sr320/nb-2017/blob/master/C_gigas/00-Protein-database.ipynb
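A quick way to compare the headers of two databases before running TPP is sketched below. The helper names are made up for illustration. The trim_headers step is only one common simplification (keep the first whitespace-delimited token of each header) and is an assumption here, not necessarily the cleaning done in the linked notebook.

```shell
# Sketch for inspecting FASTA headers; helper names are hypothetical.
show_headers() {
  # $1 = FASTA file, $2 = number of header lines to print (default 5)
  grep '^>' "$1" | head -n "${2:-5}"
}

count_entries() {
  # number of sequences, i.e. lines starting with '>'
  grep -c '^>' "$1" || true
}

trim_headers() {
  # ASSUMPTION: shorten each header to its first whitespace-delimited token;
  # sequence lines are passed through unchanged. Output goes to stdout.
  sed '/^>/ s/[[:space:]].*$//' "$1"
}

# Example usage (file names as mentioned in this thread):
#   show_headers database.CgCont.fa
#   count_entries database.CgCont.fa
#   trim_headers uniprot.fa > uniprot.clean.fa   # hypothetical file names
```

Writing the trimmed database to a new file keeps the original untouched, so both versions can be searched and compared.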
tldc:
once cleaned - new PP out:
steven@emu:~/bioinfo/021017$ /usr/tpp_install/tpp/bin/xinteract \
> -dDECOY_ \
> -N20161205_Sample_1 \
> 20161205_Sample_1.pep.xml \
> -p0.9 \
> -OAp
/usr/tpp_install/tpp/bin/xinteract (TPP v5.0.0 Typhoon, Build 201612091438-exported (Linux-x86_64))
naming output file interact-20161205_Sample_1.pep.xml
running: "/usr/tpp_install/tpp/bin/InteractParser 'interact-20161205_Sample_1.pep.xml' '20161205_Sample_1.pep.xml' -L'7'"
file 1: 20161205_Sample_1.pep.xml
SUCCESS: CORRECTED data file /home/steven/bioinfo/021017/20161205_Sample_1.mzXML in msms_run_summary tag ...
SUCCESS: CORRECTED data file /home/steven/bioinfo/021017/20161205_Sample_1.mzXML in msms_run_summary tag ...
processed altogether 89352 results
INFO: Results written to file: /home/steven/bioinfo/021017/interact-20161205_Sample_1.pep.xml
command completed in 25 sec
running: "/usr/tpp_install/tpp/bin/DatabaseParser 'interact-20161205_Sample_1.pep.xml'"
command completed in 0 sec
running: "/usr/tpp_install/tpp/bin/RefreshParser 'interact-20161205_Sample_1.pep.xml' 'Cg_Giga_cont_AA.fa'"
- Building Commentz-Walter keyword tree...
- Searching the tree...
- Linking duplicate entries...
- Printing results...
command completed in 11 sec
running: "/usr/tpp_install/tpp/bin/PeptideProphetParser 'interact-20161205_Sample_1.pep.xml' DECOY=DECOY_ MINPROB=0.9 ACCMASS"
using Accurate Mass Bins
Using Decoy Label "DECOY_".
(Comet)
adding ACCMASS mixture distribution
init with Comet trypsin
MS Instrument info: Manufacturer: Thermo, Model: Orbitrap Fusion Lumos, Ionization: UNKNOWN, Analyzer: UNKNOWN, Detector: UNKNOWN
INFO: Processing standard MixtureModel ...
PeptideProphet (TPP v5.0.0 Typhoon, Build 201612091438-exported (Linux-x86_64)) AKeller@ISB
read in 0 1+, 38769 2+, 36550 3+, 9547 4+, 1969 5+, 0 6+, and 0 7+ spectra.
Initialising statistical models ...
Found 0 Decoys, and 86835 Non-Decoys
WARNING: No decoys with label DECOY_ were found in this dataset. reverting to fully unsupervised method.
Iterations: .........10.........20.........30..
model complete after 33 iterations
command completed in 69 sec
running: "/usr/tpp_install/tpp/bin/ProphetModels.pl -i interact-20161205_Sample_1.pep.xml -d "DECOY_""
Analyzing interact-20161205_Sample_1.pep.xml ...
Reading Accurate Mass Model model +1 ...
Reading Accurate Mass Model model +2 ...
Reading Accurate Mass Model model +3 ...
Reading Accurate Mass Model model +4 ...
Reading Accurate Mass Model model +5 ...
Reading Accurate Mass Model model +6 ...
Reading Accurate Mass Model model +7 ...
Parsing search results "/home/steven/bioinfo/021017/20161205_Sample_1 (Comet)"...
=> Found 38893 hits. (0 decoys, 0 excluded)
=> Total so far: 38893 hits. (0 decoys, 0 excluded)
command completed in 4 sec
running: "/usr/tpp_install/tpp/cgi-bin/PepXMLViewer.cgi -I /home/steven/bioinfo/021017/interact-20161205_Sample_1.pep.xml"
command completed in 4 sec
running: "/usr/tpp_install/tpp/bin/ProteinProphet 'interact-20161205_Sample_1.pep.xml' 'interact-20161205_Sample_1.prot.xml'"
ProteinProphet (C++) by Insilicos LLC and LabKey Software, after the original Perl by A. Keller (TPP v5.0.0 Typhoon, Build 201612091438-exported (Linux-x86_64))
(no FPKM) (using degen pep info)
Reading in /home/steven/bioinfo/021017/interact-20161205_Sample_1.pep.xml...
...read in 0 1+, 20144 2+, 15223 3+, 3203 4+, 323 5+, 0 6+, 0 7+ spectra with min prob 0.05
Initializing 29789 peptide weights: 0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
Calculating protein lengths and molecular weights from database Cg_Giga_cont_AA.fa
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........1000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........2000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........3000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........4000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........5000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........6000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........7000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........8000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........9000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........10000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........11000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........12000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........13000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........14000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........15000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........16000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........17000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........18000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........19000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........20000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........21000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........22000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........23000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........24000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........25000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........26000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........27000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........28000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........29000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........30000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........31000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........32000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........33000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........34000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........35000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........36000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........37000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........38000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........39000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........40000
.........:.........:.........:.........:.........:.........:.........:..... Total: 40751
Computing degenerate peptides for 9769 proteins: 0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
Computing probabilities for 11063 proteins. Loop 1: 0%...20%...40%...60%...80%...100% Loop 2: 0%...20%...40%...60%...80%...100%
Computing probabilities for 11063 proteins. Loop 1: 0%...20%...40%...60%...80%...100% Loop 2: 0%...20%...40%...60%...80%...100%
Computing probabilities for 11063 proteins. Loop 1: 0%...20%...40%...60%...80%...100% Loop 2: 0%...20%...40%...60%...80%...100%
Computing probabilities for 11063 proteins. Loop 1: 0%...20%...40%...60%...80%...100% Loop 2: 0%...20%...40%...60%...80%...100%
Computing probabilities for 11063 proteins. Loop 1: 0%...20%...40%...60%...80%...100% Loop 2: 0%...20%...40%...60%...80%...100%
Computing probabilities for 11063 proteins. Loop 1: 0%...20%...40%...60%...80%...100% Loop 2: 0%...20%...40%...60%...80%...100%
Computing 7196 protein groups: 0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
Calculating sensitivity...and error tables...
Computing MU for 11063 proteins: 0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
INFO: mu=5.99794e-06, db_size=40397625
Finished. Results written to: /home/steven/bioinfo/021017/interact-20161205_Sample_1.prot.xml
command completed in 79 sec
running: "/usr/tpp_install/tpp/bin/ProtProphModels.pl -i interact-20161205_Sample_1.prot.xml"
Analyzing interact-20161205_Sample_1.prot.xml ...
command completed in 1 sec
running: "/usr/tpp_install/tpp/bin/tpp_models.pl '/home/steven/bioinfo/021017/interact-20161205_Sample_1.pep.xml'"
File: /home/steven/bioinfo/021017/interact-20161205_Sample_1.pep.xml
- in ms run: /home/steven/bioinfo/021017/20161205_Sample_1...
-------------------------------------------------------------------------------
TPP DASHBOARD -- started at Fri Feb 10 11:41:59 2017
-------------------------------------------------------------------------------
File /home/steven/bioinfo/021017/interact-20161205_Sample_1.pep.xml is pepxml
Found fval (+1) model...
Found ntt (+1) model...
Found nmc (+1) model...
Found AccurateMassModel ('+1') model...
Found IsoMassDiff (+1) model...
Found fval (+2) model...
Found ntt (+2) model...
Found nmc (+2) model...
Found AccurateMassModel ('+2') model...
Found IsoMassDiff (+2) model...
Found fval (+3) model...
Found ntt (+3) model...
Found nmc (+3) model...
Found AccurateMassModel ('+3') model...
Found IsoMassDiff (+3) model...
Found fval (+4) model...
Found ntt (+4) model...
Found nmc (+4) model...
Found AccurateMassModel ('+4') model...
Found IsoMassDiff (+4) model...
Found fval (+5) model...
Found ntt (+5) model...
Found nmc (+5) model...
Found AccurateMassModel ('+5') model...
Found IsoMassDiff (+5) model...
Found fval (+6) model...
Found ntt (+6) model...
Found nmc (+6) model...
Found AccurateMassModel ('+6') model...
Found IsoMassDiff (+6) model...
Found fval (+7) model...
Found ntt (+7) model...
Found nmc (+7) model...
Found AccurateMassModel ('+7') model...
Found IsoMassDiff (+7) model...
--> Trying to write file /home/steven/bioinfo/021017/interact-20161205_Sample_1.pep-MODELS.html
-------------------------------------------------------------------------------
Finished at Fri Feb 10 11:42:04 2017 with 0 errors.
-------------------------------------------------------------------------------
command completed in 5 sec
running: "/usr/tpp_install/tpp/bin/tpp_models.pl '/home/steven/bioinfo/021017/interact-20161205_Sample_1.prot.xml'"
File: /home/steven/bioinfo/021017/interact-20161205_Sample_1.prot.xml
-------------------------------------------------------------------------------
TPP DASHBOARD -- started at Fri Feb 10 11:42:04 2017
-------------------------------------------------------------------------------
File /home/steven/bioinfo/021017/interact-20161205_Sample_1.prot.xml is protxml
Found end of header
--> Trying to write file /home/steven/bioinfo/021017/interact-20161205_Sample_1.prot-MODELS.html
-------------------------------------------------------------------------------
Finished at Fri Feb 10 11:42:04 2017 with 0 errors.
-------------------------------------------------------------------------------
command completed in 0 sec
/usr/tpp_install/tpp/bin/InteractParser 'interact-20161205_Sample_1.pep.xml' '20161205_Sample_1.pep.xml' -L'7' 25 sec
/usr/tpp_install/tpp/bin/DatabaseParser 'interact-20161205_Sample_1.pep.xml'
/usr/tpp_install/tpp/bin/RefreshParser 'interact-20161205_Sample_1.pep.xml' 'Cg_Giga_cont_AA.fa' 11 sec
/usr/tpp_install/tpp/bin/PeptideProphetParser 'interact-20161205_Sample_1.pep.xml' DECOY=DECOY_ MINPROB=0.9 ACCMASS 69 sec
/usr/tpp_install/tpp/bin/ProphetModels.pl -i interact-20161205_Sample_1.pep.xml -d "DECOY_" 4 sec
/usr/tpp_install/tpp/cgi-bin/PepXMLViewer.cgi -I /home/steven/bioinfo/021017/interact-20161205_Sample_1.pep.xml 4 sec
/usr/tpp_install/tpp/bin/ProteinProphet 'interact-20161205_Sample_1.pep.xml' 'interact-20161205_Sample_1.prot.xml' 79 sec
/usr/tpp_install/tpp/bin/ProtProphModels.pl -i interact-20161205_Sample_1.prot.xml 1 sec
/usr/tpp_install/tpp/bin/tpp_models.pl '/home/steven/bioinfo/021017/interact-20161205_Sample_1.pep.xml' 5 sec
/usr/tpp_install/tpp/bin/tpp_models.pl '/home/steven/bioinfo/021017/interact-20161205_Sample_1.prot.xml' 0 sec
job completed in 198 sec
But should I decoy? or not?
you should decoy
do you recommend 1=concatenated search, 2=separate search ?
1=concatenated search
Done... Changed parameters file accordingly.
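For anyone repeating this step, the parameter change can be scripted. The sketch below rewrites the decoy_search value and writes the result to a new file; set_decoy_search and the file names are made up for illustration. Note that editing the params file only affects future searches: as comes up further down in this thread, Comet has to be rerun before the pep.xml contains any decoy hits.

```shell
# Sketch: switch decoy_search in a Comet params file.
# set_decoy_search is a hypothetical helper, not part of Comet.
set_decoy_search() {
  # $1 = input params file, $2 = mode (0, 1 or 2), $3 = output params file
  # Only the number after "decoy_search =" changes; the trailing comment stays.
  sed "s/^[[:space:]]*decoy_search[[:space:]]*=[[:space:]]*[0-9][0-9]*/decoy_search = $2/" "$1" > "$3"
}

# Example usage (hypothetical file names):
#   set_decoy_search comet.params 1 comet.decoy.params
#   grep '^decoy_search' comet.decoy.params
```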
but still get
Found 0 Decoys, and 86835 Non-Decoys
WARNING: No decoys with label DECOY_ were found in this dataset. reverting to fully unsupervised method.
as a reminder this is the code I am using
steven@emu:~/bioinfo/020917$ /usr/tpp_install/tpp/bin/xinteract \
> -dDECOY_ \
> -N20161205_Sample_1 \
> 20161205_Sample_1.pep.xml \
> -p0.9 \
> -OAp
Hold on ... let me try something else....
Here is what I run: xinteract -p0.9 -OAp -dDECOY_ -N2016_Dec_16_Kaho_40_54_QE_26 2016_Dec_16_Kaho_40_54_QE_26.pep.xml
It looks a lot like yours.
Still getting the error. How about your Comet parameter file? Can I see that (or all the files in the directory, so I can run locally)?
It is all on the GS server and you would have to ssh in with an account set up through them. Although Yaamini and Laura know how to access the files on my account (don't tell!). Here is an example of a parameter file:
# comet_version 2016.01 rev. 2
# Comet MS/MS search engine parameters file.
# Everything following the '#' symbol is treated as a comment.
database_name = /net/gs/vol4/shared/nunnlab/search/emmats/transdecoder/copepod/pleuromamma_all.nr.fasta.f50.nuc.transdecoder.pep
decoy_search = 1 # 0=no (default), 1=concatenated search, 2=separate search
num_threads = 0 # 0=poll CPU to set num threads; else specify num threads directly (max 64)
#
# masses
#
peptide_mass_tolerance = 3.00
peptide_mass_units = 0 # 0=amu, 1=mmu, 2=ppm
mass_type_parent = 1 # 0=average masses, 1=monoisotopic masses
mass_type_fragment = 1 # 0=average masses, 1=monoisotopic masses
precursor_tolerance_type = 0 # 0=MH+ (default), 1=precursor m/z; only valid for amu/mmu tolerances
isotope_error = 0 # 0=off, 1=on -1/0/1/2/3 (standard C13 error), 2= -8/-4/0/4/8 (for +4/+8 labeling)
#
# search enzyme
#
search_enzyme_number = 1 # choose from list at end of this params file
num_enzyme_termini = 2 # 1 (semi-digested), 2 (fully digested, default), 8 C-term unspecific , 9 N-term unspecific
allowed_missed_cleavage = 2 # maximum value is 5; for enzyme search
#
# Up to 9 variable modifications are supported
# format: <mass> <residues> <0=variable/else binary> <max_mods_per_peptide> <term_distance> <n/c-term> <required>
# e.g. 79.966331 STY 0 3 -1 0 0
#
variable_mod01 = 15.9949 M 0 3 -1 0 0
variable_mod02 = 0.0 X 0 3 -1 0 0
variable_mod03 = 0.0 X 0 3 -1 0 0
variable_mod04 = 0.0 X 0 3 -1 0 0
variable_mod05 = 0.0 X 0 3 -1 0 0
variable_mod06 = 0.0 X 0 3 -1 0 0
variable_mod07 = 0.0 X 0 3 -1 0 0
variable_mod08 = 0.0 X 0 3 -1 0 0
variable_mod09 = 0.0 X 0 3 -1 0 0
max_variable_mods_in_peptide = 5
require_variable_mod = 0
#
# fragment ions
#
# ion trap ms/ms: 1.0005 tolerance, 0.4 offset (mono masses), theoretical_fragment_ions = 1
# high res ms/ms: 0.02 tolerance, 0.0 offset (mono masses), theoretical_fragment_ions = 0
#
fragment_bin_tol = 1.0005 # binning to use on fragment ions
fragment_bin_offset = 0.4 # offset position to start the binning (0.0 to 1.0)
theoretical_fragment_ions = 1 # 0=use flanking peaks, 1=M peak only
use_A_ions = 0
use_B_ions = 1
use_C_ions = 0
use_X_ions = 0
use_Y_ions = 1
use_Z_ions = 0
use_NL_ions = 0 # 0=no, 1=yes to consider NH3/H2O neutral loss peaks
#
# output
#
output_sqtstream = 0 # 0=no, 1=yes write sqt to standard output
output_sqtfile = 0 # 0=no, 1=yes write sqt file
output_txtfile = 0 # 0=no, 1=yes write tab-delimited txt file
output_pepxmlfile = 1 # 0=no, 1=yes write pep.xml file
output_percolatorfile = 0 # 0=no, 1=yes write Percolator tab-delimited input file
output_outfiles = 0 # 0=no, 1=yes write .out files
print_expect_score = 1 # 0=no, 1=yes to replace Sp with expect in out & sqt
num_output_lines = 5 # num peptide results to show
show_fragment_ions = 0 # 0=no, 1=yes for out files only
sample_enzyme_number = 1 # Sample enzyme which is possibly different than the one applied to the search.
# Used to calculate NTT & NMC in pepXML output (default=1 for trypsin).
#
# mzXML parameters
#
scan_range = 0 0 # start and scan scan range to search; 0 as 1st entry ignores parameter
precursor_charge = 0 0 # precursor charge range to analyze; does not override any existing charge; 0 as 1st entry ignores parameter
override_charge = 0 # 0=no, 1=override precursor charge states, 2=ignore precursor charges outside precursor_charge range, 3=see online
ms_level = 2 # MS level to analyze, valid are levels 2 (default) or 3
activation_method = ALL # activation method; used if activation method set; allowed ALL, CID, ECD, ETD, PQD, HCD, IRMPD
#
# misc parameters
#
digest_mass_range = 600.0 5000.0 # MH+ peptide mass range to analyze
num_results = 100 # number of search hits to store internally
skip_researching = 1 # for '.out' file output only, 0=search everything again (default), 1=don't search if .out exists
max_fragment_charge = 3 # set maximum fragment charge state to analyze (allowed max 5)
max_precursor_charge = 6 # set maximum precursor charge state to analyze (allowed max 9)
nucleotide_reading_frame = 0 # 0=proteinDB, 1-6, 7=forward three, 8=reverse three, 9=all six
clip_nterm_methionine = 0 # 0=leave sequences as-is; 1=also consider sequence w/o N-term methionine
spectrum_batch_size = 0 # max. # of spectra to search at a time; 0 to search the entire scan range in one loop
decoy_prefix = DECOY_ # decoy entries are denoted by this string which is pre-pended to each protein accession
output_suffix = # add a suffix to output base names i.e. suffix "-C" generates base-C.pep.xml from base.mzXML input
mass_offsets = # one or more mass offsets to search (values substracted from deconvoluted precursor mass)
#
# spectral processing
#
minimum_peaks = 10 # required minimum number of peaks in spectrum to search (default 10)
minimum_intensity = 0 # minimum intensity value to read in
remove_precursor_peak = 0 # 0=no, 1=yes, 2=all charge reduced precursor peaks (for ETD)
remove_precursor_tolerance = 1.5 # +- Da tolerance for precursor removal
clear_mz_range = 0.0 0.0 # for iTRAQ/TMT type data; will clear out all peaks in the specified m/z range
#
# additional modifications
#
add_Cterm_peptide = 0.0
add_Nterm_peptide = 0.0
add_Cterm_protein = 0.0
add_Nterm_protein = 0.0
add_G_glycine = 0.0000 # added to G - avg. 57.0513, mono. 57.02146
add_A_alanine = 0.0000 # added to A - avg. 71.0779, mono. 71.03711
add_S_serine = 0.0000 # added to S - avg. 87.0773, mono. 87.03203
add_P_proline = 0.0000 # added to P - avg. 97.1152, mono. 97.05276
add_V_valine = 0.0000 # added to V - avg. 99.1311, mono. 99.06841
add_T_threonine = 0.0000 # added to T - avg. 101.1038, mono. 101.04768
add_C_cysteine = 57.021464 # added to C - avg. 103.1429, mono. 103.00918
add_L_leucine = 0.0000 # added to L - avg. 113.1576, mono. 113.08406
add_I_isoleucine = 0.0000 # added to I - avg. 113.1576, mono. 113.08406
add_N_asparagine = 0.0000 # added to N - avg. 114.1026, mono. 114.04293
add_D_aspartic_acid = 0.0000 # added to D - avg. 115.0874, mono. 115.02694
add_Q_glutamine = 0.0000 # added to Q - avg. 128.1292, mono. 128.05858
add_K_lysine = 0.0000 # added to K - avg. 128.1723, mono. 128.09496
add_E_glutamic_acid = 0.0000 # added to E - avg. 129.1140, mono. 129.04259
add_M_methionine = 0.0000 # added to M - avg. 131.1961, mono. 131.04048
add_O_ornithine = 0.0000 # added to O - avg. 132.1610, mono 132.08988
add_H_histidine = 0.0000 # added to H - avg. 137.1393, mono. 137.05891
add_F_phenylalanine = 0.0000 # added to F - avg. 147.1739, mono. 147.06841
add_U_selenocysteine = 0.0000 # added to U - avg. 150.3079, mono. 150.95363
add_R_arginine = 0.0000 # added to R - avg. 156.1857, mono. 156.10111
add_Y_tyrosine = 0.0000 # added to Y - avg. 163.0633, mono. 163.06333
add_W_tryptophan = 0.0000 # added to W - avg. 186.0793, mono. 186.07931
add_B_user_amino_acid = 0.0000 # added to B - avg. 0.0000, mono. 0.00000
add_J_user_amino_acid = 0.0000 # added to J - avg. 0.0000, mono. 0.00000
add_X_user_amino_acid = 0.0000 # added to X - avg. 0.0000, mono. 0.00000
add_Z_user_amino_acid = 0.0000 # added to Z - avg. 0.0000, mono. 0.00000
#
# COMET_ENZYME_INFO _must_ be at the end of this parameters file
#
[COMET_ENZYME_INFO]
0. No_enzyme 0 - -
1. Trypsin 1 KR P
2. Trypsin/P 1 KR -
3. Lys_C 1 K P
4. Lys_N 0 K -
5. Arg_C 1 R P
6. Asp_N 0 D -
7. CNBr 1 M -
8. Glu_C 1 DE P
9. PepsinA 1 FL P
10. Chymotrypsin 1 FWYL P
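With a params file like the one above, the two decoy-related settings (decoy_search and decoy_prefix) can be read back and turned into the matching -d flag for xinteract. This is a sketch under assumptions: get_param and decoy_flag are made-up helper names, and the parsing simply takes whatever sits between "=" and the first "#" on the line.

```shell
# Sketch: read decoy settings from a Comet params file.
# get_param and decoy_flag are hypothetical helpers.
get_param() {
  # $1 = params file, $2 = key; prints the value with the trailing comment removed
  sed -n "s/^[[:space:]]*${2}[[:space:]]*=[[:space:]]*\([^#]*\).*/\1/p" "$1" \
    | head -n 1 | sed 's/[[:space:]]*$//'
}

decoy_flag() {
  # $1 = params file; prints the -d<tag> flag, or fails if decoys are off
  mode=$(get_param "$1" decoy_search)
  prefix=$(get_param "$1" decoy_prefix)
  if [ -z "$mode" ] || [ "$mode" = "0" ]; then
    echo "decoy_search is off: no decoys will be in the pep.xml" >&2
    return 1
  fi
  echo "-d$prefix"
}

# Example usage (hypothetical file name):
#   decoy_flag comet.params    # for the file above this would print -dDECOY_
```

Deriving the flag from the params file keeps the tag passed to xinteract in step with the prefix Comet actually used.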
As it stands I get
INFO: Processing standard MixtureModel ...
PeptideProphet (TPP v5.0.0 Typhoon, Build 201612091438-exported (Linux-x86_64)) AKeller@ISB
read in 0 1+, 38769 2+, 36550 3+, 9547 4+, 1969 5+, 0 6+, and 0 7+ spectra.
Initialising statistical models ...
Found 0 Decoys, and 86835 Non-Decoys
WARNING: No decoys with label DECOY_ were found in this dataset. reverting to fully unsupervised method.
which kind of makes sense, as I am telling it decoy proteins are tagged with DECOY_, and I have not done this....
-d<tag> [use decoy hits to pin down the negative distribution.
the decoy protein names must begin with <tag> (whitespace is not allowed)]
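One way to check this before running xinteract is to count how many entries in the pep.xml carry the decoy tag. The sketch below assumes the pepXML output stores protein names in attributes of the form protein="..." (so it counts alternative proteins as well as top hits); count_decoy_hits is a made-up helper name.

```shell
# Sketch: count protein attributes in a pep.xml that start with the decoy tag.
# count_decoy_hits is a hypothetical helper.
count_decoy_hits() {
  # $1 = pep.xml file, $2 = decoy tag (defaults to DECOY_)
  tag=${2:-DECOY_}
  # grep -o prints each match on its own line; wc -l counts them
  grep -o "protein=\"$tag" "$1" | wc -l | tr -d ' '
}

# Example usage:
#   count_decoy_hits 20161205_Sample_1.pep.xml
```

A count of 0 means the search that produced the file did not include decoys, so the -d flag has nothing to work with until Comet is rerun with a decoy search.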
Comet automatically tags your decoy proteins with 'DECOY_'
I would argue it is not :)
If it was why is xinteract finding 0 decoys ?
Because your data are perfect.
WAIT!! maybe I need to rerun Comet?
You knew this the whole time, just using this as a teaching moment?
Well we had already gone over that decoys are only detected if you run a decoy search. I thought you were paying attention since you claimed competence.
I've also been sitting here, running TPP correctly, for the last 20 minutes of this "conversation".
Found 20718 Decoys, and 66210 Non-Decoys
definitely. And I think you can still make your ferry.
A few questions re PP.
1) What is a decoy database?
In the command below I get
WARNING: No decoys with label DECOY_ were found in this dataset. reverting to fully unsupervised method.
2) Should I be getting this warning?
2b) What does -dDECOY_ do?
3) What does -OAp mean?
4) I get 80,000 lines of WARNING. Is this normal?
Thanks!
There are about 80,000 lines with this WARNING