Closed by videlc 7 years ago
Hi @videlc, I'll forward your question to the person responsible for Philosopher development. In the meantime, could you please post more output from before this last stage where it fails?
Most likely the problem is with annotations in the FASTA file. Could you put the database you used somewhere online and send me a link? Or maybe just attach a zipped copy to a response here (file size permitting).
Hi @chhh, Thank you for your answer.
Here is the complete MSFragger GUI log :
Executing command:
$> java -jar -Xmx8G C:\Users\delv1901\Documents\MSFragger_20170103\MSFragger.jar C:\Users\delv1901\Documents\Data\20160704_altmid_mid_gfp\fragger.params C:\Users\delv1901\Documents\Data\20160704_altmid_mid_gfp\276vivian_ALG.mgf
Process started
Peptide index read in 891ms
Selected fragment tolerance 0,02 Da and maximum fragment slice size of 4955,80MB
416196452 fragments to be searched in 1 slices (3,10GB total)
Operating on slice 1 of 1:
13735ms
276vivian_ALG.mgf
4953ms
276vivian_ALG.mgf 4953ms [progress: 3593/104648 (3,43%) - 714,17 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 6021/104648 (5,75%) - 484,05 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 8611/104648 (8,23%) - 508,44 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 11002/104648 (10,51%) - 473,75 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 13279/104648 (12,69%) - 445,60 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 15460/104648 (14,77%) - 434,81 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 17371/104648 (16,60%) - 375,15 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 19308/104648 (18,45%) - 386,16 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 21142/104648 (20,20%) - 362,24 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 23001/104648 (21,98%) - 367,25 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 24739/104648 (23,64%) - 341,19 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 26533/104648 (25,35%) - 358,73 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 27997/104648 (26,75%) - 291,87 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 29803/104648 (28,48%) - 350,20 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 31578/104648 (30,18%) - 351,69 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 33218/104648 (31,74%) - 321,00 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 34814/104648 (33,27%) - 314,24 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 36350/104648 (34,74%) - 298,77 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 37927/104648 (36,24%) - 310,56 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 39361/104648 (37,61%) - 285,03 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 40772/104648 (38,96%) - 276,99 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 42393/104648 (40,51%) - 320,17 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 43950/104648 (42,00%) - 307,53 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 45366/104648 (43,35%) - 278,85 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 46734/104648 (44,66%) - 272,73 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 48292/104648 (46,15%) - 304,89 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 49765/104648 (47,55%) - 286,52 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 51288/104648 (49,01%) - 297,17 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 52665/104648 (50,33%) - 268,68 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 54079/104648 (51,68%) - 276,71 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 55556/104648 (53,09%) - 289,10 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 57055/104648 (54,52%) - 296,07 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 58582/104648 (55,98%) - 304,43 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 59955/104648 (57,29%) - 272,04 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 61282/104648 (58,56%) - 263,71 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 62773/104648 (59,98%) - 297,25 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 64262/104648 (61,41%) - 295,03 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 65851/104648 (62,93%) - 311,94 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 67515/104648 (64,52%) - 331,74 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 69200/104648 (66,13%) - 332,87 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 70872/104648 (67,72%) - 332,27 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 72778/104648 (69,55%) - 375,34 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 74773/104648 (71,45%) - 389,27 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 76672/104648 (73,27%) - 375,07 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 78745/104648 (75,25%) - 409,44 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 80945/104648 (77,35%) - 437,29 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 83154/104648 (79,46%) - 436,30 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 85962/104648 (82,14%) - 556,37 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 88942/104648 (84,99%) - 586,73 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 92779/104648 (88,66%) - 748,68 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 98459/104648 (94,09%) - 1132,38 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 103319/104648 (98,73%) - 957,07 spectra/s]
276vivian_ALG.mgf 4953ms [progress: 104648/104648 (100,00%) - 366,62 spectra/s]
- completed 267357ms
Process finished, exit value: 0
Executing command:
$> C:\Users\delv1901\Documents\MSFragger-GUI_v2.6\philosopher_windows_amd64.exe workspace --init
Process started
INFO[09:08:36] Creating workspace
WARN[09:08:36] existing workspace detected, will not overwrite
INFO[09:08:36] Done
Process finished, exit value: 0
Executing command:
$> C:\Users\delv1901\Documents\MSFragger-GUI_v2.6\philosopher_windows_amd64.exe peptideprophet --nonparam --expectscore --decoy rev --decoyprobs --masswidth 1000.0 --clevel -2 --database C:\Users\delv1901\Documents\FASTA\uniprot_hs_03_2017_GST_reverse_decoy.fasta C:\Users\delv1901\Documents\Data\20160704_altmid_mid_gfp\276vivian_ALG.pepXML
Process started
file 1: C:\Users\delv1901\Documents\Data\20160704_altmid_mid_gfp\276vivian_ALG.pepXML
processed altogether 22781 results
INFO: Results written to file: C:\Users\delv1901\Documents\Data\20160704_altmid_mid_gfp\interact-276vivian_ALG.pep.xml
- C:\Users\delv1901\Documents\Data\20160704_altmid_mid_gfp\interact-276vivian_ALG.pep.xml
- Building Commentz-Walter keyword tree...
- Searching the tree...
- Linking duplicate entries...
- Printing results...
Using Decoy Label "rev".
Decoy Probabilities will be reported.
Using non-parametric distributions
(X! Tandem) (using Tandem's expectation score for modeling)
init with X! Tandem trypsin
PeptideProphet (TPP v5.0.1 Post-Typhoon dev, Build 201705191533-7588 (Windows_NT-x86_64)) AKeller@ISB
read in 0 1+, 10110 2+, 9045 3+, 2620 4+, 564 5+, 229 6+, and 0 7+ spectra.
Found 0 Decoys, and 22568 Non-Decoys
WARNING: No decoys with label rev were found in this dataset. reverting to fully unsupervised method.
negmean = 0.0533258
MS Instrument info: Manufacturer: UNKNOWN, Model: UNKNOWN, Ionization: UNKNOWN, Analyzer: UNKNOWN, Detector: UNKNOWN
INFO: Processing standard MixtureModel ...
Initialising statistical models ...
INFO[09:08:54] Done
Process finished, exit value: 0
Executing command:
$> C:\Users\delv1901\Documents\MSFragger-GUI_v2.6\philosopher_windows_amd64.exe workspace --clean
Process started
INFO[09:08:54] Removing workspace
WARN[09:08:54] cannot remove the meta data: remove .meta\meta.bin: The process cannot access the file because it is being used by another process.
INFO[09:08:54] Done
Process finished, exit value: 0
Executing command:
$> C:\Users\delv1901\Documents\MSFragger-GUI_v2.6\philosopher_windows_amd64.exe workspace --init
Process started
INFO[09:08:54] Creating workspace
WARN[09:08:54] existing workspace detected, will not overwrite
INFO[09:08:54] Done
Process finished, exit value: 0
Executing command:
$> C:\Users\delv1901\Documents\MSFragger-GUI_v2.6\philosopher_windows_amd64.exe proteinprophet --output interact --maxppmdiff 20.0 interact-276vivian_ALG.pep.xml
Process started
ProteinProphet (C++) by Insilicos LLC and LabKey Software, after the original Perl by A. Keller (TPP v5.0.1 Post-Typhoon dev, Build 201705191533-7588 (Windows_NT-x86_64))
(no FPKM) (using degen pep info)
Reading in C:/Users/delv1901/Documents/Data/20160704_altmid_mid_gfp/interact-276vivian_ALG.pep.xml...
did not find any PeptideProphet results in input data! Did you forget to run PeptideProphet?
...read in 0 1+, 0 2+, 0 3+, 0 4+, 0 5+, 0 6+, 0 7+ spectra with min prob 0
WARNING: no data - output file will be empty
INFO[09:08:58] Done
Process finished, exit value: 0
Executing command:
$> C:\Users\delv1901\Documents\MSFragger-GUI_v2.6\philosopher_windows_amd64.exe workspace --clean
Process started
INFO[09:08:58] Removing workspace
WARN[09:08:58] cannot remove the meta data: remove .meta\meta.bin: The process cannot access the file because it is being used by another process.
INFO[09:08:58] Done
Process finished, exit value: 0
Executing command:
$> C:\Users\delv1901\Documents\MSFragger-GUI_v2.6\philosopher_windows_amd64.exe workspace --init
Process started
INFO[09:08:58] Creating workspace
WARN[09:08:58] existing workspace detected, will not overwrite
INFO[09:08:58] Done
Process finished, exit value: 0
Executing command:
$> C:\Users\delv1901\Documents\MSFragger-GUI_v2.6\philosopher_windows_amd64.exe database --annotate C:\Users\delv1901\Documents\FASTA\uniprot_hs_03_2017_GST_reverse_decoy.fasta
Process started
INFO[09:08:58] Processing database
INFO[09:09:30] Done
Process finished, exit value: 0
Executing command:
$> C:\Users\delv1901\Documents\MSFragger-GUI_v2.6\philosopher_windows_amd64.exe filter --mapmods --sequential --pepxml C:\Users\delv1901\Documents\Data\20160704_altmid_mid_gfp --protxml C:\Users\delv1901\Documents\Data\20160704_altmid_mid_gfp\interact.prot.xml
Process started
INFO[09:09:30] Processing peptide identification files
INFO[09:09:34] 1+ Charge profile decoy=0 target=0
INFO[09:09:34] 2+ Charge profile decoy=2019 target=18584
INFO[09:09:34] 3+ Charge profile decoy=2386 target=15240
INFO[09:09:34] 4+ Charge profile decoy=815 target=4289
INFO[09:09:34] 5+ Charge profile decoy=188 target=950
INFO[09:09:34] 6+ Charge profile decoy=94 target=348
INFO[09:09:34] Database search results ions=31664 peptides=29682 psms=45464
INFO[09:09:34] Converged to 0.00 % FDR with 45464 PSMs decoy=0 threshold=0 total=45464
INFO[09:09:34] Converged to 0.00 % FDR with 29682 Peptides decoy=0 threshold=0 total=29682
INFO[09:09:35] Converged to 0.00 % FDR with 31664 Ions decoy=0 threshold=0 total=31664
FATA[09:09:35] No Protein groups detected, check your file and try again
Process finished, exit value: 1
FASTA will be sent to you via link.
Best regards, Vivian
Here is the FASTA file https://www.dropbox.com/s/1i4qff22v5f6p3j/uniprot_hs_03_2017_GST_reverse_decoy.fasta?dl=0 .
Thanks for the log and FASTA @videlc!
It looks like PeptideProphet could not find any decoy hits, so it fell back to the fully unsupervised method and failed silently:
PeptideProphet (TPP v5.0.1 Post-Typhoon dev, Build 201705191533-7588 (Windows_NT-x86_64)) AKeller@ISB
read in 0 1+, 10110 2+, 9045 3+, 2620 4+, 564 5+, 229 6+, and 0 7+ spectra.
Found 0 Decoys, and 22568 Non-Decoys
WARNING: No decoys with label rev were found in this dataset. reverting to fully unsupervised method.
...
INFO: Processing standard MixtureModel ...
Initialising statistical models ...
INFO[09:08:54] Done
Process finished, exit value: 0
So ProteinProphet didn't find any PeptideProphet results and didn't do anything:
ProteinProphet (C++) by Insilicos LLC and LabKey Software, after the original Perl by A. Keller (TPP v5.0.1 Post-Typhoon dev, Build 201705191533-7588 (Windows_NT-x86_64))
(no FPKM) (using degen pep info)
Reading in C:/Users/delv1901/Documents/Data/20160704_altmid_mid_gfp/interact-276vivian_ALG.pep.xml...
did not find any PeptideProphet results in input data! Did you forget to run PeptideProphet?
...read in 0 1+, 0 2+, 0 3+, 0 4+, 0 5+, 0 6+, 0 7+ spectra with min prob 0
WARNING: no data - output file will be empty
@videlc would you mind sharing 276vivian_ALG.mgf as well, so we could look into the issue? It's strange that using open search you got 20k forward hits and zero decoys.
Yes, looks like there might be a problem there. However, the "second search" was able to find decoys, that's why I thought it was OK.
INFO[09:09:34] 1+ Charge profile decoy=0 target=0
INFO[09:09:34] 2+ Charge profile decoy=2019 target=18584
INFO[09:09:34] 3+ Charge profile decoy=2386 target=15240
INFO[09:09:34] 4+ Charge profile decoy=815 target=4289
INFO[09:09:34] 5+ Charge profile decoy=188 target=950
INFO[09:09:34] 6+ Charge profile decoy=94 target=348
MS file link will be sent to you via email. Vivian
@videlc, Felipe (the person developing Philosopher) tells me that in your FASTA file the sequences are marked as reverse "incorrectly".
You have this:
tr|rev_A0A024QYW1|A0A024QYW1_HUMAN Isoform of A6NGB0, Transmembrane protein 191C OS=Homo sapiens GN=DKFZp434N035 PE=4 SV=1
and it "should" be this:
rev_tr|A0A024QYW1|A0A024QYW1_HUMAN Isoform of A6NGB0, Transmembrane protein 191C OS=Homo sapiens GN=DKFZp434N035 PE=4 SV=1
Notice that the rev_ modifier moved from the protein accession to the front of the whole description string.
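To illustrate the change described above, here is a minimal Python sketch that moves a decoy tag from the accession field to the front of a FASTA header. It is not part of Philosopher or MSFragger; the function name `move_decoy_tag` is hypothetical, and the sketch assumes UniProt-style headers of the form `>db|accession|name description` with the tag glued to the accession.

```python
def move_decoy_tag(header: str, tag: str = "rev_") -> str:
    """Move `tag` from the accession field to the start of a FASTA header.

    Hypothetical helper: assumes headers shaped like
    '>db|<tag>ACCESSION|NAME description'. Any other line is returned as-is.
    """
    if not header.startswith(">"):
        return header  # sequence line, not a header
    # Split '>tr|rev_ACC|NAME ...' into 'tr' and 'rev_ACC|NAME ...'
    db, sep, rest = header[1:].partition("|")
    if sep and rest.startswith(tag):
        # Re-attach the tag in front of the database field
        return f">{tag}{db}|{rest[len(tag):]}"
    return header  # target entry, or tag already at the front


if __name__ == "__main__":
    before = (">tr|rev_A0A024QYW1|A0A024QYW1_HUMAN Isoform of A6NGB0, "
              "Transmembrane protein 191C OS=Homo sapiens")
    print(move_decoy_tag(before))
```

Applied line by line to a database, this leaves target entries and sequence lines untouched and only rewrites headers where the tag sits directly after the first "|".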
@videlc What tool did you use to generate the DB with reverse-protein decoys?
Oh, I thought rev should be placed directly before the accession (I usually saw DECOY or REVERSE there), so I assumed that was the way to go.
The tool I used is this: https://www.ruhr-uni-bochum.de/mpc/software/DecoyBuilder/index.html.en It generates concatenated target/decoy FASTA files from target-only FASTA files. The decoy tag is added before the accession (after the "|"). I replaced it with "rev_" so that it would meet the GUI's expectations.
Editing the FASTA fixed the issue. Which tool should I have used to generate the FASTA to avoid this problem?
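A database can be checked for this kind of problem before running the pipeline. The sketch below is a hedged example in Python, not a feature of Philosopher or the GUI: the function name `classify_headers` is hypothetical, and it assumes (based on the explanation in this thread) that a decoy is only recognized when the header starts with the label.

```python
from collections import Counter
from typing import Iterable


def classify_headers(lines: Iterable[str], tag: str = "rev_") -> Counter:
    """Count FASTA headers by where the decoy tag appears.

    'decoy'     : header starts with the tag (the placement reported to work here)
    'misplaced' : tag appears elsewhere in the identifier, e.g. on the accession
    'target'    : no tag in the identifier
    """
    counts: Counter = Counter()
    for line in lines:
        if not line.startswith(">"):
            continue  # skip sequence lines
        identifier = line.split(None, 1)[0]  # text up to the first whitespace
        if identifier.startswith(">" + tag):
            counts["decoy"] += 1
        elif tag in identifier:
            counts["misplaced"] += 1
        else:
            counts["target"] += 1
    return counts


if __name__ == "__main__":
    sample = [
        ">tr|A0A024QYW1|A0A024QYW1_HUMAN Transmembrane protein 191C",
        "MKT",
        ">tr|rev_A0A024QYW1|A0A024QYW1_HUMAN Transmembrane protein 191C",
        "TKM",
    ]
    print(classify_headers(sample))
```

To check a real file, pass an open file handle instead of the sample list. A non-zero "misplaced" count together with a "decoy" count of zero would match the symptom in the log above, where PeptideProphet reported "Found 0 Decoys".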
@videlc Philosopher has a philosopher.exe database ... command, which provides some tools to create those databases and append contaminants.
TPP comes with Perl scripts for decoy database generation. OpenMS also has a tool, but the format will be incompatible with PeptideProphet/ProteinProphet :(
We're working on better support for database generation.
Thank you @chhh for the tips and quick support ! Will continue my MSFragger GUI exploration
Hey,
I gave MSFragger GUI a try with the standard configuration (open search + Philosopher). Everything works until the grouping step. I am using a concatenated target/decoy database from UniProt (canonical + isoforms, Swiss-Prot + TrEMBL, 03/2017), and my decoys are flagged with the "rev_" tag. Is there something I need to change in the FASTA?
Cheers, Vivian