@azhang8 could you please post the whole output of the run?
Is it just looping with the same messages? Unless you unchecked the checkbox on the MSFragger tab, by that point in the execution (the one you've posted) MSFragger has already finished searching. It's PeptideProphet that gets stuck for some reason.
Hi,
Here is the complete output of the run. Is there anything we can do to get PeptideProphet to work?
Thanks,
Austin
Will execute 13 commands: java -jar C:\Program Files\MSFragger_20170103_v2\MSFragger_20170103\MSFragger.jar C:\Program Files\MSFragger-GUI_v2.6\4234_output\fragger.params C:\Program Files\MSFragger-GUI_v2.6\ID22689_04_E749_4234_031116.mzML
java -cp C:\Program Files\MSFragger-GUI_v2.6\MSFragger-GUI.jar umich.msfragger.util.FileMove C:\Program Files\MSFragger-GUI_v2.6\ID22689_04_E749_4234_031116.pepXML C:\Program Files\MSFragger-GUI_v2.6\4234_output\ID22689_04_E749_4234_031116.pepXML
C:\Program Files\MSFragger-GUI_v2.6\philosopher-source_windows_amd64.exe workspace --init
C:\Program Files\MSFragger-GUI_v2.6\philosopher-source_windows_amd64.exe peptideprophet --nonparam --expectscore --decoy rev --decoyprobs --masswidth 1000.0 --clevel -2 --database C:\Program Files\MSFragger-GUI_v2.6\rSP_Hu_Mix1_090716.fasta C:\Program Files\MSFragger-GUI_v2.6\4234_output\ID22689_04_E749_4234_031116.pepXML
C:\Program Files\MSFragger-GUI_v2.6\philosopher-source_windows_amd64.exe workspace --clean
C:\Program Files\MSFragger-GUI_v2.6\philosopher-source_windows_amd64.exe workspace --init
C:\Program Files\MSFragger-GUI_v2.6\philosopher-source_windows_amd64.exe proteinprophet --output interact --maxppmdiff 20.0 interact-ID22689_04_E749_4234_031116.pep.xml
C:\Program Files\MSFragger-GUI_v2.6\philosopher-source_windows_amd64.exe workspace --clean
C:\Program Files\MSFragger-GUI_v2.6\philosopher-source_windows_amd64.exe workspace --init
C:\Program Files\MSFragger-GUI_v2.6\philosopher-source_windows_amd64.exe database --annotate C:\Program Files\MSFragger-GUI_v2.6\rSP_Hu_Mix1_090716.fasta
C:\Program Files\MSFragger-GUI_v2.6\philosopher-source_windows_amd64.exe filter --mapmods --sequential --pepxml C:\Program Files\MSFragger-GUI_v2.6\4234_output --protxml C:\Program Files\MSFragger-GUI_v2.6\4234_output\interact.prot.xml
C:\Program Files\MSFragger-GUI_v2.6\philosopher-source_windows_amd64.exe report
C:\Program Files\MSFragger-GUI_v2.6\philosopher-source_windows_amd64.exe workspace --clean
Executing command:
$> java -jar C:\Program Files\MSFragger_20170103_v2\MSFragger_20170103\MSFragger.jar C:\Program Files\MSFragger-GUI_v2.6\4234_output\fragger.params C:\Program Files\MSFragger-GUI_v2.6\ID22689_04_E749_4234_031116.mzML
Process started
Peptide index read in 640ms
Selected fragment tolerance 0.02 Da and maximum fragment slice size of 4966.13MB
327469124 fragments to be searched in 1 slices (2.44GB total)
Operating on slice 1 of 1:
10796ms
ID22689_04_E749_4234_031116.mzML
9047ms
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 1453/45449 (3.20%) - 284.40 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 2248/45449 (4.95%) - 156.56 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 3009/45449 (6.62%) - 151.71 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 3776/45449 (8.31%) - 149.66 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 4562/45449 (10.04%) - 153.37 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 5304/45449 (11.67%) - 146.12 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 6045/45449 (13.30%) - 145.01 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 6792/45449 (14.94%) - 148.48 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 7520/45449 (16.55%) - 142.91 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 8282/45449 (18.22%) - 149.15 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 9031/45449 (19.87%) - 146.60 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 9784/45449 (21.53%) - 149.20 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 10539/45449 (23.19%) - 148.68 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 11281/45449 (24.82%) - 146.09 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 12008/45449 (26.42%) - 144.50 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 12716/45449 (27.98%) - 140.73 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 13403/45449 (29.49%) - 136.55 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 14088/45449 (31.00%) - 136.56 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 14762/45449 (32.48%) - 134.37 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 15408/45449 (33.90%) - 128.40 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 16058/45449 (35.33%) - 128.79 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 16735/45449 (36.82%) - 132.51 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 17432/45449 (38.36%) - 136.83 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 18125/45449 (39.88%) - 138.16 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 18839/45449 (41.45%) - 142.37 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 19581/45449 (43.08%) - 146.12 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 20333/45449 (44.74%) - 147.62 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 21104/45449 (46.43%) - 151.35 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 21871/45449 (48.12%) - 150.13 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 22639/45449 (49.81%) - 151.69 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 23413/45449 (51.51%) - 154.34 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 24359/45449 (53.60%) - 186.26 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 25334/45449 (55.74%) - 192.61 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 26317/45449 (57.90%) - 192.97 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 27268/45449 (60.00%) - 189.63 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 28220/45449 (62.09%) - 187.44 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 29224/45449 (64.30%) - 198.97 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 30385/45449 (66.86%) - 226.54 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 31560/45449 (69.44%) - 229.94 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 32742/45449 (72.04%) - 235.65 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 33902/45449 (74.59%) - 229.89 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 35185/45449 (77.42%) - 250.34 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 36839/45449 (81.06%) - 325.65 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 38503/45449 (84.72%) - 330.75 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 40197/45449 (88.44%) - 330.54 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 42563/45449 (93.65%) - 467.40 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 43956/45449 (96.71%) - 275.13 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 45438/45449 (99.98%) - 294.57 spectra/s]
ID22689_04_E749_4234_031116.mzML 9047ms [progress: 45449/45449 (100.00%) - 50.23 spectra/s]
- completed 243766ms
Process finished, exit value: 0
Executing command:
$> java -cp C:\Program Files\MSFragger-GUI_v2.6\MSFragger-GUI.jar umich.msfragger.util.FileMove C:\Program Files\MSFragger-GUI_v2.6\ID22689_04_E749_4234_031116.pepXML C:\Program Files\MSFragger-GUI_v2.6\4234_output\ID22689_04_E749_4234_031116.pepXML
Process started
Process finished, exit value: 0
Executing command:
$> C:\Program Files\MSFragger-GUI_v2.6\philosopher-source_windows_amd64.exe workspace --init
Process started
INFO[15:21:09] Creating workspace
WARN[15:21:09] existing workspace detected, will not overwrite
INFO[15:21:09] Done
Process finished, exit value: 0
Executing command:
$> C:\Program Files\MSFragger-GUI_v2.6\philosopher-source_windows_amd64.exe peptideprophet --nonparam --expectscore --decoy rev --decoyprobs --masswidth 1000.0 --clevel -2 --database C:\Program Files\MSFragger-GUI_v2.6\rSP_Hu_Mix1_090716.fasta C:\Program Files\MSFragger-GUI_v2.6\4234_output\ID22689_04_E749_4234_031116.pepXML
Process started
Failed to open input file 'C:\Program Files\MSFragger-GUI_v2.6\4234_output/ID22689_04_E749_4234_031116.mzXML'.
WARNING: cannot open data file C:\Program Files\MSFragger-GUI_v2.6\4234_output/ID22689_04_E749_4234_031116.mzXML in msms_run_summary tag... trying .mzML ...
Failed to open input file 'C:\Program Files\MSFragger-GUI_v2.6\4234_output/ID22689_04_E749_4234_031116.mzML'.
WARNING: CANNOT correct data file C:\Program Files\MSFragger-GUI_v2.6\4234_output/ID22689_04_E749_4234_031116.mzML in msms_run_summary tag...
Failed to open input file 'C:\Program Files\MSFragger-GUI_v2.6\4234_output/ID22689_04_E749_4234_031116.mzXML'.
WARNING: cannot open data file C:\Program Files\MSFragger-GUI_v2.6\4234_output/ID22689_04_E749_4234_031116.mzXML in msms_run_summary tag... trying .mzML ...
Failed to open input file 'C:\Program Files\MSFragger-GUI_v2.6\4234_output/ID22689_04_E749_4234_031116.mzML'.
WARNING: CANNOT correct data file C:\Program Files\MSFragger-GUI_v2.6\4234_output/ID22689_04_E749_4234_031116.mzML in msms_run_summary tag...
file 1: C:\Program Files\MSFragger-GUI_v2.6\4234_output\ID22689_04_E749_4234_031116.pepXML
processed altogether 39233 results
INFO: Results written to file: C:\Program Files\MSFragger-GUI_v2.6\4234_output\interact-ID22689_04_E749_4234_031116.pep.xml
- C:\Program Files\MSFragger-GUI_v2.6\4234_output\interact-ID22689_04_E749_4234_031116.pep.xml
- Building Commentz-Walter keyword tree...
- Searching the tree...
- Linking duplicate entries...
- Printing results...
Using Decoy Label "rev".
Decoy Probabilities will be reported.
Using non-parametric distributions
(X! Tandem) (using Tandem's expectation score for modeling)
init with X! Tandem trypsin
PeptideProphet (TPP v5.0.1 Post-Typhoon dev, Build 201705191533-7588 (Windows_NT-x86_64)) AKeller@ISB
read in 0 1+, 15898 2+, 13530 3+, 6051 4+, 2937 5+, 785 6+, and 24 7+ spectra.
Found 0 Decoys, and 39225 Non-Decoys
WARNING: No decoys with label rev were found in this dataset. reverting to fully unsupervised method.
negmean = 0.0533258
MS Instrument info: Manufacturer: UNKNOWN, Model: UNKNOWN, Ionization: UNKNOWN, Analyzer: UNKNOWN, Detector: UNKNOWN
INFO: Processing standard MixtureModel ...
Initialising statistical models ...
INFO[15:21:23] Done
Process finished, exit value: 0
Executing command:
$> C:\Program Files\MSFragger-GUI_v2.6\philosopher-source_windows_amd64.exe workspace --clean
Process started
INFO[15:21:24] Removing workspace
WARN[15:21:24] cannot remove the meta data: remove .meta\meta.bin: The process cannot access the file because it is being used by another process.
INFO[15:21:24] Done
Process finished, exit value: 0
Executing command:
$> C:\Program Files\MSFragger-GUI_v2.6\philosopher-source_windows_amd64.exe workspace --init
Process started
INFO[15:21:24] Creating workspace
WARN[15:21:24] existing workspace detected, will not overwrite
INFO[15:21:24] Done
Process finished, exit value: 0
Executing command:
$> C:\Program Files\MSFragger-GUI_v2.6\philosopher-source_windows_amd64.exe proteinprophet --output interact --maxppmdiff 20.0 interact-ID22689_04_E749_4234_031116.pep.xml
Process started
ProteinProphet (C++) by Insilicos LLC and LabKey Software, after the original Perl by A. Keller (TPP v5.0.1 Post-Typhoon dev, Build 201705191533-7588 (Windows_NT-x86_64))
(no FPKM) (using degen pep info)
Reading in C:/Program Files/MSFragger-GUI_v2.6/4234_output/interact-ID22689_04_E749_4234_031116.pep.xml...
did not find any PeptideProphet results in input data! Did you forget to run PeptideProphet?
...read in 0 1+, 0 2+, 0 3+, 0 4+, 0 5+, 0 6+, 0 7+ spectra with min prob 0
WARNING: no data - output file will be empty
INFO[15:21:27] Done
Process finished, exit value: 0
Executing command:
$> C:\Program Files\MSFragger-GUI_v2.6\philosopher-source_windows_amd64.exe workspace --clean
Process started
INFO[15:21:27] Removing workspace
WARN[15:21:27] cannot remove the meta data: remove .meta\meta.bin: The process cannot access the file because it is being used by another process.
INFO[15:21:27] Done
Process finished, exit value: 0
Executing command:
$> C:\Program Files\MSFragger-GUI_v2.6\philosopher-source_windows_amd64.exe workspace --init
Process started
INFO[15:21:27] Creating workspace
WARN[15:21:27] existing workspace detected, will not overwrite
INFO[15:21:27] Done
Process finished, exit value: 0
Executing command:
$> C:\Program Files\MSFragger-GUI_v2.6\philosopher-source_windows_amd64.exe database --annotate C:\Program Files\MSFragger-GUI_v2.6\rSP_Hu_Mix1_090716.fasta
Process started
INFO[15:21:28] Processing database
FATA[15:21:28] Cannot identify the database type
Process finished, exit value: 1
Executing command:
$> C:\Program Files\MSFragger-GUI_v2.6\philosopher-source_windows_amd64.exe filter --mapmods --sequential --pepxml C:\Program Files\MSFragger-GUI_v2.6\4234_output --protxml C:\Program Files\MSFragger-GUI_v2.6\4234_output\interact.prot.xml
Process started
INFO[15:21:29] Processing peptide identification files
INFO[15:21:32] 1+ Charge profile decoy=0 target=0
INFO[15:21:32] 2+ Charge profile decoy=0 target=15904
INFO[15:21:32] 3+ Charge profile decoy=0 target=13530
INFO[15:21:32] 4+ Charge profile decoy=0 target=6051
INFO[15:21:32] 5+ Charge profile decoy=0 target=2937
INFO[15:21:32] 6+ Charge profile decoy=0 target=785
INFO[15:21:32] Database search results ions=23212 peptides=21015 psms=39233
INFO[15:21:32] Converged to 0.00 % FDR with 39233 PSMs decoy=0 threshold=0 total=39233
INFO[15:21:32] Converged to 0.00 % FDR with 21015 Peptides decoy=0 threshold=0 total=21015
INFO[15:21:33] Converged to 0.00 % FDR with 23212 Ions decoy=0 threshold=0 total=23212
FATA[15:21:33] No Protein groups detected, check your file and try again
Process finished, exit value: 1
Executing command:
$> C:\Program Files\MSFragger-GUI_v2.6\philosopher-source_windows_amd64.exe report
Process started
INFO[15:21:33] Creating PSM report
INFO[15:21:33] Creating peptide Ion report
INFO[15:21:33] Creating peptide report
INFO[15:21:33] Done
Process finished, exit value: 0
Executing command:
$> C:\Program Files\MSFragger-GUI_v2.6\philosopher-source_windows_amd64.exe workspace --clean
Process started
INFO[15:21:34] Removing workspace
WARN[15:21:34] cannot remove the meta data: remove .meta\meta.bin: The process cannot access the file because it is being used by another process.
INFO[15:21:34] Done
Process finished, exit value: 0
=========================
===
=== Done
===
=========================
As far as I can tell, the problem is that the input file, C:\Program Files\MSFragger-GUI_v2.6\ID22689_04_E749_4234_031116.mzML, is located in a directory other than the output directory. So when Philosopher runs on the output directory, C:\Program Files\MSFragger-GUI_v2.6\4234_output/, it cannot find the corresponding mzML file (because it is not looking in the right place). This could be worked around manually by setting the input and output directories to the same folder (until the developers implement searching for the files in the originally specified input directory).
In addition, note that when it runs
Executing command:
$> C:\Program Files\MSFragger-GUI_v2.6\philosopher-source_windows_amd64.exe database --annotate C:\Program Files\MSFragger-GUI_v2.6\rSP_Hu_Mix1_090716.fasta
Process started
INFO[15:21:28] Processing database
FATA[15:21:28] Cannot identify the database type
Process finished, exit value: 1
it does not process the database to generate the decoys, which would explain the later lack of decoy matches:
INFO[15:21:32] 1+ Charge profile decoy=0 target=0
INFO[15:21:32] 2+ Charge profile decoy=0 target=15904
INFO[15:21:32] 3+ Charge profile decoy=0 target=13530
INFO[15:21:32] 4+ Charge profile decoy=0 target=6051
INFO[15:21:32] 5+ Charge profile decoy=0 target=2937
INFO[15:21:32] 6+ Charge profile decoy=0 target=785
INFO[15:21:32] Database search results ions=23212 peptides=21015 psms=39233
INFO[15:21:32] Converged to 0.00 % FDR with 39233 PSMs decoy=0 threshold=0 total=39233
INFO[15:21:32] Converged to 0.00 % FDR with 21015 Peptides decoy=0 threshold=0 total=21015
INFO[15:21:33] Converged to 0.00 % FDR with 23212 Ions decoy=0 threshold=0 total=23212
FATA[15:21:33] No Protein groups detected, check your file and try again
Process finished, exit value: 1
I hope this can be easily fixed.
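To see why the filter step can report 0.00 % FDR while keeping every PSM, here is a minimal sketch of a target-decoy estimate. The formula (decoy hits divided by target hits) is a common convention and an assumption here; the thread does not say how Philosopher computes it, and `estimate_fdr` is a hypothetical helper.

```python
def estimate_fdr(num_targets: int, num_decoys: int) -> float:
    """Simple target-decoy FDR estimate: decoy hits / target hits (assumed formula)."""
    if num_targets == 0:
        return 0.0
    return num_decoys / num_targets


# PSM counts taken from the filter log above: decoy=0, total=39233.
print(f"{estimate_fdr(39233, 0):.2%}")
```

With zero decoys the estimate is zero at any score threshold, so nothing is filtered out and the reported FDR carries no information.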
Those warnings about mzML/mzXML are not critical. The raw files are not really needed, so it's OK. On Linux it should actually create a symlink to the raw files in the output directory; on Windows, creating symlinks requires special permissions, and we decided against copying whole files.
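The symlink behaviour described here can be sketched roughly as follows. This is only an illustration and not the GUI's actual code; `link_raw_file` is a hypothetical helper. It tries to create the link and simply reports failure (for example on Windows without the required privilege) instead of copying the file.

```python
import os
from pathlib import Path


def link_raw_file(raw_file: Path, output_dir: Path) -> bool:
    """Try to symlink raw_file into output_dir; return True if the file is reachable there."""
    link = output_dir / raw_file.name
    if link.exists():
        return True
    try:
        os.symlink(raw_file, link)
        return True
    except (OSError, NotImplementedError):
        # Typical on Windows without symlink privileges: give up rather than copy.
        return False
```

A caller could log a warning when this returns False and carry on, since the raw file is not required by the downstream steps.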
The program won't create a database with decoys for you (philosopher database --annotate serves a different purpose); you're expected to provide a database yourself. Philosopher does bundle tools that can download a database and append decoys for you.
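A quick way to check whether a FASTA file already contains decoys is to count the headers that carry the decoy label. The sketch below assumes decoy headers start with the label passed to `--decoy` followed by an underscore (`rev_`); that convention, the helper name `count_targets_and_decoys`, and the example entries are assumptions for illustration.

```python
import io
from typing import TextIO, Tuple


def count_targets_and_decoys(fasta: TextIO, decoy_prefix: str = "rev_") -> Tuple[int, int]:
    """Count FASTA headers, treating those that begin with decoy_prefix as decoys."""
    targets = decoys = 0
    for line in fasta:
        if not line.startswith(">"):
            continue  # sequence line
        if line[1:].startswith(decoy_prefix):
            decoys += 1
        else:
            targets += 1
    return targets, decoys


# Hypothetical two-entry database: one target and its reversed decoy.
example = io.StringIO(
    ">sp|P00001|EXAMPLE_HUMAN Hypothetical protein\nMKT\n"
    ">rev_sp|P00001|EXAMPLE_HUMAN Hypothetical protein\nTKM\n"
)
print(count_targets_and_decoys(example))
```

If the decoy count comes back as zero, the search results cannot contain decoy matches either.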
The problem is that "philosopher database --annotate" could not parse the DB. It most likely didn't like something about protein description strings. We're working on a fix.
Hi,
Thanks for the help. We also were wondering whether or not setting the additional modifications to 0, as done in the screenshot below, means that any modification is open to be searched for. Or should they be unchecked?
Additional Modifications are essentially fixed modifications, it's like modifying the mass of an amino acid (or terminus) by a fixed value.
The checkboxes are there for convenience - if you uncheck a box, it won't be included in the parameter file that is written for MSFragger, but the delta mass value won't be forgotten, so you can reactivate it later. Having a value of zero is the same as unchecking the box.
The same goes for the checkboxes of Variable Mods. They're there for convenience, so that you don't have to retype specificity and exact mass values in case you just want to turn them off temporarily.
If you fixed Cysteines chemically, I'd recommend setting the additional mod for C to 57.021464, otherwise all the mass-shifts for Cysteine-containing peptides will be off by 57, but it's up to you.
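The arithmetic behind this recommendation can be sketched as follows. It assumes the cysteines really carry a +57.021464 modification and that the unexplained shift scales with the number of cysteines in the peptide; `unexplained_cysteine_shift` is a hypothetical helper, and the peptide is a fragment of the 4EBP2_HUMAN sequence quoted later in this thread.

```python
CARBAMIDOMETHYL = 57.021464  # delta mass quoted above for chemically modified cysteine


def unexplained_cysteine_shift(peptide: str, add_c: float) -> float:
    """Mass shift left unexplained when cysteines carry CARBAMIDOMETHYL
    but the search adds only add_c per cysteine."""
    return peptide.count("C") * (CARBAMIDOMETHYL - add_c)


peptide = "TVAISDAAQLPHDYCTTPGGTLFSTTPGGTR"  # contains one cysteine
print(unexplained_cysteine_shift(peptide, 0.0))              # additional mod left at zero
print(unexplained_cysteine_shift(peptide, CARBAMIDOMETHYL))  # additional mod set for C
```

With the additional mod left at zero, every cysteine-containing peptide shows up with an extra shift of about 57 per cysteine; with it set, the shift disappears.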
For each run, if MSFragger was set to run, you will find the .param file that was used for the search in the output directory. You can also click the Save button at the top of this page to save the parameter file separately; that way you can check the effects of various options on this configuration page. You can also load previously saved param files (or param files from previously used output directories).
If you inspect the log carefully you will see that not only Philosopher failed but also PeptideProphet and ProteinProphet. It seems that the problem stems from a badly formatted database file.
@azhang8, if you share your database file I may be able to guide you on how to fix it.
@azhang8 Has the issue been resolved?
@azhang8 I replied to your e-mail on Jun 22 with the instructions you have to follow, please check your e-mail.
Hi, I have similar issues with a fasta file. Please see below. Tried to generate a combined forward and reverse, did it in this format:
>sp|Q13542|4EBP2_HUMAN Eukaryotic translation initiation factor 4E-binding protein 2 OS=Homo sapiens GN=EIF4EBP2 PE=1 SV=1
MSSSAGSGHQPSQSRAIPTRTVAISDAAQLPHDYCTTPGGTLFSTTPGGTRIIYDRKFLLDRRNSPMAQTPPCHLPNIPGVTSPGTLIEDSKVEVNNLNNLNNHDRKHAVGDDAQFEMDI
>sp|REV_Q13542|4EBP2_HUMAN Eukaryotic translation initiation factor 4E-binding protein 2 OS=Homo sapiens GN=EIF4EBP2 PE=1 SV=1
PRPLPNRCFPGDNSEHIQDAIDAPQFLYGDHDYARPMKDRGISSLSVRLLTPTSPGTRGPSRTAMTPHNLSADTSTHNDECTTAVIVKLNIKNIVEQLVSNAGDNFHAPTGMGSSTQGIQ
It does not recognize the reverse, can anyone help?
Also got a couple of warnings regarding PeptideProphet/ProteinProphet. Is the issue with reading rev hits causing it? Please see below:
Will execute 12 commands:
java -jar -Xmx8G C:\Users\rserwa\Desktop\working directory fragger\MSFragger.jar C:\Users\rserwa\Desktop\working directory fragger\outcome\fragger.params C:\Users\rserwa\Desktop\working directory fragger\outcome\RS_MCF-1.mzML
philosopher_windows_amd64.exe workspace --init
philosopher_windows_amd64.exe peptideprophet --nonparam --expectscore --decoy rev --decoyprobs --masswidth 1000.0 --clevel 2 --database C:\Users\rserwa\Desktop\working directory fragger\human_fragger.fasta C:\Users\rserwa\Desktop\working directory fragger\outcome\RS_MCF-1.pepXML
philosopher_windows_amd64.exe workspace --clean
philosopher_windows_amd64.exe workspace --init
philosopher_windows_amd64.exe proteinprophet --output interact --maxppmdiff 20 interact-RS_MCF-1.pep.xml
philosopher_windows_amd64.exe workspace --clean
philosopher_windows_amd64.exe workspace --init
philosopher_windows_amd64.exe database --annotate C:\Users\rserwa\Desktop\working directory fragger\human_fragger.fasta
philosopher_windows_amd64.exe filter --sequential --mapmods --pepxml C:\Users\rserwa\Desktop\working directory fragger\outcome --protxml C:\Users\rserwa\Desktop\working directory fragger\outcome\interact.prot.xml
philosopher_windows_amd64.exe report
philosopher_windows_amd64.exe workspace --clean
~~~~~~~~~~~~~~~~~~~~~~
Executing command:
$> java -jar -Xmx8G C:\Users\rserwa\Desktop\working directory fragger\MSFragger.jar C:\Users\rserwa\Desktop\working directory fragger\outcome\fragger.params C:\Users\rserwa\Desktop\working directory fragger\outcome\RS_MCF-1.mzML
Process started
Peptide index read in 107ms
Selected fragment tolerance 0.02 Da and maximum fragment slice size of 4985.45MB
171161740 fragments to be searched in 1 slices (1.27GB total)
Operating on slice 1 of 1:
4167ms
RS_MCF-1.mzML
8245ms
RS_MCF-1.mzML 8245ms [progress: 2152/29397 (7.32%) - 426.14 spectra/s]
RS_MCF-1.mzML 8245ms [progress: 4216/29397 (14.34%) - 409.44 spectra/s]
RS_MCF-1.mzML 8245ms [progress: 6293/29397 (21.41%) - 406.86 spectra/s]
RS_MCF-1.mzML 8245ms [progress: 8393/29397 (28.55%) - 409.92 spectra/s]
RS_MCF-1.mzML 8245ms [progress: 10310/29397 (35.07%) - 380.06 spectra/s]
RS_MCF-1.mzML 8245ms [progress: 12389/29397 (42.14%) - 410.63 spectra/s]
RS_MCF-1.mzML 8245ms [progress: 14484/29397 (49.27%) - 417.25 spectra/s]
RS_MCF-1.mzML 8245ms [progress: 16735/29397 (56.93%) - 444.33 spectra/s]
RS_MCF-1.mzML 8245ms [progress: 19094/29397 (64.95%) - 467.50 spectra/s]
RS_MCF-1.mzML 8245ms [progress: 21529/29397 (73.24%) - 478.77 spectra/s]
RS_MCF-1.mzML 8245ms [progress: 24137/29397 (82.11%) - 520.35 spectra/s]
RS_MCF-1.mzML 8245ms [progress: 27113/29397 (92.23%) - 585.02 spectra/s]
RS_MCF-1.mzML 8245ms [progress: 29397/29397 (100.00%) - 553.16 spectra/s]
- completed 64933ms
Process finished, exit value: 0
Executing command:
$> philosopher_windows_amd64.exe workspace --init
Process started
INFO[16:10:13] Creating workspace
INFO[16:10:13] Done
Process finished, exit value: 0
Executing command:
$> philosopher_windows_amd64.exe peptideprophet --nonparam --expectscore --decoy rev --decoyprobs --masswidth 1000.0 --clevel 2 --database C:\Users\rserwa\Desktop\working directory fragger\human_fragger.fasta C:\Users\rserwa\Desktop\working directory fragger\outcome\RS_MCF-1.pepXML
Process started
Failed to open input file 'C:\Users\rserwa\Desktop\working directory fragger\outcome/RS_MCF-1.mzXML'.
WARNING: cannot open data file C:\Users\rserwa\Desktop\working directory fragger\outcome/RS_MCF-1.mzXML in msms_run_summary tag... trying .mzML ...
SUCCESS: CORRECTED data file C:\Users\rserwa\Desktop\working directory fragger\outcome/RS_MCF-1.mzML in msms_run_summary tag...
file 1: C:\Users\rserwa\Desktop\working directory fragger\outcome\RS_MCF-1.pepXML
processed altogether 23312 results
INFO: Results written to file: C:\Users\rserwa\Desktop\working directory fragger\outcome\interact-RS_MCF-1.pep.xml
- C:\Users\rserwa\Desktop\working directory fragger\outcome\interact-RS_MCF-1.pep.xml
- Building Commentz-Walter keyword tree...
- Searching the tree...
- Linking duplicate entries...
- Printing results...
Using Decoy Label "rev".
Decoy Probabilities will be reported.
Using non-parametric distributions
(X! Tandem) (using Tandem's expectation score for modeling)
init with X! Tandem trypsin
PeptideProphet (TPP v5.0.1 Post-Typhoon dev, Build 201705191533-7588 (Windows_NT-x86_64)) AKeller@ISB
read in 0 1+, 11799 2+, 9109 3+, 2010 4+, 324 5+, 48 6+, and 17 7+ spectra.
Found 0 Decoys, and 23307 Non-Decoys
WARNING: No decoys with label rev were found in this dataset. reverting to fully unsupervised method.
negmean = 0.0533258
MS Instrument info: Manufacturer: UNKNOWN, Model: UNKNOWN, Ionization: UNKNOWN, Analyzer: UNKNOWN, Detector: UNKNOWN
INFO: Processing standard MixtureModel ...
Initialising statistical models ...
INFO[16:10:19] Done
Process finished, exit value: 0
Executing command:
$> philosopher_windows_amd64.exe workspace --clean
Process started
INFO[16:10:20] Removing workspace
WARN[16:10:20] cannot remove the meta data: remove .meta\meta.bin: The process cannot access the file because it is being used by another process.
INFO[16:10:20] Done
Process finished, exit value: 0
Executing command:
$> philosopher_windows_amd64.exe workspace --init
Process started
INFO[16:10:20] Creating workspace
WARN[16:10:20] existing workspace detected, will not overwrite
INFO[16:10:20] Done
Process finished, exit value: 0
Executing command:
$> philosopher_windows_amd64.exe proteinprophet --output interact --maxppmdiff 20 interact-RS_MCF-1.pep.xml
Process started
ProteinProphet (C++) by Insilicos LLC and LabKey Software, after the original Perl by A. Keller (TPP v5.0.1 Post-Typhoon dev, Build 201705191533-7588 (Windows_NT-x86_64))
(no FPKM) (using degen pep info)
Reading in C:/Users/rserwa/Desktop/working directory fragger/outcome/interact-RS_MCF-1.pep.xml...
did not find any PeptideProphet results in input data! Did you forget to run PeptideProphet?
...read in 0 1+, 0 2+, 0 3+, 0 4+, 0 5+, 0 6+, 0 7+ spectra with min prob 0.05
WARNING: no data - output file will be empty
INFO[16:10:22] Done
Process finished, exit value: 0
Executing command:
$> philosopher_windows_amd64.exe workspace --clean
Process started
INFO[16:10:22] Removing workspace
WARN[16:10:22] cannot remove the meta data: remove .meta\meta.bin: The process cannot access the file because it is being used by another process.
INFO[16:10:22] Done
Process finished, exit value: 0
Executing command:
$> philosopher_windows_amd64.exe workspace --init
Process started
INFO[16:10:22] Creating workspace
WARN[16:10:22] existing workspace detected, will not overwrite
INFO[16:10:22] Done
Process finished, exit value: 0
Executing command:
$> philosopher_windows_amd64.exe database --annotate C:\Users\rserwa\Desktop\working directory fragger\human_fragger.fasta
Process started
INFO[16:10:23] Processing database
INFO[16:10:26] Done
Process finished, exit value: 0
Executing command:
$> philosopher_windows_amd64.exe filter --sequential --mapmods --pepxml C:\Users\rserwa\Desktop\working directory fragger\outcome --protxml C:\Users\rserwa\Desktop\working directory fragger\outcome\interact.prot.xml
Process started
INFO[16:10:26] Processing peptide identification files
INFO[16:10:28] 1+ Charge profile decoy=0 target=0
INFO[16:10:28] 2+ Charge profile decoy=0 target=11804
INFO[16:10:28] 3+ Charge profile decoy=0 target=9109
INFO[16:10:28] 4+ Charge profile decoy=0 target=2010
INFO[16:10:28] 5+ Charge profile decoy=0 target=324
INFO[16:10:28] 6+ Charge profile decoy=0 target=48
INFO[16:10:28] Database search results ions=18462 peptides=16274 psms=23312
INFO[16:10:28] Converged to 0.00 % FDR with 23312 PSMs decoy=0 threshold=0 total=23312
INFO[16:10:28] Converged to 0.00 % FDR with 16274 Peptides decoy=0 threshold=0 total=16274
INFO[16:10:28] Converged to 0.00 % FDR with 18462 Ions decoy=0 threshold=0 total=18462
FATA[16:10:28] No Protein groups detected, check your file and try again
Process finished, exit value: 1
Executing command:
$> philosopher_windows_amd64.exe report
Process started
INFO[16:10:28] Creating PSM report
INFO[16:10:28] Creating peptide Ion report
INFO[16:10:29] Creating peptide report
INFO[16:10:29] Done
Process finished, exit value: 0
Executing command:
$> philosopher_windows_amd64.exe workspace --clean
Process started
INFO[16:10:29] Removing workspace
WARN[16:10:29] cannot remove the meta data: remove .meta\meta.bin: The process cannot access the file because it is being used by another process.
INFO[16:10:29] Done
Process finished, exit value: 0
=========================
===
=== Done
===
=========================
Thanks in advance, Remi
@remigs
Hi Remi,
Try doing rev_sp|ABC123..., i.e. a lowercase prefix. There are many tools involved here, and the older TPP tools wrapped in Philosopher expect the reverse identifier to be a prefix to the whole string, I guess.
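As a rough illustration of this header convention, the sketch below appends a reversed decoy for every entry and puts `rev_` at the very start of the decoy header. It is a hand-rolled example, not the tool bundled with Philosopher; the function names and the example entry are hypothetical, and plain sequence reversal is just one way to build decoys.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs; header excludes the leading '>'."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def with_reversed_decoys(handle: TextIO, prefix: str = "rev_") -> str:
    """Return FASTA text with each target followed by its reversed decoy.
    The prefix goes in front of the whole header, e.g. >rev_sp|..."""
    out = []
    for header, seq in read_fasta(handle):
        out.append(f">{header}\n{seq}")
        out.append(f">{prefix}{header}\n{seq[::-1]}")
    return "\n".join(out) + "\n"


# Hypothetical entry, used only to show the header layout.
targets = io.StringIO(">sp|P00001|EXAMPLE_HUMAN Hypothetical protein\nMKTAYIAK\n")
print(with_reversed_decoys(targets), end="")
```

The decoy header in the output reads `>rev_sp|P00001|EXAMPLE_HUMAN ...`, as opposed to the `>sp|REV_...` layout that was not recognized.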
Thanks Dmitry, it worked!
Greetings MSFragger developers, Thanks to your suggestions about fasta file formatting I have now been able to test the software a bit. Below are my comments and some questions. Just to explain my interest in MSFragger: I am a chemoproteomics person, and for my research it is essential to be able to find modifications not defined a priori. From your recent Nat Comms paper I understood that this is doable with MSFragger, and I am very enthusiastic about applying this tool routinely in many of my projects. By the way, thanks for developing it!
So far I have set up a couple of open searches in MSFragger GUI using the preconfigured settings (as below)
I searched 3 different mzML files but with limited success. I am impressed by the processing speed of the software but cannot find modified peptides which I know are there. I understand that this may be partially due to the sub-optimal parameters I use, and therefore would like to ask the experts' opinion on which parameters I should be changing with respect to the 3 examples listed below. Since the program runs so fast, I guess running multiple searches with changed parameters is a viable option; is this something you would recommend? For the examples used in Nat Comms, have all open searches been run with the same set of parameters?
Each of the 3 test files I had previously searched using other software packages (with exactly the same forward+decoy fasta) and found plenty of modified peptides by simply setting variable modifications corresponding to the Dmasses which I introduced to peptides chemically or biochemically.
My samples were:
Using the preconfigured settings I was able to see 11 instances of this Dmass, whereas other software returned over 60 well-assigned spectra. I was able to easily find the PSMs in the output tsv tables, great! In terms of total number of PSMs, MSFragger reported 1.6k and the other software twice as many at FDR 0.01.
I changed Precursor True Tolerance from 20ppm to 5ppm (I fix mass on an internal calibrant during acquisition and rarely observe PSMs with errors >5ppm), but that did not help to bring up the number of Dmass -4.99 modifications found.
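For a sense of scale, a ppm tolerance translates into an absolute mass window as sketched below. The 1500 Da peptide mass is a hypothetical value chosen for illustration, and `ppm_to_da` is not part of MSFragger.

```python
def ppm_to_da(mass: float, ppm: float) -> float:
    """Convert a relative tolerance in ppm to an absolute tolerance in Da at the given mass."""
    return mass * ppm / 1e6


# Compare the two settings mentioned above for a hypothetical 1500 Da peptide.
for ppm in (20.0, 5.0):
    print(f"{ppm:g} ppm at 1500 Da = +/- {ppm_to_da(1500.0, ppm):.4f} Da")
```

Both windows are far narrower than a 4.99 Da mass difference, so under this simple picture tightening from 20 ppm to 5 ppm would not by itself be expected to change whether such a shift is seen.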
I then thought I would see if I could boost the number of Dmass -4.99 PSMs by setting this Dmass as a variable modification. This is not at all what I would like to do, as I mainly care about finding and validating unpredictable modifications, but just as an exercise for the software I decided to do it. I tried to run the search with this modification added (as below) but got the message below and the GUI hung.
Peptide index written in 2293ms
Selected fragment tolerance 0.02 Da and maximum fragment slice size of 10779.80MB
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: -248
at java.util.concurrent.FutureTask.report(Unknown Source)
at java.util.concurrent.FutureTask.get(Unknown Source)
at e.b(Unknown Source)
at MSFragger.main(Unknown Source)
... 5 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: -248
at f.call(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
427004992 fragments to be searched in 1 slices (3.18GB total)
Operating on slice 1 of 1:”
Tried three times, all failed. How can this be fixed?
Using the preconfigured settings I was able to see 1 instance of this Dmass and only 12 PSMs in total. I then set a variable modification of Dmass +229.16 as shown below and was happy not to receive any error messages (in contrast to what happened for the Met->Aha Dmass in example 1).
This search returned ca. 6k PSMs, about 50% of the number returned by other software at FDR 0.01. What surprised me, though, was that the variable modification does not appear as a modification in the histogram output; in other words, the top peak is shifted to 0 (as shown below).
Is that always the case? What would this picture look like if 45% of all PSMs were TMT modified, 49% unmodified, and 6% modified with both TMT and Met(ox)?
Using the preconfigured settings I was able to see 2 instances of this Dmass, whereas other software returned over 100 well-assigned spectra when it was set as a variable modification. In terms of the total number of PSMs, MSFragger reported ca. 6k and the other software ca. 9k at FDR 0.01. I am yet to overlay protein and peptide IDs, but looking briefly at the protein IDs I can see that MSFragger did a good job finding proteins that should be there based on the sample specification.
The fact that MSFragger was able to find less than 2% of the Dmass +463.2907 PSMs is alarming. Could it be that the software does not operate correctly near the borders of the search window? The default search limits are Dmasses of +/-500 Da and my modification is +463.2907, and if combined with cysteine carbamidomethylation or Met(ox) the total Dmass would fall beyond the preset window.
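The window arithmetic in the paragraph above can be checked with a few lines. This is a minimal sketch: it uses the standard monoisotopic shifts for carbamidomethylation and oxidation, and the +/-500 Da limits quoted in the post. Whether a given modification actually contributes to the reported Dmass depends on how it is configured in the search (fixed, variable, or unspecified), which this sketch does not model.

```python
# Monoisotopic mass shifts in Da (standard values).
CARBAMIDOMETHYL = 57.021464   # cysteine carbamidomethylation
OXIDATION = 15.994915         # methionine oxidation
ADDUCT = 463.2907             # Dmass quoted in the post


def within_window(delta_mass, lower=-500.0, upper=500.0):
    """Return True if delta_mass lies inside the open-search window."""
    return lower <= delta_mass <= upper


combos = {
    "adduct": ADDUCT,
    "adduct + carbamidomethyl": ADDUCT + CARBAMIDOMETHYL,
    "adduct + oxidation": ADDUCT + OXIDATION,
}
for name, mass in combos.items():
    print(f"{name}: {mass:.4f} Da, inside window: {within_window(mass)}")
```

With these values, the adduct alone and the adduct plus oxidation (about 479.29 Da) stay inside the +/-500 Da window, while the adduct plus carbamidomethylation (about 520.31 Da) falls outside it.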
I therefore decided to increase the Precursor Mass Parameter to 1000 ABS
and got a very mild increase in Dmass +463.2907 PSMs: I could see 3 instead of 2 instances.
Total number of PSMs did not change.
Furthermore, I was not able to inspect data for PSMs associated with Dmass >500, as neither the histogram output nor the tsv tables contain information about Dmass >500.
Have I set up the search properly? How can I see information about Dmass >500?
I then added Dmass +463.2907 to variable modifications just to see if the software improves on detection of these peptides and this resulted in an interesting observation.
There was only one Dmass +463.2907 PSM reported BUT over 40 Dmass -463.2907 PSMs (as shown below).
Fair enough, I thought, some bug that may be fixable? In any case, I do not mind looking on the other side of the x-axis, but the problem was bigger than I initially realised. Even though I could see the modifications I expected on the output histograms, I could not find the corresponding IDs in the output tables. I double-checked all of them and found no indication of my Dmasses. It seems that the only PSMs listed are unmodified ones and those for which Dmasses could be matched to their descriptions. How can I access information about the PSMs with unnamed Dmasses? Furthermore, this example illustrates that, using the pre-set open search settings, the Dmass +463.2907 PSMs cannot be efficiently identified by MSFragger without setting it as a variable modification. A couple of my projects deal with identifying protein adducts of similar type and mass. Please advise on settings to improve detection of these PTMs.
Lastly (a long shot question, and I will gladly accept a simple no to that one), I wanted to ask whether a diagnostic fragment ion feature would be easy to implement in MSFragger, and whether it is possibly already available for testing. To explain the application, I would say that if a small molecule that modifies proteins (either enzymatically or chemically) has a metabolically stable region of its structure, and if that structural element happens to produce characteristic HCD fragmentation product(s), and at the same time the molecule has heavily metabolizable structural regions, the characteristic HCD ion(s) can be used to aid identifying protein adducts of a spectrum of metabolites derived from that small molecule. Please let me know if you would like to know more about it.
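To make the diagnostic-ion idea concrete, here is a rough sketch of the kind of pre-filter I have in mind. It is not an MSFragger feature: the function names, the m/z values and the 20 ppm tolerance are all hypothetical, and the spectra are assumed to be available as plain lists of fragment m/z values.

```python
def has_diagnostic_ion(peaks_mz, diagnostic_mz, tol_ppm=20.0):
    """True if any peak lies within tol_ppm of the diagnostic m/z."""
    tol = diagnostic_mz * tol_ppm / 1e6
    return any(abs(mz - diagnostic_mz) <= tol for mz in peaks_mz)


def flag_spectra(spectra, diagnostic_ions, tol_ppm=20.0):
    """Return ids of spectra that contain at least one diagnostic ion.

    spectra: dict mapping a spectrum id to a list of fragment m/z values.
    """
    return [
        spec_id
        for spec_id, peaks in spectra.items()
        if any(has_diagnostic_ion(peaks, ion, tol_ppm) for ion in diagnostic_ions)
    ]


# Hypothetical spectra and a hypothetical diagnostic ion at m/z 250.1000.
spectra = {
    "scan_1": [100.0, 250.1001, 500.2],
    "scan_2": [100.0, 300.0],
}
print(flag_spectra(spectra, [250.1000]))
```

Spectra flagged this way could then be prioritised when looking through open-search results for adducts of the different metabolites.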
Sorry for the length of this post but I tried to explain as plainly as possible the issues I encounter and hope to hear from you and be able to use the full potential of MSFragger soon. Best regards, Remi
Sorry, the pictures did not copy at all and some elements of the text did not copy properly. I am working from an iPad and will try to add them from a PC when I get access to one.
The attached PDF file contains all the screenshots I refer to in my post: observations about fragger.pdf
Hi, could anyone please share instructions for running the MSFragger GUI 3.0? I have installed it but could not get the MSFragger.jar file, and thus could not run the GUI to test.
Thanks, Trayambak.
@remigs Sorry, it seems that your issue was neglected amidst all the other posts. Once something is specified as a variable modification, then it no longer appears as a delta mass and will show up as a 0 delta mass PSM with the variable mod. Delta mass only refers to masses that are not accounted for using variable modifications.
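A small numeric sketch of the point above, assuming the delta mass is simply the observed precursor mass minus the peptide mass and any assigned variable modifications. The function name and the 1000.50 Da peptide mass are made up for illustration; only the +229.16 shift comes from the thread.

```python
def delta_mass(observed_mass, peptide_mass, variable_mod_masses=()):
    """Mass left unexplained after the peptide and its assigned variable mods."""
    return observed_mass - (peptide_mass + sum(variable_mod_masses))


TMT = 229.16        # shift quoted in the thread
peptide = 1000.50   # hypothetical unmodified peptide mass (Da)
observed = peptide + TMT

# Open search only: the whole shift is reported as delta mass.
open_only = delta_mass(observed, peptide)
# Same PSM with the shift declared as a variable modification.
with_var_mod = delta_mass(observed, peptide, [TMT])

print(f"open search only: {open_only:.2f} Da")
print(f"with variable mod: {abs(with_var_mod):.2f} Da")
```

Under this assumption the same PSM moves from the +229.16 bin to the 0 bin of the histogram once the modification is declared as variable.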
Open searching is still an emerging method, so there's no catch-all set of parameters that we can recommend. We've had others who had great success in using MSFragger for chemoproteomics, so if you're interested in debugging with us to get MSFragger working for your chemistry, just send me an e-mail at andykong at umich.edu with your data so we can take a look together.
@trayambakbasak There should be a link in the GUI for downloading MSFragger.jar from our Tech Transfer site.
Hi,
We are currently trying to run a file and it is failing and displaying the following errors:
Executing command:
$> C:\Users\DPCF_SuperComp_v2\Desktop\MSFragger\philosopher-source_windows_amd64.exe peptideprophet --nonparam --expectscore --decoy rev --decoyprobs --masswidth 1000.0 --clevel -2 --database C:\Users\DPCF_SuperComp_v2\Desktop\MSFragger\rSP_Hu_Mix1_090716.fasta C:\Users\DPCF_SuperComp_v2\Desktop\MSFragger\output\ID22689_04_E749_4234_031116.pepXML
Process started
Failed to open input file 'C:\Users\DPCF_SuperComp_v2\Desktop\MSFragger\output/ID22689_04_E749_4234_031116.mzXML'.
WARNING: cannot open data file C:\Users\DPCF_SuperComp_v2\Desktop\MSFragger\output/ID22689_04_E749_4234_031116.mzXML in msms_run_summary tag... trying .mzML ...
Failed to open input file 'C:\Users\DPCF_SuperComp_v2\Desktop\MSFragger\output/ID22689_04_E749_4234_031116.mzML'.
WARNING: CANNOT correct data file C:\Users\DPCF_SuperComp_v2\Desktop\MSFragger\output/ID22689_04_E749_4234_031116.mzML in msms_run_summary tag...
Failed to open input file 'C:\Users\DPCF_SuperComp_v2\Desktop\MSFragger\output/ID22689_04_E749_4234_031116.mzXML'.
WARNING: cannot open data file C:\Users\DPCF_SuperComp_v2\Desktop\MSFragger\output/ID22689_04_E749_4234_031116.mzXML in msms_run_summary tag... trying .mzML ...
Failed to open input file 'C:\Users\DPCF_SuperComp_v2\Desktop\MSFragger\output/ID22689_04_E749_4234_031116.mzML'.
WARNING: CANNOT correct data file C:\Users\DPCF_SuperComp_v2\Desktop\MSFragger\output/ID22689_04_E749_4234_031116.mzML in msms_run_summary tag...
Is there something wrong with the input we are giving MSFragger?
Thanks,
Austin
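Going only by the log above, the tool appears to look for an .mzXML and then an .mzML file with the same base name in the folder that holds the pepXML (here the output folder). A quick way to check whether such a file is present is sketched below. The helper name is hypothetical, and the lookup rule is an assumption read off the warnings, not confirmed PeptideProphet behaviour.

```python
from pathlib import Path


def find_spectrum_file(pepxml_path, extensions=(".mzXML", ".mzML")):
    """Look next to the pepXML for a spectrum file with the same base name.

    Returns the first matching Path, or None if no such file exists.
    """
    pepxml = Path(pepxml_path)
    for ext in extensions:
        candidate = pepxml.with_suffix(ext)
        if candidate.exists():
            return candidate
    return None


if __name__ == "__main__":
    # Path taken from the log above; adjust to the local setup.
    pepxml = r"C:\Users\DPCF_SuperComp_v2\Desktop\MSFragger\output\ID22689_04_E749_4234_031116.pepXML"
    found = find_spectrum_file(pepxml)
    print(found if found else "No mzXML/mzML found next to the pepXML")
```

If nothing is found, placing a copy of the mzML in the same folder as the pepXML before running peptideprophet may be worth trying.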