MSGFPlus / msgfplus

MS-GF+ (aka MSGF+ or MSGFPlus) performs peptide identification by scoring MS/MS spectra against peptides derived from a protein sequence database.

Error in generating decoy files #84

Closed bjgprt closed 4 years ago

bjgprt commented 4 years ago

Hi,

I get an error when generating the decoy FASTA files with both the Feb 5 version and the July 3 version. Could you help me take a look at it, please? Thanks!

java -version
openjdk version "1.8.0_232"
OpenJDK Runtime Environment (build 1.8.0_232-b09)
OpenJDK 64-Bit Server VM (build 25.232-b09, mixed mode)

java -Xmx128g -cp ${msgfplus} edu.ucsd.msjava.msdbsearch.BuildSA -d ${fasta} -tda 2 -decoy XXX (or -decoy REV)
Creating uniprot_aa.revCat.fasta.
Exception in thread "main" java.lang.NullPointerException
    at edu.ucsd.msjava.msdbsearch.ReverseDB.reverseDB(ReverseDB.java:91)
    at edu.ucsd.msjava.msdbsearch.BuildSA.buildSAFiles(BuildSA.java:137)
    at edu.ucsd.msjava.msdbsearch.BuildSA.buildSA(BuildSA.java:96)
    at edu.ucsd.msjava.msdbsearch.BuildSA.main(BuildSA.java:56)

I get the same error with the following command:

java -Xmx60g -jar ${msgfplus} -s ${mgf} -d ${fasta} -inst 1 -t 15ppm -ti -1,2 -mod ${mods_doc} -ntt 1 -tda 1 -maxCharge 8 -minCharge 1 -addFeatures 1 -n 1

FarmGeek4Life commented 4 years ago

How big is your fasta file? This is a known issue for fasta files that are over 2GB. Fixing it is not a simple task.
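
The size question above can be answered with a quick check before indexing. This is a minimal sketch, not part of MS-GF+: the function name `exceeds_int_limit` is hypothetical, and the exact cutoff is an assumption (the thread only says "over 2GB"; here it is read as the largest signed 32-bit value, 2^31 - 1 bytes).

```python
import os

# Assumption: "over 2GB" is interpreted as more bytes than the largest
# signed 32-bit integer. The thread does not state the exact cutoff.
JAVA_INT_MAX = 2**31 - 1


def exceeds_int_limit(fasta_path: str) -> bool:
    """Return True if the file at fasta_path is larger than JAVA_INT_MAX bytes."""
    return os.path.getsize(fasta_path) > JAVA_INT_MAX
```

If this returns False for a file that still fails to index, the size limit is probably not the cause.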

bjgprt commented 4 years ago

> How big is your fasta file? This is a known issue for fasta files that are over 2GB. Fixing it is not a simple task.

I have tried several FASTA files, including one with 1000 lines and one that is 358M, but both end with the same error.

bjgprt commented 4 years ago

Could you provide a sample *.revCat.fasta file so that I can generate one myself, please? I just want to know the default format of the target-decoy file. Thanks!
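
For illustration only, here is a rough sketch of what a reversed, concatenated target-decoy file might look like. It rests on assumptions this thread does not confirm: that "revCat" means the target entries followed by decoys with reversed sequences, and that the decoy prefix (`XXX`, taken from the `-decoy XXX` option above) is joined to the original header with an underscore. The function `reverse_concat` is hypothetical and is not the MS-GF+ implementation.

```python
def reverse_concat(fasta_text: str, decoy_prefix: str = "XXX") -> str:
    """Return the target entries followed by reversed-sequence decoy entries.

    Assumed layout (not confirmed by the thread): decoy headers are
    '>' + decoy_prefix + '_' + original header text.
    """
    entries = []  # list of (header text without '>', full sequence)
    header, chunks = None, []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                entries.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        entries.append((header, "".join(chunks)))

    out = []
    for name, seq in entries:  # targets first
        out.append(f">{name}\n{seq}")
    for name, seq in entries:  # then decoys with reversed sequences
        out.append(f">{decoy_prefix}_{name}\n{seq[::-1]}")
    return "\n".join(out) + "\n"
```

A sketch like this may help when reading a revCat file, but it is not a substitute for letting the tool build its own index files.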

alchemistmatt commented 4 years ago

Go to https://github.com/MSGFPlus/msgfplus/releases

Download MSGFPlus_v20190703.zip

Extract the files

Go to https://adoptopenjdk.net/

Download OpenJDK 8 compatible with your OS

Run the installer

Index a FASTA file, e.g.

java.exe -Xmx3500M -cp MSGFPlus.jar edu.ucsd.msjava.msdbsearch.BuildSA -d Tryp_Pig_Bov.fasta -tda 2

Example result files, including revCat files, can be found at:

I highly advise that you let MS-GF+ create these index files; do not create them yourself. Many, many people have successfully used MS-GF+ to index FASTA files and search spectra files against the indexed FASTA. You need to figure out why the index creation process is failing for you.
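
One way to start narrowing down why indexing fails is to scan the FASTA file for structural oddities before running BuildSA. The sketch below is a hypothetical helper, not part of MS-GF+, and none of the conditions it reports are confirmed causes of the NullPointerException in this thread. The allowed residue alphabet (uppercase letters plus `*`) is also an assumption.

```python
import re

# Assumption: sequence lines contain only uppercase letters, optionally '*'.
VALID_SEQ = re.compile(r"^[A-Z*]+$")


def check_fasta(lines):
    """Return a list of (line_number, problem) tuples for an iterable of lines."""
    problems = []
    header_line = 0  # line number of the current header; 0 means none seen yet
    residues = 0     # residues counted for the current entry
    for n, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            problems.append((n, "blank line"))
        elif line.startswith(">"):
            if header_line and residues == 0:
                problems.append((header_line, "header without sequence"))
            if len(line) == 1:
                problems.append((n, "empty header"))
            header_line, residues = n, 0
        elif not header_line:
            problems.append((n, "sequence before first header"))
        else:
            if not VALID_SEQ.match(line):
                problems.append((n, "unexpected characters in sequence"))
            residues += len(line)
    if header_line and residues == 0:
        problems.append((header_line, "header without sequence"))
    if not header_line:
        problems.append((0, "no header lines found"))
    return problems
```

It can be run on a file with `with open(path) as fh: print(check_fasta(fh))`. An empty list only means that none of these particular oddities were found.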

If you want to send us your FASTA file, send your e-mail address to proteomics@pnnl.gov and we will get back to you with a link that you can use to upload it.