MSGFPlus / msgfplus

MS-GF+ (aka MSGF+ or MSGFPlus) performs peptide identification by scoring MS/MS spectra against peptides derived from a protein sequence database.

MzIDToTsv error: Exception in thread "main" java.lang.IllegalStateException: Error initializing ID cache: No id attribute found for element null #128

Closed SophieSWChoi closed 3 years ago

SophieSWChoi commented 3 years ago

Hi, I'm new to proteomics and I've been trying to detect novel peptides with MSGF+. When I use a UniProt fasta file as the database and use MzIDToTsv (MSGFPlus_v20200805/MSGFPlus.jar edu.ucsd.msjava.ui.MzIDToTsv) to convert mzid to tsv, it works fine. But when I use my custom fasta file and try to convert the resulting mzid to tsv, the conversion fails with the error below. The stand-alone converter on Windows works fine, but I have more than one thousand mzid files to convert and the resulting tsv format is a little different from what I used to get, so I'm not very keen on using the stand-alone converter. I'd very much appreciate any help.

Example of my custom fasta file:

chr1:10032076:10041228:+_circRNA1&chr1:10032076:10041228:+_circRNA4&chr1:10032076:10041228:+_circRNA7 KVLRHHQEKLEASDCDHQQNSPTLERPGRKRKWTETQDSSQKKSLEPKTKDNKGGVTVFHLDQQLQVLTMENSEKTEVVLLACGSFNPITNMHLRLFELA

Error message:

MzIDToTsv v9108 (5 August 2020)
Java 11.0.5 (Oracle Corporation)
Linux (amd64, version 3.10.0-1062.9.1.el7.x86_64)
Converting FN05_N179T180_180min_10ug_C1_032714_filt1.mzid into FN05_N179T180_180min_10ug_C1_032714_filt1.tsv
Exception in thread "main" java.lang.IllegalStateException: Error initializing ID cache: No id attribute found for element null
    at uk.ac.ebi.jmzidml.xml.xxindex.MzIdentMLIndexerFactory$MzIdentMLIndexerImpl.initIdMapCache(MzIdentMLIndexerFactory.java:362)
    at uk.ac.ebi.jmzidml.xml.xxindex.MzIdentMLIndexerFactory$MzIdentMLIndexerImpl.initIdMaps(MzIdentMLIndexerFactory.java:344)
    at uk.ac.ebi.jmzidml.xml.xxindex.MzIdentMLIndexerFactory$MzIdentMLIndexerImpl.<init>(MzIdentMLIndexerFactory.java:110)
    at uk.ac.ebi.jmzidml.xml.xxindex.MzIdentMLIndexerFactory$MzIdentMLIndexerImpl.<init>(MzIdentMLIndexerFactory.java:66)
    at uk.ac.ebi.jmzidml.xml.xxindex.MzIdentMLIndexerFactory.buildIndex(MzIdentMLIndexerFactory.java:63)
    at uk.ac.ebi.jmzidml.xml.xxindex.MzIdentMLIndexerFactory.buildIndex(MzIdentMLIndexerFactory.java:51)
    at uk.ac.ebi.jmzidml.xml.io.MzIdentMLUnmarshaller.<init>(MzIdentMLUnmarshaller.java:68)
    at edu.ucsd.msjava.mzid.MzIDParser.<init>(MzIDParser.java:32)
    at edu.ucsd.msjava.ui.MzIDToTsv.convert(MzIDToTsv.java:125)
    at edu.ucsd.msjava.ui.MzIDToTsv.main(MzIDToTsv.java:84)

Java version:

java 11.0.5 2019-10-15 LTS
Java(TM) SE Runtime Environment 18.9 (build 11.0.5+10-LTS)
Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.5+10-LTS, mixed mode)

Command used: /share/apps/programs/java/jdk-11.0.5/bin/java -cp /home/sophie/build/MSGFPlus_v20200805/MSGFPlus.jar edu.ucsd.msjava.ui.MzIDToTsv -showDecoy 1 -i /home/sophie/NCC_N195T196_Proteome_KU_20150109/msgf/FN12_N195T196_180min_10ug_C2_042114_filt1.mzid
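
With more than a thousand mzid files, it may help to wrap the command above in a loop that keeps going when one file fails and reports which ones did. The following is only a minimal Python sketch: the `JAVA` and `MSGFPLUS_JAR` values, the directory layout, and the helper names `build_command` and `convert_all` are assumptions for illustration, not part of MS-GF+. It relies on Java exiting with a non-zero status when `main` throws an uncaught exception, as in the trace above.

```python
import subprocess
from pathlib import Path

# Assumed locations; adjust to the local install.
JAVA = "java"
MSGFPLUS_JAR = "MSGFPlus.jar"


def build_command(mzid_path, java=JAVA, jar=MSGFPLUS_JAR):
    """Return the MzIDToTsv command line for a single .mzid file."""
    return [
        java, "-cp", str(jar),
        "edu.ucsd.msjava.ui.MzIDToTsv",
        "-showDecoy", "1",
        "-i", str(mzid_path),
    ]


def convert_all(mzid_dir, runner=subprocess.run):
    """Convert every .mzid file in mzid_dir; return (file name, first stderr line) for failures."""
    failed = []
    for mzid in sorted(Path(mzid_dir).glob("*.mzid")):
        result = runner(build_command(mzid), capture_output=True, text=True)
        if result.returncode != 0:
            # Keep going so one bad file does not block the rest.
            lines = result.stderr.strip().splitlines()
            failed.append((mzid.name, lines[0] if lines else ""))
    return failed
```

Calling `convert_all("/path/to/msgf")` (a placeholder path) returns a list of the files that could not be converted, which makes it easier to see whether the failure affects every file searched against the custom database or only some of them.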

FarmGeek4Life commented 3 years ago

The stand-alone converter does have the command-line option -noExtended. When that option is provided, the output has exactly the same columns as the output from MzIDToTsv, and generally the same contents (there are some changes to limit excessive precision, so decimal values are generally limited to 5 decimal places).

Since the stand-alone converter already doesn't crash on your files, I would very much recommend using it if the above option provides acceptable output. It will also convert those files several times faster than MzIDToTsv (this is the reason we created it).

See also the Mzid-To-Tsv-Converter readme

SophieSWChoi commented 3 years ago

Thank you for your quick and kind response!

glormph commented 7 months ago

I ran into the same error message with MzIDToTsv, and (for future reference) it seems to crash when the database has very long custom fasta headers: it worked once I shortened them, and I couldn't find any special characters in the headers that would have tripped it up.
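
One way to try the header-shortening workaround is to rewrite the fasta file with short IDs and keep a lookup table back to the original headers. The following is a minimal Python sketch, not a confirmed fix: the thread does not state an actual length limit, so `max_len=50` is an arbitrary assumption, and the function name `shorten_fasta_headers` and the `seq` prefix are made up for illustration. It assumes header lines start with `>` as in standard fasta, and that no existing header already uses an ID like `seq1`.

```python
def shorten_fasta_headers(lines, max_len=50, prefix="seq"):
    """Replace long fasta headers with short IDs.

    Returns (new_lines, mapping), where mapping is short ID -> original header.
    max_len is an assumed threshold, not a documented MS-GF+ limit.
    """
    new_lines = []
    mapping = {}
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            header = line[1:]
            if len(header) > max_len:
                # Number the replaced headers in order of appearance.
                short_id = f"{prefix}{len(mapping) + 1}"
                mapping[short_id] = header
                line = ">" + short_id
        new_lines.append(line)
    return new_lines, mapping
```

After searching against the rewritten database, the protein column of the converted tsv would contain the short IDs, so the mapping needs to be saved (for example as a two-column tsv) to translate results back to the original circRNA names.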