ttessie2 closed 3 years ago
This looks similar to an issue we've seen before, which was related to non-unique FASTA accession numbers. Which version of SearchGUI are you using? I would recommend updating to the latest SearchGUI version and re-indexing the FASTA file. You should then get a warning if the file contains non-unique accession numbers. If you can share the FASTA file I can also take a look.
It would be great if you can also share the PeptideShaker error log. You can find it via the Welcome dialog > Settings & Help > Help > Bug Report.
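For a quick check outside SearchGUI, the non-unique accession test can be approximated with a few lines of Python. This is only a sketch and not how SearchGUI does it: the function names are made up for illustration, and the header parsing assumes either UniProt-style headers (`>sp|P12345|NAME_HUMAN ...`) or headers where the first whitespace-separated token is the accession.

```python
from collections import Counter


def accession_of(header):
    """Return the accession from a FASTA header (text after the leading '>').

    Assumption: UniProt-style 'db|ACCESSION|NAME' headers use the middle
    field; for anything else the first whitespace-separated token is used.
    """
    tokens = header.split()
    if not tokens:
        return ""
    parts = tokens[0].split("|")
    if len(parts) >= 3:
        return parts[1]
    return tokens[0]


def duplicate_accessions(lines):
    """Map every accession that occurs more than once to its count."""
    counts = Counter(
        accession_of(line[1:].strip())
        for line in lines
        if line.startswith(">")
    )
    return {acc: n for acc, n in counts.items() if n > 1}
```

Since a file handle iterates over lines, `duplicate_accessions(open("my_database.fasta"))` would list the repeated accessions (the file name is a placeholder). An empty result only means that this simplified parser found no duplicates.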
I won't be at my desktop until later this evening so I can't send you anything at the moment. However, when I was running MS-GF+ my SearchGUI and PeptideShaker were the most up-to-date versions. This evening I will repeat everything from the beginning and take more detailed notes. But off the top of my head, which database file is the correct one to use when starting a PeptideShaker project? I only tried with the FASTA file generated by SearchGUI that is concatenated with the decoys.
It depends more on when the FASTA file was indexed, as this test was only added (or rather re-added) in SearchGUI 4.0.25. If your FASTA file was indexed before that it will not be re-indexed. To get a proper test I would recommend that you rename your FASTA file (or just make a copy) and then re-add the decoys in the latest SearchGUI version.
"But off the top of my head, which database file is the correct one to use when starting a peptide shaker project? I only tried with the fasta file generated by searchgui that is concatenated with the decoys."
Yes, that is the correct approach.
I copied the FASTA file into a new directory and repeated everything without any luck. Below is the report from PeptideShaker:
Thu Apr 15 22:49:25 EDT 2021: PeptideShaker version 2.0.19.
Memory given to the Java virtual machine: 17179869184.
Total amount of memory in the Java virtual machine: 134217728.
Free memory: 87379832.
Java version: 15.0.2.
1714 script command tokens
(C) 2009 Jmol Development
Jmol Version: 12.0.43 2011-05-03 14:21
java.vendor: Oracle Corporation
java.version: 15.0.2
os.name: Windows 10
memory: 49.1/134.2
processors available: 24
useCommandThread: false

PeptideShaker processing failed. See the PeptideShaker log for details.
java.lang.ArrayIndexOutOfBoundsException: Index 12534 out of bounds for length 12534
    at com.compomics.util.experiment.identification.protein_inference.fm_index.FMIndex.recursiveMassFilling(FMIndex.java:1579)
    at com.compomics.util.experiment.identification.protein_inference.fm_index.FMIndex.recursiveMassFilling(FMIndex.java:1605)
    at com.compomics.util.experiment.identification.protein_inference.fm_index.FMIndex.recursiveMassFilling(FMIndex.java:1605)
    at com.compomics.util.experiment.identification.protein_inference.fm_index.FMIndex.recursiveMassFilling(FMIndex.java:1605)
    at com.compomics.util.experiment.identification.protein_inference.fm_index.FMIndex.recursiveMassFilling(FMIndex.java:1605)
    at com.compomics.util.experiment.identification.protein_inference.fm_index.FMIndex.recursiveMassFilling(FMIndex.java:1605)
    at com.compomics.util.experiment.identification.protein_inference.fm_index.FMIndex.recursiveMassFilling(FMIndex.java:1605)
    at com.compomics.util.experiment.identification.protein_inference.fm_index.FMIndex.init(FMIndex.java:1218)
    at com.compomics.util.experiment.identification.protein_inference.fm_index.FMIndex.<init>(FMIndex.java:581)
    at eu.isas.peptideshaker.fileimport.FileImporter.importSequences(FileImporter.java:957)
    at eu.isas.peptideshaker.fileimport.FileImporter.importFiles(FileImporter.java:219)
    at eu.isas.peptideshaker.PeptideShaker.importFiles(PeptideShaker.java:219)
    at eu.isas.peptideshaker.gui.NewDialog$20.run(NewDialog.java:736)
    at java.base/java.lang.Thread.run(Thread.java:832)
Free memory: 248209744

Thu Apr 15 23:13:30 EDT 2021: PeptideShaker version 2.0.19.
Memory given to the Java virtual machine: 17179869184.
Total amount of memory in the Java virtual machine: 134217728.
Free memory: 121514768.
Java version: 15.0.2.

Thu Apr 15 23:18:42 EDT 2021: PeptideShaker version 2.0.19.
Memory given to the Java virtual machine: 17179869184.
Total amount of memory in the Java virtual machine: 134217728.
Free memory: 87284176.
Java version: 15.0.2.
1714 script command tokens
(C) 2009 Jmol Development
Jmol Version: 12.0.43 2011-05-03 14:21
java.vendor: Oracle Corporation
java.version: 15.0.2
os.name: Windows 10
memory: 50.5/134.2
processors available: 24
useCommandThread: false
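A long, recursive trace like the one in the log above is easier to read once the repeated frames are collapsed. The following is a minimal sketch, assuming the log is available as one string in the usual Java `at package.Class.method(File.java:123)` format. The function name `summarize_trace` and both regular expressions are illustrative and not part of PeptideShaker.

```python
import re
from itertools import groupby

# One stack frame: 'at <qualified method>(<File>.java:<line>)'.
FRAME = re.compile(r"\bat\s+([\w.$<>/]+)\([\w$]+\.java:\d+\)")
# Exception class and message, up to the first frame or the end of the text.
EXCEPTION = re.compile(r"([\w.]+(?:Exception|Error)): (.*?)(?=\s+at\s|$)")


def summarize_trace(log_text):
    """Return ((exception_class, message), [(method, consecutive_count), ...])."""
    match = EXCEPTION.search(log_text)
    exception = (match.group(1), match.group(2).strip()) if match else None
    methods = [m.group(1) for m in FRAME.finditer(log_text)]
    # Consecutive frames of the same method (recursion) become one entry.
    frames = [(name, len(list(run))) for name, run in groupby(methods)]
    return exception, frames
```

Applied to the trace in this thread, the summary would show the `ArrayIndexOutOfBoundsException` raised inside `FMIndex.recursiveMassFilling`, reached via `FileImporter.importSequences`, which suggests the failure happens while the protein sequences are being indexed.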
Could you try renaming the FASTA file instead?
I tried renaming the fasta file as well as downloading a new fasta file of the human proteome but I'm still getting the same error. I then tried using IDPicker with the mzid file and didn't have any issues opening it. I'm not exactly sure what to do next.
Would it be possible for you to share the data with me so that I can try to reproduce it on my side?
I can upload those when I get back to my desktop at home. In the meantime I have downloaded SearchGUI onto a lab computer to see if I get the same error. Interestingly, I'm getting a different issue that has to do with importing the protein database. I get this error when I'm entering the input information into SearchGUI. When I upload the FASTA file, get the message "the database does not seem to contain decoy sequences. Add decoys?", and say yes, I get a "FASTA import error" telling me that the FASTA file cannot be found.
Never mind, please ignore that last message. That was due to admin privileges on this computer. I moved the files elsewhere and there is no longer an issue loading the database file.
Update: I ran searchGUI/MSGF here (on a different desktop) and tried opening it in peptideshaker and had the same problem. Here is a link for the database file and raw spectra. https://drive.google.com/drive/folders/1m9WhsM-hbcb4N26poRwLj5cVX3DgAblg?usp=sharing
Thanks for sharing the data! I will process the data and see if I can reproduce the issue.
BTW, I see from your search settings that you have set the fragment mass tolerance to 2.5 Da. This seems very high. Are you sure that this is correct?
That should be 0.5 not 2.5. Thanks for the catch!
Can you try again with 0.5 and see if that solves the problem?
Oh wow, yea that solved it! What a stupid mistake, thanks for pointing that out!
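A typo like this can be caught before a search is started with a simple sanity check on the parameter. The sketch below is hypothetical: `check_fragment_tolerance` is not a SearchGUI function, and the limits of 1.0 Da and 50 ppm are arbitrary illustration values that should be adjusted to the instrument and method.

```python
def check_fragment_tolerance(tolerance, unit="Da", max_da=1.0, max_ppm=50.0):
    """Return a warning string if the tolerance looks suspicious, else None.

    The default limits are illustrative assumptions, not recommendations
    from the SearchGUI or PeptideShaker developers.
    """
    if unit not in ("Da", "ppm"):
        raise ValueError(f"Unknown tolerance unit: {unit!r}")
    limit = max_da if unit == "Da" else max_ppm
    if tolerance <= 0:
        return f"Fragment tolerance must be positive, got {tolerance} {unit}"
    if tolerance > limit:
        return (
            f"Fragment tolerance {tolerance} {unit} is above {limit} {unit}; "
            "please double-check the search settings"
        )
    return None
```

With these defaults, `check_fragment_tolerance(2.5)` returns a warning, while `check_fragment_tolerance(0.5)` returns `None`.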
I'm just testing different search parameters with MS-GF+ in SearchGUI. However, I get the above error while trying to create a project in PeptideShaker. I did not select the "run PeptideShaker" option in SearchGUI after running MS-GF+. I've chosen the correct .mzid files and database. Any help would be great. The error says: An error occurred while loading the identification files index 12534 out of bounds for length 12534