compomics / searchgui

Highly adaptable common interface for proteomics search and de novo engines
http://compomics.github.io/projects/searchgui.html

FastaCLI Error Header #286

Status: Closed (KunathBJ closed this issue 3 years ago)

KunathBJ commented 3 years ago

Hello,

I have an issue generating the parameter file for some of my searches: the database is missing. Digging into it, it seems that FastaCLI does not produce the database because of a header error. That should normally be an easy one, since I usually keep my headers very simple to avoid this kind of issue, but here I really don't know what it can be:

$ java -cp SearchGUI-3.3.20/SearchGUI-3.3.20.jar eu.isas.searchgui.cmd.FastaCLI -in DB_LAO/D01/D01.fasta -decoy
Reindexing: D01.fasta.
10% 20% 30% 40% 50% 60% 70% 80%2021-03-08 15:01:24,950 ERROR Header -  * Unable to process FASTA header line:
        OEMCKOFG_00004

Mon Mar 08 15:01:24 CET 2021 An error occurred while running the command line. Please see the SearchGUI log file.

I didn't produce the database file myself, but however closely I look I can't find any weird symbol around that header. I even deleted the previous and next lines and retyped them manually (in case there was an invisible character), but nothing changed.

It is not the first line of the file, and I get this issue for all the files I received. So I'm pretty sure something in the files is causing it, but I don't know what to look for anymore.
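
Retyping lines by hand can miss characters that editors do not display. A minimal sketch of an automated check follows, assuming the file is UTF-8 text; the function name `find_suspicious_chars` is hypothetical and is not part of SearchGUI. It reports every character outside printable ASCII, including carriage returns left over from Windows line endings.

```python
import unicodedata


def find_suspicious_chars(lines):
    """Return (line_number, column, description) for each character
    outside the printable ASCII range (space to tilde)."""
    hits = []
    for line_no, line in enumerate(lines, start=1):
        # Strip only the newline so that a stray carriage return is still reported.
        for col, ch in enumerate(line.rstrip("\n"), start=1):
            if not (32 <= ord(ch) <= 126):
                # Control characters have no Unicode name, so fall back to the code point.
                description = unicodedata.name(ch, f"U+{ord(ch):04X}")
                hits.append((line_no, col, description))
    return hits


# Usage (path taken from the command above; newline="" keeps "\r" visible):
# with open("DB_LAO/D01/D01.fasta", newline="", encoding="utf-8") as handle:
#     for hit in find_suspicious_chars(handle):
#         print(hit)
```

An empty result means the cause lies in the visible text of the header rather than in a hidden character.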

Do you guys have any suggestions??

Thanks a lot and take care, Ben

KunathBJ commented 3 years ago

The SearchGUI log just mentions that the parameter file is missing:


Mon Mar 08 14:43:26 CET 2021: SearchGUI version 3.3.20.
Memory given to the Java virtual machine: 30542397440.
Total amount of memory in the Java virtual machine: 2024275968.
Free memory: 2003136632.
Java version: 1.8.0_162.
java.io.FileNotFoundException: /scratch/users/bkunath/IMP_MetaP/OUT_LAO/D01.out/D01_params.par (No such file or directory)
        at java.io.FileInputStream.open0(Native Method)
        at java.io.FileInputStream.open(FileInputStream.java:195)
        at java.io.FileInputStream.<init>(FileInputStream.java:138)
        at com.compomics.util.io.json.JsonMarshaller.getJsonStringFromFile(JsonMarshaller.java:196)
        at com.compomics.util.io.json.JsonMarshaller.fromJson(JsonMarshaller.java:121)
        at com.compomics.util.preferences.IdentificationParameters.getIdentificationParameters(IdentificationParameters.java:401)
        at com.compomics.cli.identification_parameters.IdentificationParametersInputBean.isValidStartup(IdentificationParametersInputBean.java:170)
        at eu.isas.searchgui.cmd.SearchCLIInputBean.isValidStartup(SearchCLIInputBean.java:992)
        at eu.isas.searchgui.cmd.SearchCLI.<init>(SearchCLI.java:89)
        at eu.isas.searchgui.cmd.SearchCLI.main(SearchCLI.java:378)
java.io.FileNotFoundException: /scratch/users/bkunath/IMP_MetaP/OUT_LAO/D01.out/D01_params.par (No such file or directory)
        at java.io.FileInputStream.open0(Native Method)
        at java.io.FileInputStream.open(FileInputStream.java:195)
        at java.io.FileInputStream.<init>(FileInputStream.java:138)
        at com.compomics.util.io.SerializationUtils.readObject(SerializationUtils.java:52)
        at com.compomics.util.preferences.IdentificationParameters.getIdentificationParameters(IdentificationParameters.java:415)
        at com.compomics.cli.identification_parameters.IdentificationParametersInputBean.isValidStartup(IdentificationParametersInputBean.java:170)
        at eu.isas.searchgui.cmd.SearchCLIInputBean.isValidStartup(SearchCLIInputBean.java:992)
        at eu.isas.searchgui.cmd.SearchCLI.<init>(SearchCLI.java:89)
        at eu.isas.searchgui.cmd.SearchCLI.main(SearchCLI.java:378)
java.lang.IllegalArgumentException: Parameters file /scratch/users/bkunath/IMP_MetaP/OUT_LAO/D01.out/D01_params.par not recognized.
        at com.compomics.util.preferences.IdentificationParameters.getIdentificationParameters(IdentificationParameters.java:419)
        at com.compomics.cli.identification_parameters.IdentificationParametersInputBean.isValidStartup(IdentificationParametersInputBean.java:170)
        at eu.isas.searchgui.cmd.SearchCLIInputBean.isValidStartup(SearchCLIInputBean.java:992)
        at eu.isas.searchgui.cmd.SearchCLI.<init>(SearchCLI.java:89)
        at eu.isas.searchgui.cmd.SearchCLI.main(SearchCLI.java:378)
SearchGUI.log (END)

hbarsnes commented 3 years ago

Hi Ben,

Can you share the complete header causing the issue?

The log file errors are all related to the SearchCLI command line and are thus not relevant when running the FastaCLI command line.

Best regards, Harald

KunathBJ commented 3 years ago

Hello Harald,

This is the entire header. I copied the actual sequence as well as the previous and next one:

>ELBKLAIN_16665
MNLPELSVKRHVLAYMLSGVLVLFGLISFQRIGVDRYPNIDFPMISITTALPGGDPEIVNSSITKVIESAVNSVPGIEHVESTSASGVSLISVRFAMEKDLGAAFNEVQAKVNQVLNRLPREAKPPIVAKVEIGATPIIWLALQGDRTPQQLNQYARNVIKKRLETVNGVGEVRLGGDRERTLRVSLDPERMAGQGITIQDVTRAFDAEHVRLPGGFLVGGKHEDLIKLDLEFHSAPELEKLIVAYRGGAPIR
LADIATVEDGLADNRQLARYMGKPAVGIGVVKVSGSNAVAVANEVERRLEREIIPQLPAGMTLSVASNDASLIREIVAALEEHLLTGTLFTALVVWLFLLNLRSTLIVAMAIPVSLLGAVAVMYFAGFTFNTMTLLGLLLLIGVVVDDAIVVLENIYRHRVHLDSDPVSAALNGTREVEFAVMAASLTLVAIFAPVIFMGGIIGRFFQAFAVVVSAGVLVSLFVALTLTPMLCARHLRVTPSTGRVSSFLERG
FHAMDTVYRVLLDRALRFRWTVVAITVAVVVGSGWFFAHLGGGFMPVQDEGRFLVNFKTPLGTGIEYADARLRDIEAVLARHPEIVGTFSTIGTDQTGQVNKGFADITMAPWNKRTITQQELIDVLRVELATIPGVEAFPGPRSTVGGQRGEPLQFVLAGPDLNQVGQLANALNKELASDPSLGRVDLELQLDLPQLETALSRERVTSLGLSARDVAQAVNILAGGLDIARFNDRTGDAERYDIRLKAADGSL
QHPEDLARIYLRAGNGEMVRLDNLIKMERRLGPAVISGFDLQYAAKFYSAPKVSMSDAVGRVQRIAAPLLTPGYTVQMIGQAEEFGKTMSYMLFAFVTAIVLVYMVLASQFNSFLQPLVVMVAQPLAVVGGGAARFGPGEREGGERLSLPGGSPRFPHSQGGDQRRIEAGGEEDADRHIRDQMGA
>OEMCKOFG_00004
MLKYHHDQQLFPTGFFHSPFKVAPLLKHPHLQTLFASVHRRTPPKIEREQQRLSLPDGDFLILDYKTPSPIAHHAPLVLVIHGLSGSSDSHYVIGLQNALAAQGWPSVAMNCRGATEPNTSIRAYHAGASDDVIAVFNHLCKNQNRDIVIVGYSLGGSMTLKALSELGQHPRLLAGVSVSAPLELAPCAYRLDKGFSMVYRQHLLDKLQQLWQDKYQHLLSLGQTEQAQQIADCLQHAPFKSFWDFDDRLMAPLHGFTNVDDYYQRCRPNQFLKSIQVPTLIIHALDDPFMSVDVVPAQHDLSPLIHFELAKQGGHV
>OEMCKOFG_00005
MHSTVRSGHSGARPRRLLAPPLSTTIDTSGEVFQKNRNDMLEQLAEIDALLDEAAAGGGEKRTERLRSRGKLPIRERIANAIDPGTPFLEISALAAYDSDYTIGGGMVVGIGVIGGVECVIMGNDPTVLAGALTPYAGKKWMRALQIARDNRLPYVSFVESAGADLRMGGAGQKAPYQTDHFAETGRFFYELIELSKLGIPTVCVVFGSSTAGGAYQPGLSDYTIVVKDQSKIFLAGPPLVKMATGEETDDEPLGGAEV

hbarsnes commented 3 years ago

Here's the error I get when trying to parse those three sequences in the graphical user interface:

[Screenshot: FASTA error]

It seems you are very unlucky: your header is assumed to be a specific type of header because it starts with "OE". I will try to make the parsing smarter to avoid such issues.
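
To see how many entries in a database are affected, the headers can be scanned for that prefix. The sketch below is only an illustration based on the comment above: the exact rule SearchGUI applies is not given in this thread, so the plain "starts with OE" test is an assumption, and `headers_with_prefix` is a hypothetical helper name.

```python
def headers_with_prefix(lines, prefix="OE"):
    """Return (line_number, header) pairs for FASTA headers that start with `prefix`.

    Assumption: a header is any line beginning with '>' and the prefix is
    compared against the text directly after it.
    """
    matches = []
    for line_no, line in enumerate(lines, start=1):
        if line.startswith(">" + prefix):
            matches.append((line_no, line[1:].strip()))
    return matches


# Usage (path taken from the FastaCLI command earlier in the thread):
# with open("DB_LAO/D01/D01.fasta", encoding="utf-8") as handle:
#     for line_no, header in headers_with_prefix(handle):
#         print(line_no, header)
```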

In the meantime, you can either make minor adjustments to the affected headers or replace them with our recommended non-standard FASTA format: https://github.com/compomics/searchgui/wiki/DatabaseHelp#non-standard-fasta
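
One possible "minor adjustment" is to prepend a short tag to every header that begins with "OE". The sketch below assumes this is enough to avoid the misdetection; the thread does not confirm that, so the output should be re-checked with FastaCLI. The tag `P_`, the output filename, and the function name `retag_headers` are all hypothetical placeholders.

```python
def retag_headers(lines, prefix="OE", tag="P_"):
    """Yield the input lines, inserting `tag` after '>' on headers that start with `prefix`.

    Sequence lines and all other headers pass through unchanged.
    """
    for line in lines:
        if line.startswith(">" + prefix):
            yield ">" + tag + line[1:]
        else:
            yield line


# Usage (input path from the thread, output path is a placeholder):
# with open("DB_LAO/D01/D01.fasta", encoding="utf-8") as src, \
#         open("DB_LAO/D01/D01_retagged.fasta", "w", encoding="utf-8") as dst:
#     dst.writelines(retag_headers(src))
```

Writing to a new file keeps the original database intact, so the change is easy to undo after updating SearchGUI.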

KunathBJ commented 3 years ago

Aaah, very unlucky indeed. With all the databases I've generated with Prokka (which produces this kind of random header), I've never encountered such an issue ^^' Thanks for the help, I was running out of ideas. I'll just make sure the databases don't have headers starting this way anymore.

Thanks again and take care, Ben

hbarsnes commented 3 years ago

Actually, all you now have to do is update to the latest SearchGUI (and PeptideShaker) and the parsing of these particular headers should be ok. If this is not the case, please let me know and I'll reopen the issue.