bzhanglab / PepQuery

PepQuery: a targeted peptide search engine
http://pepquery.org
GNU General Public License v3.0

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler #64

Open anu80125 opened 1 week ago

anu80125 commented 1 week ago

Hi PepQuery users,

I encountered the following error while running PepQuery from the command line with the following command:

java -jar pepquery-2.0.2/pepquery-2.0.2.jar -b Academia_Sinica_LUAD100_Phosphoproteome_PDC000220 -db gencode:human -hc -o pepquery_out/ -i protein.fasta -t protein

2024-06-25 16:43:50 [INFO ] main.java.pg.SpectraInput[readSpectraFromMSMSlibrary:371] - Used CPUs: 32
2024-06-25 16:43:55 [WARN ] main.java.msio.MsLibrarySearchWorker[loadSpectra:201] - File doesn't exist:pepquery_out//Academia_Sinica_LUAD100_Phosphoproteome_PDC000220/index/50945.mgf
2024-06-25 16:44:45 [INFO ] main.java.pg.SpectraInput[readSpectraFromMSMSlibrary:393] - Matched spectra: 4495
2024-06-25 16:44:45 [INFO ] main.java.pg.SpectraInput[readSpectraFromMSMSlibrary:406] - Delete downloaded MS/MS index files.
2024-06-25 16:44:45 [INFO ] main.java.pg.PeptideSearchMT[search:456] - Time elapsed: 1.68 min
2024-06-25 16:44:45 [INFO ] main.java.pg.PeptideSearchMT[search:476] - Step 2: candidate spectra retrieval and PSM scoring done: time elapsed = 1.68 min
2024-06-25 16:44:45 [INFO ] main.java.pg.PeptideSearchMT[search:480] - Step 3-4: competitive filtering based on reference sequences and statistical evaluation ...
2024-06-25 16:44:45 [INFO ] main.java.pg.PeptideSearchMT[search:515] - Don't find indexed database:pepquery_out//database/gencode.v46.pc_translations_format.fasta.sqldb
2024-06-25 16:44:45 [INFO ] main.java.pg.PeptideSearchMT[search:522] - Use database:pepquery_out//database/gencode.v46.pc_translations_format.fasta
2024-06-25 16:44:45 [INFO ] main.java.pg.DatabaseInput[getEnzymeByIndex:263] - Use enzyme:Trypsin
2024-06-25 16:44:49 [INFO ] main.java.pg.DatabaseInput[protein_digest:474] - Protein sequences:111868, total unique peptide sequences:1758082
2024-06-25 16:44:49 [INFO ] main.java.pg.DatabaseInput[protein_digest:475] - Time used for protein digestion:4 s.
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "pool-6-thread-14"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "pool-6-thread-15"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "pool-6-thread-10"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "pool-6-thread-13"
Exception in thread "pool-6-thread-10"
Exception in thread "pool-6-thread-13"
Exception in thread "pool-6-thread-19"
Exception in thread "pool-6-thread-5"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "pool-6-thread-11"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "pool-6-thread-16"
Exception in thread "pool-6-thread-6" java.lang.OutOfMemoryError: Java heap space
    at java.base/java.util.Arrays.copyOf(Arrays.java:3689)
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "pool-6-thread-5"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "pool-6-thread-19"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "pool-6-thread-8"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "pool-6-thread-29"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "pool-6-thread-24"

The protein is made of 251 amino acids. Any help to figure this out would be highly appreciated.

Thanks in advance Anu

wenbostar commented 1 week ago

The error message indicates an out-of-memory issue. Could you please try the following command line?

java -Xmx20G -jar pepquery-2.0.2/pepquery-2.0.2.jar -b Academia_Sinica_LUAD100_Phosphoproteome_PDC000220 -db gencode:human -hc -o pepquery_out/ -i protein.fasta -t protein
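
For context, `-Xmx` is a JVM option that sets the maximum heap size, and it has to appear before `-jar`; everything after the jar path is passed to the program as its own arguments. A minimal Python sketch of assembling such a command is shown here. The helper name `build_pepquery_command` and its 20 GB default are illustrative assumptions, not part of PepQuery.

```python
import shlex

def build_pepquery_command(jar_path, pepquery_args, heap_gb=20):
    """Return the java command as an argument list, with -Xmx placed before -jar.

    Hypothetical helper for illustration only; it is not part of PepQuery.
    """
    if heap_gb <= 0:
        raise ValueError("heap_gb must be positive")
    # JVM options such as -Xmx go before -jar; program arguments follow the jar path.
    return ["java", f"-Xmx{heap_gb}G", "-jar", jar_path, *pepquery_args]

cmd = build_pepquery_command(
    "pepquery-2.0.2/pepquery-2.0.2.jar",
    ["-b", "Academia_Sinica_LUAD100_Phosphoproteome_PDC000220",
     "-db", "gencode:human", "-hc",
     "-o", "pepquery_out/", "-i", "protein.fasta", "-t", "protein"],
)
# Print the command; to execute it, pass cmd to subprocess.run(cmd, check=True).
print(shlex.join(cmd))
```

The heap value should stay below the physical memory available on the machine; 20 GB is simply the value suggested in this thread.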

anu80125 commented 1 week ago

Thank you for your prompt reply. It worked well.

However, when I try to read the input sequence from a file, I get the following error.

java -Xmx20G -jar pepquery-2.0.2/pepquery-2.0.2.jar -b CPTAC_LUAD_Discovery_Study_Proteome_PDC000153,CPTAC_Prospective_Breast_BI_Proteome_PDC000120,CPTAC_TCGA_Ovarian_Proteome_PDC000113_PDC000114,CPTAC_LUAD_Discovery_Study_Acetylome_PDC000224,CPTAC_LSCC_Discovery_Study_Phosphoproteome_PDC000232,CPTAC_TCGA_Breast_Cancer_Proteome_PDC000173,CPTAC_Pediatric_Brain_Cancer_Pilot_Study_Proteome_PDC000180,CPTAC_LSCC_Discovery_Study_Proteome_PDC000234,CPTAC_Prospective_Ovarian_JHU_Proteome_PDC000110,CPTAC_CCRCC_Discovery_Study_Phosphoproteme_PDC000128 -db gencode:human -hc -o pepquery2_out/ -i protein_AA.txt -t protein -fast > logfile.log

net.sf.kerner.utils.exception.ExceptionFileFormat: failed to get header from MFEREYTGLPGVCWEGSIIRQVRSTQMETSVSVSLWMPPSQRVFTF
    at net.sf.jfasta.impl.FASTAElementHeaderReader.read(FASTAElementHeaderReader.java:76)
    at net.sf.jfasta.impl.FASTAElementIterator.doRead(FASTAElementIterator.java:104)
    at net.sf.jfasta.impl.FASTAElementIterator.doRead(FASTAElementIterator.java:50)
    at net.sf.kerner.utils.io.buffered.AbstractIOIterator.peek(AbstractIOIterator.java:103)
    at net.sf.kerner.utils.io.buffered.AbstractIOIterator.hasNext(AbstractIOIterator.java:109)
    at main.java.pg.InputProcessor.digest(InputProcessor.java:329)
    at main.java.pg.InputProcessor.run(InputProcessor.java:155)
    at main.java.pg.PeptideSearchMT.search(PeptideSearchMT.java:388)
    at main.java.pg.PeptideSearchMT.search_multiple_datasets(PeptideSearchMT.java:738)
    at main.java.pg.PeptideSearchMT.main(PeptideSearchMT.java:183)
    at main.java.pg.IMain.main(IMain.java:30)

My 'protein_AA.txt' input file has the format shown below:

MFEREYTGLPGVCWEGSIIRQVRSTQMETSVSVSLWMPPSQRVFTF

Any help would be highly appreciated.

Thanks in advance Anu

wenbostar commented 1 week ago

Could you change "-i protein_AA.txt" to "-i MFEREYTGLPGVCWEGSIIRQVRSTQMETSVSVSLWMPPSQRVFTF"?

anu80125 commented 4 days ago

Thanks for the reply.

-i MFEREYTGLPGVCWEGSIIRQVRSTQMETSVSVSLWMPPSQRVFTF works fine. However, I was trying to create an input file, as described in the documentation, for searching multiple peptides at a time.

Thanking you once again. Anu

wenbostar commented 3 days ago

If your input is a list of peptide sequences, you can put them in a text file in which each row is one peptide sequence. If your input is a list of proteins, you can put them in a FASTA-format file.
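
The earlier "failed to get header" error is consistent with a FASTA parser being handed a file that has no ">" header line before the sequence. A minimal Python sketch of writing both kinds of input file is shown here. The function names, file names, the header text `query_protein_1`, and the two example peptides (fragments of the protein quoted above) are hypothetical, and the sketch assumes standard FASTA conventions rather than anything specific to PepQuery.

```python
import tempfile
from pathlib import Path

PROTEIN = "MFEREYTGLPGVCWEGSIIRQVRSTQMETSVSVSLWMPPSQRVFTF"

def write_fasta(path, records, width=60):
    """Write (header, sequence) pairs as FASTA: a '>' header line, then wrapped sequence lines."""
    lines = []
    for header, seq in records:
        seq = "".join(seq.split()).upper()  # drop whitespace and normalize case
        if not seq:
            raise ValueError(f"empty sequence for {header!r}")
        lines.append(f">{header}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    Path(path).write_text("\n".join(lines) + "\n")

def write_peptide_list(path, peptides):
    """Write one peptide sequence per row, with no header lines."""
    cleaned = [p.strip().upper() for p in peptides if p.strip()]
    Path(path).write_text("\n".join(cleaned) + "\n")

# Demonstrate in a temporary directory so no files are left behind.
with tempfile.TemporaryDirectory() as tmp:
    fasta_path = Path(tmp) / "protein_AA.fasta"
    write_fasta(fasta_path, [("query_protein_1", PROTEIN)])
    fasta_text = fasta_path.read_text()

    peptide_path = Path(tmp) / "peptides.txt"
    write_peptide_list(peptide_path, ["EYTGLPGVCWEGSIIR", "STQMETSVSVSLWMPPSQR"])
    peptide_text = peptide_path.read_text()

print(fasta_text)
print(peptide_text)
```

Following the advice above, the FASTA file would be the one to pass with "-t protein", while the one-peptide-per-row text file is for a list of peptides.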