AlgoLab / PIntron

A novel pipeline for gene-structure prediction based on spliced alignment of transcript sequences (ESTs and mRNAs) against a genomic sequence
http://www.algolab.eu/PIntron

Could not compute the factorizations #43

Open ramadatta opened 8 years ago

ramadatta commented 8 years ago

Hi,

I am getting the following error. I have 16 GB of RAM and around 200K transcripts to align against 24 chromosomes. Could you please tell me whether this is a memory issue? Thanks.

I am using PIntron version 1.3.3.

[INFO ] 2016-03-14 14:09:51,054 - PIntron_%PINTRONVERSION%
[INFO ] 2016-03-14 14:09:51,054 - Copyright (C) 2010,2011 Paola Bonizzoni, Gianluca Della Vedova, Yuri Pirola, Raffaella Rizzi.
[INFO ] 2016-03-14 14:09:51,055 - This program is distributed under the terms of the GNU Affero General Public License (AGPL), either version 3 of the License, or (at your option) any later version.
[INFO ] 2016-03-14 14:09:51,055 - This program comes with ABSOLUTELY NO WARRANTY. See the GNU Affero General Public License for more details.
[INFO ] 2016-03-14 14:09:51,055 - This is free software, and you are welcome to redistribute it under the conditions specified by the license.
[INFO ] 2016-03-14 14:09:51,055 - Running: /home/prakkisr/bin/pintron -b /home/prakkisr/Documents/PIntron/bin -g /store1/home/prakkisr/NEW/Tilapia/6MTil_CollapsedIso_withMTil_Genome_Q1527_v2scaffolds_63ContaminantRemoved/SW_FW/Q1527_v2_scaffolds_63ContaminantRemoved.fasta -s sample_trans.fasta -n MozTilapia
[INFO ] 2016-03-14 14:09:51,055 - STEP 1: Checking executables and preparing input data...
[INFO ] 2016-03-14 14:10:04,712 - STEP 2: Pre-aligning transcript data...
Could not compute the factorizations 139
[ERROR ] 2016-03-14 14:10:14,181 - * Fatal error caught during the execution of the pipeline! * Could not compute the factorizations
Traceback (most recent call last):
  File "/home/prakkisr/bin/pintron", line 1017, in <module>
    pintron_pipeline(options)
  File "/home/prakkisr/bin/pintron", line 884, in pintron_pipeline
    output_file='raw-multifasta-out.txt')
  File "/home/prakkisr/bin/pintron", line 773, in exec_system_command
    raise PIntronError(error_comment)
PIntronError: Could not compute the factorizations

gdv commented 8 years ago

Thanks for your report. It is likely to be due to memory usage. Can you attach the logfiles produced by PIntron, so that we can further investigate the issue?

ramadatta commented 8 years ago

Hi, this is the content of my log file.

INFO:root:20160314-141507994:PIntron_%PINTRONVERSION%
INFO:root:20160314-141507994:Copyright (C) 2010,2011 Paola Bonizzoni, Gianluca Della Vedova, Yuri Pirola, Raffaella Rizzi.
INFO:root:20160314-141507994:This program is distributed under the terms of the GNU Affero General Public License (AGPL), either version 3 of the License, or (at your option) any later version.
INFO:root:20160314-141507995:This program comes with ABSOLUTELY NO WARRANTY. See the GNU Affero General Public License for more details.
INFO:root:20160314-141507995:This is free software, and you are welcome to redistribute it under the conditions specified by the license.
INFO:root:20160314-141507995:Running: /home/prakkisr/bin/pintron
INFO:root:20160314-141507995:STEP 1: Checking executables and preparing input data...
DEBUG:root:20160314-141507995:Using main program 'pintron' in dir '/home/prakkisr/Documents/PIntron/bin/pintron' (md5: 350bdf1530059e14185df9bcdc776808)
DEBUG:root:20160314-141507999:Using program 'est-fact' in dir '/home/prakkisr/Documents/PIntron/bin/est-fact' (md5: ba03671565959d0c6a02bcd3d5aba8e1)
DEBUG:root:20160314-1415082:Using program 'min-factorization' in dir '/home/prakkisr/Documents/PIntron/bin/min-factorization' (md5: 109b7da5108938b99eac04026ce5f377)
DEBUG:root:20160314-1415084:Using program 'intron-agreement' in dir '/home/prakkisr/Documents/PIntron/bin/intron-agreement' (md5: 53e20167291da8749a55f3d09e631dfd)
DEBUG:root:20160314-1415085:Using program 'compact-compositions' in dir '/home/prakkisr/Documents/PIntron/bin/compact-compositions' (md5: dc66e8fca806ecd8cf5ef0697dae04e9)
DEBUG:root:20160314-1415085:Using program 'maximal-transcripts' in dir '/home/prakkisr/Documents/PIntron/bin/maximal-transcripts' (md5: a2944fa0836d80c72b6adf7f28f8b02a)
DEBUG:root:20160314-1415086:Using program 'cds-annotation' in dir '/home/prakkisr/Documents/PIntron/bin/cds-annotation' (md5: b1aa10667d40b5cf55679d1525fdeeed)
DEBUG:root:20160314-1415086:Files "genomic.txt" and "genomic.txt" refer to the same file: skip copy.
DEBUG:root:20160314-1415086:Files "ests.txt" and "ests.txt" refer to the same file: skip copy.
INFO:root:20160314-1415086:STEP 2: Pre-aligning transcript data...
DEBUG:root:20160314-1415087:time.struct_time(tm_year=2016, tm_mon=3, tm_mday=14, tm_hour=14, tm_min=15, tm_sec=8, tm_wday=0, tm_yday=74, tm_isdst=0)
DEBUG:root:20160314-1415087:ulimit -t 3600 && ulimit -v 3072000 && /home/prakkisr/Documents/PIntron/bin/est-fact
ERROR:root:20160314-141516327:* Fatal error caught during the execution of the pipeline! * Could not compute the factorizations
Traceback (most recent call last):
  File "/home/prakkisr/bin/pintron", line 1017, in <module>
    pintron_pipeline(options)
  File "/home/prakkisr/bin/pintron", line 884, in pintron_pipeline
    output_file='raw-multifasta-out.txt')
  File "/home/prakkisr/bin/pintron", line 773, in exec_system_command
    raise PIntronError(error_comment)
PIntronError: Could not compute the factorizations
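
Two numbers in the output above can be interpreted with a short sketch. It assumes that the "139" printed on the console is the exit status of est-fact (the thread does not confirm this) and that `ulimit -v` counts in units of 1024 bytes, as it does in bash. The function names are illustrative and are not part of PIntron.

```python
import signal


def describe_exit_status(status):
    """Interpret a shell exit status using the 128+N convention for signals."""
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"terminated by signal {signum} ({name})"
    return f"exited with code {status}"


def kib_to_gib(kib):
    """Convert a ulimit value given in KiB to GiB."""
    return kib / (1024 ** 2)


# Assumption: 139 is the exit status reported for est-fact.
print(describe_exit_status(139))
# The limit set by the pipeline before running est-fact.
print(f"ulimit -v 3072000 is about {kib_to_gib(3072000):.2f} GiB")
```

Under these assumptions, est-fact ran with a virtual-memory cap of just under 3 GiB regardless of the 16 GB installed, and the status would indicate a crash on a signal rather than a normal error exit. That is consistent with, but does not prove, a memory problem.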

rrizzi commented 8 years ago

Hi, to understand your problem we also need the input files you used with PIntron (that is, the genomic file and the transcript file). Please note that the input genomic sequence must be restricted to the gene locus, and the input transcripts must originate from the gene you want to process.

ramadatta commented 8 years ago

I was using a genome with 24 chromosomes (around 550 Mb in size) and a transcript file of around 700 Mb. So I need to predict the gene locus first and then align, is that right?

gdv commented 8 years ago

Yes, it would be much better to split the genome and the transcript set into parts. We are implementing an automatic procedure for that step, but it's not usable yet.

Gianluca Della Vedova http://gianluca.dellavedova.org
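
As a rough illustration of the first part of that splitting, the sketch below writes each sequence of a multi-FASTA genome file to its own file. It is not PIntron's automatic procedure, which the thread says is not usable yet, and `split_fasta` is a hypothetical helper. It does not restrict the genome to gene loci or assign transcripts to genes; both still require mapping the transcripts to the genome first.

```python
import os
import re


def split_fasta(path, out_dir):
    """Write each record of a multi-FASTA file to its own file; return the paths."""
    os.makedirs(out_dir, exist_ok=True)
    written = []
    out = None
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                if out is not None:
                    out.close()
                # Use the first word of the header as a file-system-safe name.
                # Records sharing that first word would overwrite each other.
                words = line[1:].split()
                name = re.sub(r"[^A-Za-z0-9_.-]", "_", words[0]) if words else "unnamed"
                target = os.path.join(out_dir, name + ".fasta")
                out = open(target, "w")
                written.append(target)
            # Lines before the first header are skipped.
            if out is not None:
                out.write(line)
    if out is not None:
        out.close()
    return written
```

A call such as `split_fasta("genome.fasta", "per_sequence")` (both names are placeholders) would produce one file per chromosome or scaffold, which can then be cut down to the locus of interest.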