HingeAssembler / HINGE

Software accompanying "HINGE: Long-Read Assembly Achieves Optimal Repeat Resolution"
http://genome.cshlp.org/content/27/5/747.full.pdf+html?sid=39918b0d-7a7d-4a12-b720-9238834902fd

HINGE stalled and producing a huge empty error/output file #110

Open agroppi opened 7 years ago

agroppi commented 7 years ago

Hi, I'm using HINGE to assemble a 260 Mb genome (PacBio RII reads, 56X coverage). Here is my script: HINGE_Stella_Assembly_04_2017.zip

HINGE has been running for 11 days, but for the last 4 days it seems to have stalled. The last log in the log directory of the output folder is empty. The last informative log: log_2017-04-22_15-24.txt

Last but not least, the output and error file is about 55 GB and still growing. When I look into it, the beginning is "normal", but since HINGE stalled the file has only been growing with empty lines or spaces!

Here is the content of the output folder:

-rw-r--r-- 1 ag users    0 Apr 22 15:50 Stella.draft.fasta
drwxr-xr-x 2 ag users 4.0K Apr 22 15:36 log
-rw-r--r-- 1 ag users    0 Apr 22 15:36 Stella.contained.txt
-rw-r--r-- 1 ag users    0 Apr 22 15:36 Stella.garbage.txt
-rw-r--r-- 1 ag users    0 Apr 22 15:36 Stella.draft.deadends.txt
-rw-r--r-- 1 ag users 5.9M Apr 22 15:36 edges.g_out.txt
-rw-r--r-- 1 ag users 4.1M Apr 22 15:36 Stella.edges.1
-rw-r--r-- 1 ag users 4.1M Apr 22 15:36 Stella.edges.2
-rw-r--r-- 1 ag users 6.0M Apr 22 15:36 Stella.edges.hinges
-rw-r--r-- 1 ag users 4.7M Apr 22 15:36 Stella.edges.hinges2
-rw-r--r-- 1 ag users 5.8M Apr 22 15:36 Stella.edges.greedy
-rw-r--r-- 1 ag users 4.8M Apr 22 15:36 Stella.edges.skipped
-rw-r--r-- 1 ag users 130M Apr 22 15:36 Stella.hgraph
-rw-r--r-- 1 ag users 212K Apr 22 15:36 Stella.debug
-rw-r--r-- 1 ag users 2.2M Apr 22 15:36 Stella.deadends.txt
-rw-r--r-- 1 ag users    0 Apr 22 15:36 hinge_debug.txt
-rw-r--r-- 1 ag users 927K Apr 22 15:36 Stella.hinge.list
-rw-r--r-- 1 ag users    0 Apr 22 15:32 overlap_debug.txt
-rw-r--r-- 1 ag users  13M Apr 22 15:32 Stella.killed.hinges
-rw-r--r-- 1 ag users  22M Apr 22 15:32 edges.bkw.backup.txt
-rw-r--r-- 1 ag users  25M Apr 22 15:32 edges.fwd.backup.txt
-rw-r--r-- 1 ag users 424K Apr 22 15:20 Stella.max
-rw-r--r-- 1 ag users 317M Apr 22 14:46 Stella.coverage.txt
-rw-r--r-- 1 ag users    0 Apr 22 13:59 Stella.filtered.fasta
-rw-r--r-- 1 ag users    0 Apr 22 13:59 Stella.homologous.txt
-rw-r--r-- 1 ag users 5.5M Apr 22 13:54 Stella.hinges.txt
-rw-r--r-- 1 ag users 6.2M Apr 22 13:54 Stella.repeat.txt
-rw-r--r-- 1 ag users    0 Apr 22 13:20 debug.txt
-rw-r--r-- 1 ag users 2.8M Apr 22 13:20 Stella.mas
-rw-r--r-- 1 ag users 2.3M Apr 22 13:20 Stella.cmas
-rw-r--r-- 1 ag users    0 Apr 22 12:34 Stella.self.flag
-rw-r--r-- 1 ag users    0 Apr 22 12:34 Stella.cov.flag
-rw-r--r-- 1 ag users  54G Apr 22 12:30 Stella.las
-rw-r--r-- 1 ag users 1.7K Apr 14 12:15 Stella.db
-rw-r--r-- 1 ag users  15G Apr 14 12:13 Stella_reads_3kb.pb.fasta
-rw-r--r-- 1 ag users 142M Apr 14 12:13 map.txt
-rw-r--r-- 1 ag users 3.8K Apr 14 12:10 HINGE_Stella_Assembly_04_2017.sh
-rw-r--r-- 1 ag users  15G Apr 14 11:54 Stella_reads_3kb.fasta

Thanks for your help
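To confirm the symptom described above (an output file that keeps growing with only blank lines), a quick check is to count the blank lines near the end of the file. This is only a minimal sketch: the helper name `blank_tail_count` and the file name `job.err` in the usage note are hypothetical and not part of HINGE.

```shell
# Hypothetical helper: count blank (empty or whitespace-only) lines
# among the last N lines of a file.
# Usage: blank_tail_count FILE [N]
blank_tail_count() {
    local file="$1" n="${2:-1000}"
    # grep -c prints the number of matching lines; it exits with status 1
    # when that number is 0, so "|| true" keeps the function from failing.
    tail -n "$n" "$file" | grep -c '^[[:space:]]*$' || true
}
```

For example, `blank_tail_count job.err 1000` prints how many of the last 1000 lines are blank. If nearly all of them are, the job is probably no longer writing anything useful.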

govinda-kamath commented 7 years ago

Hi,

Can you first split the .las using

LAsplit -v Stella.# 10 < Stella.las

And then run hinge with the --mlas option as

hinge filter --db Stella --las Stella --mlas -x Stella --config nominal.ini

agroppi commented 7 years ago

Hi, thanks for your answer. LAsplit -v Stella.# 10 < Stella.las worked well, but the next step (hinge filter) gives a segmentation fault:

/home/ag/HINGE/inst/bin/hinge filter --db Stella --las Stella.las --mlas -x Stella --config /home/ag/HINGE/utils/nominal.ini
[2017-04-27 11:00:33.854] [log] [info] Reads filtering
[2017-04-27 11:00:33.855] [log] [info] name of db: Stella, name of .las file Stella.las
[2017-04-27 11:00:33.855] [log] [info] name of fasta: , name of .paf file 
[2017-04-27 11:00:33.855] [log] [info] Parameters passed in 

[filter]
length_threshold = 1000;
quality_threshold = 0.23;
n_iter = 3; // filter iteration
aln_threshold = 1000;
min_cov = 5;
cut_off = 300;
theta = 300;
use_qv = true;

[running]
n_proc = 12;

[draft]
min_cov = 10;
trim = 200;
edge_safe = 100;
tspace = 900;
step = 50;

[consensus]
min_length = 4000;
trim_end = 200;
best_n = 1;
quality_threshold = 0.23;

[layout]
hinge_slack = 1000
min_connected_component_size = 8

[2017-04-27 11:00:34.019] [log] [info] Las files: Stella.las
[2017-04-27 11:00:34.019] [log] [info] Calling glob.
Stella.las.1.las
-------------------------
Number of files 0
Input string Stella.las
-------------------------
[2017-04-27 11:00:34.061] [log] [info] # Reads: 1447986
[2017-04-27 11:01:46.466] [log] [info] No debug restrictions.
/home/ag/HINGE/inst/bin/hinge: line 8: 93548 Segmentation fault      Reads_filter "$@"

[EDIT] The subsequent step (hinge maximal) seems to work (the file Stella.contained.txt is growing):

/home/ag/HINGE/inst/bin/hinge maximal --db Stella --las Stella.las -x Stella --config /home/ag/HINGE/utils/nominal.ini
[2017-04-27 11:01:47.462] [log] [info] Getting maximal reads
[2017-04-27 11:01:47.462] [log] [info] name of db: Stella, name of .las file Stella.las
[2017-04-27 11:01:47.462] [log] [info] name of fasta: , name of .paf file 
[2017-04-27 11:01:47.462] [log] [info] Parameters passed in 

[filter]
length_threshold = 1000;
quality_threshold = 0.23;
n_iter = 3; // filter iteration
aln_threshold = 1000;
min_cov = 5;
cut_off = 300;
theta = 300;
use_qv = true;

[running]
n_proc = 12;

[draft]
min_cov = 10;
trim = 200;
edge_safe = 100;
tspace = 900;
step = 50;

[consensus]
min_length = 4000;
trim_end = 200;
best_n = 1;
quality_threshold = 0.23;

[layout]
hinge_slack = 1000
min_connected_component_size = 8

[2017-04-27 11:01:47.617] [log] [info] Las files: Stella.las
[2017-04-27 11:01:47.617] [log] [info] # Reads: 1447986
[2017-04-27 11:03:04.415] [log] [info] No debug restrictions.
[2017-04-27 11:03:06.282] [log] [info] use_qv_mask set to true
[2017-04-27 11:03:06.282] [log] [info] use_qv_mask set to true
[2017-04-27 11:03:06.282] [log] [info] number processes set to 12
[2017-04-27 11:03:06.282] [log] [info] LENGTH_THRESHOLD = 1000
[2017-04-27 11:03:06.282] [log] [info] QUALITY_THRESHOLD = 0.23
[2017-04-27 11:03:06.282] [log] [info] N_ITER = 3
[2017-04-27 11:03:06.282] [log] [info] ALN_THRESHOLD = 1000
[2017-04-27 11:03:06.282] [log] [info] MIN_COV = 5
[2017-04-27 11:03:06.282] [log] [info] CUT_OFF = 300
[2017-04-27 11:03:06.282] [log] [info] THETA = 300
[2017-04-27 11:03:06.282] [log] [info] EST_COV = 0
[2017-04-27 11:03:06.282] [log] [info] reso = 40
[2017-04-27 11:03:06.282] [log] [info] use_coverage_mask = true
[2017-04-27 11:03:06.282] [log] [info] COVERAGE_FRACTION = 3
[2017-04-27 11:03:06.282] [log] [info] MIN_REPEAT_ANNOTATION_THRESHOLD = 10
[2017-04-27 11:03:06.282] [log] [info] MAX_REPEAT_ANNOTATION_THRESHOLD = 20
[2017-04-27 11:03:06.282] [log] [info] REPEAT_ANNOTATION_GAP_THRESHOLD = 300
[2017-04-27 11:03:06.282] [log] [info] NO_HINGE_REGION = 500
[2017-04-27 11:03:06.282] [log] [info] HINGE_MIN_SUPPORT = 7
[2017-04-27 11:03:06.282] [log] [info] HINGE_BIN_PILEUP_THRESHOLD = 7
[2017-04-27 11:03:06.282] [log] [info] HINGE_READ_UNBRIDGED_THRESHOLD = 6
[2017-04-27 11:03:06.282] [log] [info] HINGE_BIN_LENGTH = 200
[2017-04-27 11:03:06.282] [log] [info] HINGE_TOLERANCE_LENGTH = 100
[2017-04-27 11:03:06.589] [log] [info] read mask finished
[2017-04-27 11:03:07.374] [log] [info] active reads at start: 1447986
[2017-04-27 11:03:07.478] [log] [info] active reads after correcting for read lengths: 141119
[2017-04-27 11:03:07.478] [log] [info] number of las files: 1
[2017-04-27 11:03:07.478] [log] [info] name of las: Stella.las
[2017-04-27 11:03:07.495] [log] [info] Load alignments from Stella.las
[2017-04-27 11:03:07.496] [log] [info] # Alignments: 594011488
[2017-04-27 11:08:43.181] [log] [info] Input data finished, part 1/1
[2017-04-27 11:29:54.329] [log] [info] profile coverage (with and without CUT_OFF)
[2017-04-27 11:51:27.916] [log] [info] profile coverage done part 1/1
[2017-04-27 11:51:28.597] [log] [info] Estimated mean coverage: 1002
[2017-04-27 11:51:28.597] [log] [info] Estimated median coverage: 38

What is your opinion?

Thanks

agroppi commented 7 years ago

Same bug again: the output and error file is about 2 GB and still growing with empty lines or spaces!

Here is the content of the output folder:

-rw-r--r-- 1 ag users    0 Apr 27 12:55 Stella.draft.fasta
drwxr-xr-x 2 ag users 4.0K Apr 27 12:40 log
-rw-r--r-- 1 ag users    0 Apr 27 12:40 Stella.contained.txt
-rw-r--r-- 1 ag users    0 Apr 27 12:40 Stella.garbage.txt
-rw-r--r-- 1 ag users    0 Apr 27 12:40 Stella.draft.deadends.txt
-rw-r--r-- 1 ag users 5.9M Apr 27 12:40 edges.g_out.txt
-rw-r--r-- 1 ag users 4.1M Apr 27 12:40 Stella.edges.1
-rw-r--r-- 1 ag users 4.1M Apr 27 12:40 Stella.edges.2
-rw-r--r-- 1 ag users 6.0M Apr 27 12:40 Stella.edges.hinges
-rw-r--r-- 1 ag users 4.7M Apr 27 12:40 Stella.edges.hinges2
-rw-r--r-- 1 ag users 5.8M Apr 27 12:40 Stella.edges.greedy
-rw-r--r-- 1 ag users 4.8M Apr 27 12:40 Stella.edges.skipped
-rw-r--r-- 1 ag users 130M Apr 27 12:40 Stella.hgraph
-rw-r--r-- 1 ag users 212K Apr 27 12:40 Stella.debug
-rw-r--r-- 1 ag users 2.2M Apr 27 12:40 Stella.deadends.txt
-rw-r--r-- 1 ag users    0 Apr 27 12:40 hinge_debug.txt
-rw-r--r-- 1 ag users 927K Apr 27 12:40 Stella.hinge.list
-rw-r--r-- 1 ag users    0 Apr 27 12:37 overlap_debug.txt
-rw-r--r-- 1 ag users  13M Apr 27 12:37 Stella.killed.hinges
-rw-r--r-- 1 ag users  22M Apr 27 12:37 edges.bkw.backup.txt
-rw-r--r-- 1 ag users  25M Apr 27 12:37 edges.fwd.backup.txt
-rw-r--r-- 1 ag users 424K Apr 27 12:24 Stella.max
-rw-r--r-- 1 ag users 317M Apr 27 11:51 Stella.coverage.txt
-rw-r--r-- 1 ag users    0 Apr 27 11:03 Stella.filtered.fasta
-rw-r--r-- 1 ag users    0 Apr 27 11:03 Stella.homologous.txt
-rw-r--r-- 1 ag users 5.4G Apr 27 11:00 Stella.10.las
-rw-r--r-- 1 ag users 5.4G Apr 27 11:00 Stella.9.las
-rw-r--r-- 1 ag users 5.4G Apr 27 11:00 Stella.8.las
-rw-r--r-- 1 ag users 5.4G Apr 27 11:00 Stella.7.las
-rw-r--r-- 1 ag users 5.2G Apr 27 10:59 Stella.6.las
-rw-r--r-- 1 ag users 5.2G Apr 27 10:59 Stella.5.las
-rw-r--r-- 1 ag users 5.2G Apr 27 10:59 Stella.4.las
-rw-r--r-- 1 ag users 5.3G Apr 27 10:59 Stella.3.las
-rw-r--r-- 1 ag users 5.6G Apr 27 10:59 Stella.2.las
-rw-r--r-- 1 ag users 5.7G Apr 27 10:59 Stella.1.las
-rw-r--r-- 1 ag users 4.0K Apr 27 10:57 HINGE_Stella_Assembly_04_2017.sh
-rw-r--r-- 1 ag users 5.5M Apr 22 13:54 Stella.hinges.txt
-rw-r--r-- 1 ag users 6.2M Apr 22 13:54 Stella.repeat.txt
-rw-r--r-- 1 ag users    0 Apr 22 13:20 debug.txt
-rw-r--r-- 1 ag users 2.8M Apr 22 13:20 Stella.mas
-rw-r--r-- 1 ag users 2.3M Apr 22 13:20 Stella.cmas
-rw-r--r-- 1 ag users    0 Apr 22 12:34 Stella.self.flag
-rw-r--r-- 1 ag users    0 Apr 22 12:34 Stella.cov.flag
-rw-r--r-- 1 ag users  54G Apr 22 12:30 Stella.las
-rw-r--r-- 1 ag users 1.7K Apr 14 12:15 Stella.db
-rw-r--r-- 1 ag users  15G Apr 14 12:13 Stella_reads_3kb.pb.fasta
-rw-r--r-- 1 ag users 142M Apr 14 12:13 map.txt
-rw-r--r-- 1 ag users  15G Apr 14 11:54 Stella_reads_3kb.fasta

After so many attempts with HINGE over the past year, I'm close to giving up and moving to another solution...

govinda-kamath commented 7 years ago

You should have used

/home/ag/HINGE/inst/bin/hinge filter --db Stella --las Stella --mlas -x Stella --config /home/ag/HINGE/utils/nominal.ini

instead of

/home/ag/HINGE/inst/bin/hinge filter --db Stella --las Stella.las --mlas -x Stella --config /home/ag/HINGE/utils/nominal.ini

The code seg-faulted as the path to the las file was not right.
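Judging by the log above, with --mlas the value given to --las is treated as a prefix: the glob tried Stella.las.1.las and found 0 files. A small pre-flight check can catch this before starting a long run. This is only a sketch: `count_split_las` is a hypothetical helper, not part of HINGE, and it assumes the split files are named PREFIX.N.las, as LAsplit produced in this thread.

```shell
# Hypothetical helper: count the split .las files that exist for a prefix.
# Usage: count_split_las PREFIX
count_split_las() {
    local prefix="$1" count=0 f
    for f in "$prefix".*.las; do
        # An unmatched glob stays as a literal string, so check that
        # the file really exists before counting it.
        [ -e "$f" ] && count=$((count + 1))
    done
    echo "$count"
}
```

In the folder listed earlier, `count_split_las Stella` should report the ten split files, while `count_split_las Stella.las` would report 0, which matches the "Number of files 0" line in the failing run.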

agroppi commented 7 years ago

Thanks. Does that mean that in the next steps:

hinge maximal --db Stella --las Stella.las -x Stella --config /home/ag/HINGE/utils/nominal.ini

hinge layout --db Stella --las Stella.las -x Stella --config /home/ag/HINGE/utils/nominal.ini -o Stella

hinge draft --db Stella --las Stella.las --prefix Stella --config /home/ag/HINGE/utils/nominal.ini --out Stella.draft

I also have to change --las Stella.las to --las Stella and add the --mlas option?

If so, it would be very useful to correct the example you give here: https://github.com/HingeAssembler/HINGE

govinda-kamath commented 7 years ago

You should run

hinge maximal --db Stella --las Stella --mlas -x Stella --config /home/ag/HINGE/utils/nominal.ini

hinge layout --db Stella --las Stella --mlas -x Stella --config /home/ag/HINGE/utils/nominal.ini -o Stella

The draft should be as before.

Thanks. We'll update the example.
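For reference, the commands from this thread can be collected in one place. The sketch below only prints them (a dry run) and executes nothing. `print_mlas_commands` is a hypothetical helper, and the default of 10 parts is simply the value used earlier in the thread. Any other steps in the full script are not covered here.

```shell
# Hypothetical helper: print (do not run) the commands recommended in this
# thread for a given database prefix and config file.
# Usage: print_mlas_commands DB CONFIG [PARTS]
print_mlas_commands() {
    local db="$1" config="$2" parts="${3:-10}"
    echo "LAsplit -v $db.# $parts < $db.las"
    echo "hinge filter --db $db --las $db --mlas -x $db --config $config"
    echo "hinge maximal --db $db --las $db --mlas -x $db --config $config"
    echo "hinge layout --db $db --las $db --mlas -x $db --config $config -o $db"
    # Per the reply above, the draft step keeps its original single-file form.
    echo "hinge draft --db $db --las $db.las --prefix $db --config $config --out $db.draft"
}
```

Running `print_mlas_commands Stella /home/ag/HINGE/utils/nominal.ini` lists the five commands so they can be reviewed before being pasted into the job script.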