HingeAssembler / HINGE

Software accompanying "HINGE: Long-Read Assembly Achieves Optimal Repeat Resolution"
http://genome.cshlp.org/content/27/5/747.full.pdf+html?sid=39918b0d-7a7d-4a12-b720-9238834902fd

'DiGraph' object has no attribute 'edge' #132

Open asdcid opened 6 years ago

asdcid commented 6 years ago

I am trying to use this to assemble nanopore reads, but when I run the hinge clip step, I get this error message.

raymond:$ hinge clip-nanopore ecoli.edges.hinges ecoli.hinge.list t1
/opt/python/lib/python2.7/site-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
  warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')
Traceback (most recent call last):
  File "/home/raymond/devel/hinge/HINGE//inst/bin/../lib/hinge/pruning_and_clipping_nanopore.py", line 1202, in <module>
    mark_skipped_edges(G,flname.split('.')[0] + '.edges.skipped')
  File "/home/raymond/devel/hinge/HINGE//inst/bin/../lib/hinge/pruning_and_clipping_nanopore.py", line 942, in mark_skipped_edges
    G.edge[lines1[0] + "_" + lines1[3]][lines1[1] + "_" + lines1[4]]['skipped'] = 1
AttributeError: 'DiGraph' object has no attribute 'edge'

I am not sure whether it is related to the HPC.daligner step. When I run the HPC.daligner step, I only get the ecoli.las file.

raymond:$ HPC.daligner -t5 ecoli | csh -v

daligner -t5 ecoli ecoli

LAsort ecoli.ecoli.C0 ecoli.ecoli.N0 ecoli.ecoli.C1 ecoli.ecoli.N1 ecoli.ecoli.C2 ecoli.ecoli.N2 ecoli.ecoli.C3 ecoli.ecoli.N3 && LAmerge ecoli ecoli.ecoli.C0.S ecoli.ecoli.N0.S ecoli.ecoli.C1.S ecoli.ecoli.N1.S ecoli.ecoli.C2.S ecoli.ecoli.N2.S ecoli.ecoli.C3.S ecoli.ecoli.N3.S

LAcheck -vS ecoli ecoli
ecoli: 207,210 all OK

rm ecoli.ecoli.C0.las ecoli.ecoli.N0.las ecoli.ecoli.C1.las ecoli.ecoli.N1.las ecoli.ecoli.C2.las ecoli.ecoli.N2.las ecoli.ecoli.C3.las ecoli.ecoli.N3.las
rm ecoli.ecoli.C0.S.las ecoli.ecoli.N0.S.las ecoli.ecoli.C1.S.las ecoli.ecoli.N1.S.las ecoli.ecoli.C2.S.las ecoli.ecoli.N2.S.las ecoli.ecoli.C3.S.las ecoli.ecoli.N3.S.las
raymond:$ ls
ecoli.db ecoli.las

alimayy commented 6 years ago

I get the same error with hinge clip

Edit: I cannot reproduce the problem with the HINGE version I built on 2017-08-29. I can only reproduce it with the most recent version, which I built on 2017-09-28.

Executing the command hinge layout --db hinge --las hinge.las -x hinge --config /pipeline/HINGE/utils/nominal.ini -o hinge

[2017-10-02 12:04:04.273] [log] [info] Hinging layout
[2017-10-02 12:04:04.274] [log] [info] name of db: hinge, name of .las file hinge.las
[2017-10-02 12:04:04.274] [log] [info] name of fasta: , name of .paf file
[2017-10-02 12:04:04.274] [log] [info] filter files prefix: hinge
[2017-10-02 12:04:04.274] [log] [info] output prefix: hinge
[2017-10-02 12:04:04.274] [log] [info] Multiple las files: false
[2017-10-02 12:04:04.274] [log] [info] Multiple las files: false
[2017-10-02 12:04:04.274] [log] [info] Parameters passed in
[filter]
length_threshold = 1000;
quality_threshold = 0.23;
n_iter = 3; // filter iteration
aln_threshold = 1000;
min_cov = 5;
cut_off = 300;
theta = 300;
use_qv = true;

[running]
n_proc = 12;

[draft]
min_cov = 10;
trim = 200;
edge_safe = 100;
tspace = 900;
step = 50;

[consensus]
min_length = 4000;
trim_end = 200;
best_n = 1;
quality_threshold = 0.23;

[layout]
hinge_slack = 1000
min_connected_component_size = 8

[2017-10-02 12:04:04.279] [log] [info] # Reads: 86719
[2017-10-02 12:04:05.650] [log] [info] Input data finished
[2017-10-02 12:04:05.650] [log] [info] LENGTH_THRESHOLD = 1000
[2017-10-02 12:04:05.650] [log] [info] QUALITY_THRESHOLD = 0.23
[2017-10-02 12:04:05.650] [log] [info] ALN_THRESHOLD = 1000
[2017-10-02 12:04:05.650] [log] [info] MIN_COV = 5
[2017-10-02 12:04:05.650] [log] [info] CUT_OFF = 300
[2017-10-02 12:04:05.650] [log] [info] THETA = 300
[2017-10-02 12:04:05.650] [log] [info] N_ITER = 3
[2017-10-02 12:04:05.650] [log] [info] THETA2 = 0
[2017-10-02 12:04:05.650] [log] [info] N_PROC = 12
[2017-10-02 12:04:05.650] [log] [info] HINGE_SLACK = 1000
[2017-10-02 12:04:05.650] [log] [info] HINGE_TOLERANCE = 150
[2017-10-02 12:04:05.650] [log] [info] KILL_HINGE_OVERLAP_ALLOWANCE = 300
[2017-10-02 12:04:05.650] [log] [info] KILL_HINGE_INTERNAL_ALLOWANCE = 40
[2017-10-02 12:04:05.650] [log] [info] MATCHING_HINGE_SLACK = 200
[2017-10-02 12:04:05.650] [log] [info] MIN_CONNECTED_COMPONENT_SIZE = 8
[2017-10-02 12:04:05.650] [log] [info] USE_TWO_MATCHES = true
[2017-10-02 12:04:05.650] [log] [info] del_telomeres = false
[2017-10-02 12:04:05.675] [log] [info] read mask finished
[2017-10-02 12:04:05.776] [log] [info] read marked repeats
[2017-10-02 12:04:05.776] [log] [info] killed 0 reads with many repeats
[2017-10-02 12:04:05.876] [log] [info] read marked hinges
[2017-10-02 12:04:05.878] [log] [info] active reads: 86719
[2017-10-02 12:04:05.916] [log] [info] active reads: 46159
[2017-10-02 12:04:06.113] [log] [info] Multiple las files: false
[2017-10-02 12:04:06.113] [log] [info] Las files: hinge.las
[2017-10-02 12:04:06.113] [log] [info] number of las files: 1
[2017-10-02 12:04:06.114] [log] [info] Total number of active reads: 4328/86719
[2017-10-02 12:04:06.120] [log] [info] name of las: hinge.las
[2017-10-02 12:04:06.121] [log] [info] Load alignments from hinge.las
[2017-10-02 12:04:06.121] [log] [info] # Alignments: 7001874
[2017-10-02 12:04:08.924] [log] [info] # reads: 86719
[2017-10-02 12:04:08.924] [log] [info] # active reads: 4328/86719
[2017-10-02 12:04:08.924] [log] [info] Input data finished, part 1/1
[2017-10-02 12:04:10.359] [log] [info] kept 48304/7001874 overlaps,  23800/3385836 rev_overlaps in part 1/1
[2017-10-02 12:04:10.359] [log] [info] index finished
[2017-10-02 12:04:11.573] [log] [info] kept 48304/7001874 overlaps,  23800/3385836 rev_overlaps in 1 part(s)
[2017-10-02 12:04:11.576] [log] [info] 8569 overlaps
[2017-10-02 12:04:11.576] [log] [info] 4412 rev overlaps
[2017-10-02 12:04:11.578] [log] [info] removed contained reads, active reads: 4328
[2017-10-02 12:04:11.579] [log] [info] active reads: 4328
[2017-10-02 12:04:11.685] [log] [info] 36 killed hinges
[2017-10-02 12:04:11.685] [log] [info] 38 hinges
[2017-10-02 12:04:11.801] [log] [info] 38 active hinges
[2017-10-02 12:04:11.806] [log] [info] Building hinge graph
[2017-10-02 12:04:11.814] [log] [info] num hinges 175
[2017-10-02 12:04:11.824] [log] [info] Hinge graph built
Total number of components: 168
[2017-10-02 12:04:11.832] [log] [info] after filter 0 active hinges
[2017-10-02 12:04:11.855] [log] [info] Starting to build assembly graph.
[2017-10-02 12:04:11.871] [log] [info] sort and output finished
[2017-10-02 12:04:11.871] [log] [info] version 0.0.3

Executing the command
hinge clip hinge.edges.hinges hinge.hinge.list hinge_run_id
------------------------------------------------
0 bad coverage reads.
0 bad self aligned reads.
Traceback (most recent call last):
  File "/pipeline/HINGE/inst/bin/../lib/hinge/pruning_and_clipping.py", line 1362, in <module>
    mark_skipped_edges(G,flname.split('.')[0] + '.edges.skipped')
  File "/pipeline/HINGE/inst/bin/../lib/hinge/pruning_and_clipping.py", line 1018, in mark_skipped_edges
    G.edge[lines1[0] + "_" + lines1[3]][lines1[1] + "_" + lines1[4]]['skipped'] = 1
AttributeError: 'DiGraph' object has no attribute 'edge'
hinge clip hinge.edges.hinges hinge.hinge.list hinge_run_id did not produce a return code of 0, quiting!

govinda-kamath commented 6 years ago

Hi @asdcid, Ali,

It looks like networkx had a new release on 20 Sept 2017, and it substantially changed the functionality. Would it be possible for you to use the old version of networkx for now? The code seems to work well with it.

As this is a major change, it would take us around two weeks to fix and test.

Thanks a ton for bringing this to our attention.
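
For readers hitting the same traceback, here is a minimal sketch of one possible workaround, not the maintainers' fix. It assumes that subscript access `G[u][v]` returns the edge attribute dict in both networkx 1.x and 2.x, whereas `G.edge` only exists in 1.x. The helper name `mark_edge_skipped` and the node labels are made up for illustration.

```python
def mark_edge_skipped(G, u, v):
    """Set the 'skipped' attribute on edge (u, v) via subscript access.

    G[u][v] yields the edge attribute dict in networkx 1.x and 2.x alike,
    so this avoids the G.edge attribute that newer releases no longer have.
    """
    G[u][v]['skipped'] = 1


try:
    import networkx as nx
    G = nx.DiGraph()
    G.add_edge('12_B', '34_E')  # hypothetical node labels
except ImportError:
    # Stand-in with the same adjacency shape, so the sketch runs without networkx.
    G = {'12_B': {'34_E': {}}}

mark_edge_skipped(G, '12_B', '34_E')
print(G['12_B']['34_E'])  # {'skipped': 1}
```

In the failing line of pruning_and_clipping.py, this would correspond to replacing `G.edge[a][b]['skipped'] = 1` with `G[a][b]['skipped'] = 1`. Other networkx calls in the script may need similar changes, so pinning the older version remains the safer route.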

asdcid commented 6 years ago

@govinda-kamath Thanks for the quick reply. It works with networkx v1.9! I have another question: can HINGE assemble a genome with low read coverage? I tried different coverages and found that if the coverage is lower than 20x, it fails to assemble a genome. I tried changing some parameters in nominal.ini, but still can't get it to work. Do you have any suggestions?

govinda-kamath commented 6 years ago

Yes, it should work, but one would expect the assembly to be more fragmented at low coverage. However, you could try lowering min_cov in the .ini file you pass. (The trade-off at lower coverage is that one gets either mis-assemblies due to chimeric reads or fragmented assemblies. This is somewhat fundamental, as in general one cannot tell whether a read is chimeric without using coverage information.)
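
As a rough illustration of the min_cov suggestion, the sketch below rewrites a key in HINGE-style .ini text. The helper name `set_ini_value` and the value 3 are made up for the example; only the key min_cov and the `key = value;` layout come from the config shown in this thread.

```python
import re


def set_ini_value(text, key, value):
    """Replace the value of every `key = <old>` line, in all sections."""
    pattern = re.compile(
        r'^([ \t]*' + re.escape(key) + r'[ \t]*=[ \t]*)[^;\n]*',
        re.MULTILINE,
    )
    # Keep the "key = " prefix and swap only the value; a trailing ';' and
    # any '// comment' after it are left untouched.
    return pattern.sub(lambda m: m.group(1) + str(value), text)


sample = "[filter]\nmin_cov = 5;\ncut_off = 300;\n\n[draft]\nmin_cov = 10;\n"
updated = set_ini_value(sample, 'min_cov', 3)
print(updated)
```

Note that min_cov appears under both [filter] and [draft], and this sketch changes both. Whether both should be lowered together is a judgment call that the thread does not settle.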

asdcid commented 6 years ago

I set min_cov to 1, but I still can't make it work. Attached: log.txt

asdcid commented 6 years ago

The resulting draft.fasta is an empty file. The input coverage is 10x.

/home/raymond/work/Eucalyptus_pauciflora/genome/bin/assembly/hinge/data/pacbioName/lowCoverage//10coverage.pacbioName.fasta result//10coverage
[2017-10-03 16:57:32.560] [log] [info] Reads filtering
[2017-10-03 16:57:32.560] [log] [info] name of db: 10coverage, name of .las file 10coverage.las
[2017-10-03 16:57:32.560] [log] [info] name of fasta: , name of .paf file
[2017-10-03 16:57:32.560] [log] [info] Parameters passed in

[filter]
length_threshold = 500;
quality_threshold = 0.23;
n_iter = 3; // filter iteration
aln_threshold = 500;
min_cov = 1;
cut_off = 300;
theta = 300;
use_qv = true;

[running]
n_proc = 12;

[draft]
min_cov = 1;
trim = 200;
edge_safe = 100;
tspace = 900;
step = 50;

[consensus]
min_length = 1000;
trim_end = 200;
best_n = 1;
quality_threshold = 0.23;

[layout]
hinge_slack = 1000
min_connected_component_size = 8

[2017-10-03 16:57:32.560] [log] [info] Las files: 10coverage.las
[2017-10-03 16:57:32.560] [log] [info] # Reads: 88
[2017-10-03 16:57:32.569] [log] [info] No debug restrictions.
[2017-10-03 16:57:32.570] [log] [info] use_qv_mask set to true
[2017-10-03 16:57:32.570] [log] [info] use_qv_mask set to true
[2017-10-03 16:57:32.570] [log] [info] number processes set to 12
[2017-10-03 16:57:32.570] [log] [info] LENGTH_THRESHOLD = 500
[2017-10-03 16:57:32.570] [log] [info] QUALITY_THRESHOLD = 0.23
[2017-10-03 16:57:32.570] [log] [info] N_ITER = 3
[2017-10-03 16:57:32.570] [log] [info] ALN_THRESHOLD = 500
[2017-10-03 16:57:32.570] [log] [info] MIN_COV = 1
[2017-10-03 16:57:32.570] [log] [info] CUT_OFF = 300
[2017-10-03 16:57:32.570] [log] [info] THETA = 300
[2017-10-03 16:57:32.570] [log] [info] EST_COV = 0
[2017-10-03 16:57:32.570] [log] [info] reso = 40
[2017-10-03 16:57:32.570] [log] [info] use_coverage_mask = true
[2017-10-03 16:57:32.570] [log] [info] COVERAGE_FRACTION = 3
[2017-10-03 16:57:32.570] [log] [info] MIN_REPEAT_ANNOTATION_THRESHOLD = 10
[2017-10-03 16:57:32.570] [log] [info] MAX_REPEAT_ANNOTATION_THRESHOLD = 20
[2017-10-03 16:57:32.570] [log] [info] REPEAT_ANNOTATION_GAP_THRESHOLD = 300
[2017-10-03 16:57:32.570] [log] [info] NO_HINGE_REGION = 500
[2017-10-03 16:57:32.570] [log] [info] HINGE_MIN_SUPPORT = 7
[2017-10-03 16:57:32.570] [log] [info] HINGE_BIN_PILEUP_THRESHOLD = 7
[2017-10-03 16:57:32.570] [log] [info] HINGE_READ_UNBRIDGED_THRESHOLD = 6
[2017-10-03 16:57:32.570] [log] [info] HINGE_BIN_LENGTH = 200
[2017-10-03 16:57:32.570] [log] [info] HINGE_TOLERANCE_LENGTH = 100
[2017-10-03 16:57:32.570] [log] [info] part: 0
[2017-10-03 16:57:32.570] [log] [info] name of las: 10coverage.las
[2017-10-03 16:57:32.570] [log] [info] Load alignments from 10coverage.las
[2017-10-03 16:57:32.570] [log] [info] # Alignments: 3920
[2017-10-03 16:57:32.573] [log] [info] Input data finished, part 1/1
[2017-10-03 16:57:32.573] [log] [info] length of alignments 3920
[2017-10-03 16:57:32.573] [log] [info] begin 0 end 87
[2017-10-03 16:57:32.579] [log] [info] profile coverage (with and without CUT_OFF)
[2017-10-03 16:57:32.604] [log] [info] profile coverage done part 1/1
[2017-10-03 16:57:32.605] [log] [info] Estimated mean coverage: 10
[2017-10-03 16:57:32.605] [log] [info] Estimated median coverage: 9
[2017-10-03 16:57:32.609] [log] [info] reached end of loop
[2017-10-03 16:57:32.609] [log] [info] Number of hinges before filtering: 0
[2017-10-03 16:57:32.609] [log] [info] Number of hinges: 0
[2017-10-03 16:57:32.609] [log] [info] part: 0
hinge maximal
[2017-10-03 16:57:32.618] [log] [info] Getting maximal reads
[2017-10-03 16:57:32.618] [log] [info] name of db: 10coverage, name of .las file 10coverage.las
[2017-10-03 16:57:32.618] [log] [info] name of fasta: , name of .paf file
[2017-10-03 16:57:32.618] [log] [info] Parameters passed in

[filter]
length_threshold = 500;
quality_threshold = 0.23;
n_iter = 3; // filter iteration
aln_threshold = 500;
min_cov = 1;
cut_off = 300;
theta = 300;
use_qv = true;

[running]
n_proc = 12;

[draft]
min_cov = 1;
trim = 200;
edge_safe = 100;
tspace = 900;
step = 50;

[consensus]
min_length = 1000;
trim_end = 200;
best_n = 1;
quality_threshold = 0.23;

[layout]
hinge_slack = 1000
min_connected_component_size = 8

[2017-10-03 16:57:32.618] [log] [info] Las files: 10coverage.las
[2017-10-03 16:57:32.618] [log] [info] # Reads: 88
[2017-10-03 16:57:32.627] [log] [info] No debug restrictions.
[2017-10-03 16:57:32.627] [log] [info] use_qv_mask set to true
[2017-10-03 16:57:32.627] [log] [info] use_qv_mask set to true
[2017-10-03 16:57:32.627] [log] [info] number processes set to 12
[2017-10-03 16:57:32.627] [log] [info] LENGTH_THRESHOLD = 500
[2017-10-03 16:57:32.627] [log] [info] QUALITY_THRESHOLD = 0.23
[2017-10-03 16:57:32.627] [log] [info] N_ITER = 3
[2017-10-03 16:57:32.627] [log] [info] ALN_THRESHOLD = 500
[2017-10-03 16:57:32.627] [log] [info] MIN_COV = 1
[2017-10-03 16:57:32.627] [log] [info] CUT_OFF = 300
[2017-10-03 16:57:32.627] [log] [info] THETA = 300
[2017-10-03 16:57:32.627] [log] [info] EST_COV = 0
[2017-10-03 16:57:32.627] [log] [info] reso = 40
[2017-10-03 16:57:32.627] [log] [info] use_coverage_mask = true
[2017-10-03 16:57:32.627] [log] [info] COVERAGE_FRACTION = 3
[2017-10-03 16:57:32.627] [log] [info] MIN_REPEAT_ANNOTATION_THRESHOLD = 10
[2017-10-03 16:57:32.627] [log] [info] MAX_REPEAT_ANNOTATION_THRESHOLD = 20
[2017-10-03 16:57:32.627] [log] [info] REPEAT_ANNOTATION_GAP_THRESHOLD = 300
[2017-10-03 16:57:32.627] [log] [info] NO_HINGE_REGION = 500
[2017-10-03 16:57:32.627] [log] [info] HINGE_MIN_SUPPORT = 7
[2017-10-03 16:57:32.627] [log] [info] HINGE_BIN_PILEUP_THRESHOLD = 7
[2017-10-03 16:57:32.627] [log] [info] HINGE_READ_UNBRIDGED_THRESHOLD = 6
[2017-10-03 16:57:32.627] [log] [info] HINGE_BIN_LENGTH = 200
[2017-10-03 16:57:32.627] [log] [info] HINGE_TOLERANCE_LENGTH = 100
[2017-10-03 16:57:32.627] [log] [info] read mask finished
[2017-10-03 16:57:32.627] [log] [info] active reads at start: 88
[2017-10-03 16:57:32.627] [log] [info] active reads after correcting for read lengths: 80
[2017-10-03 16:57:32.628] [log] [info] number of las files: 1
[2017-10-03 16:57:32.628] [log] [info] name of las: 10coverage.las
[2017-10-03 16:57:32.628] [log] [info] Load alignments from 10coverage.las
[2017-10-03 16:57:32.628] [log] [info] # Alignments: 3920
[2017-10-03 16:57:32.630] [log] [info] Input data finished, part 1/1
[2017-10-03 16:57:32.636] [log] [info] profile coverage (with and without CUT_OFF)
[2017-10-03 16:57:32.659] [log] [info] profile coverage done part 1/1
[2017-10-03 16:57:32.660] [log] [info] Estimated mean coverage: 10
[2017-10-03 16:57:32.660] [log] [info] Estimated median coverage: 9
[2017-10-03 16:57:32.680] [log] [info] 523 overlaps
[2017-10-03 16:57:32.680] [log] [info] 280 rev overlaps
[2017-10-03 16:57:32.680] [log] [info] removed contained reads, active reads: 15
[2017-10-03 16:57:32.680] [log] [info] active reads: 15
[2017-10-03 16:57:32.680] [log] [info] total reads: 88
hinge layout
[2017-10-03 16:57:32.688] [log] [info] Hinging layout
[2017-10-03 16:57:32.688] [log] [info] name of db: 10coverage, name of .las file 10coverage.las
[2017-10-03 16:57:32.688] [log] [info] name of fasta: , name of .paf file
[2017-10-03 16:57:32.688] [log] [info] filter files prefix: 10coverage
[2017-10-03 16:57:32.688] [log] [info] output prefix: 10coverage
[2017-10-03 16:57:32.688] [log] [info] Multiple las files: false
[2017-10-03 16:57:32.688] [log] [info] Multiple las files: false
[2017-10-03 16:57:32.688] [log] [info] Parameters passed in

[filter]
length_threshold = 500;
quality_threshold = 0.23;
n_iter = 3; // filter iteration
aln_threshold = 500;
min_cov = 1;
cut_off = 300;
theta = 300;
use_qv = true;

[running]
n_proc = 12;

[draft]
min_cov = 1;
trim = 200;
edge_safe = 100;
tspace = 900;
step = 50;

[consensus]
min_length = 1000;
trim_end = 200;
best_n = 1;
quality_threshold = 0.23;

[layout]
hinge_slack = 1000
min_connected_component_size = 8

[2017-10-03 16:57:32.688] [log] [info] # Reads: 88
[2017-10-03 16:57:32.695] [log] [info] Input data finished
[2017-10-03 16:57:32.696] [log] [info] LENGTH_THRESHOLD = 500
[2017-10-03 16:57:32.696] [log] [info] QUALITY_THRESHOLD = 0.23
[2017-10-03 16:57:32.696] [log] [info] ALN_THRESHOLD = 500
[2017-10-03 16:57:32.696] [log] [info] MIN_COV = 1
[2017-10-03 16:57:32.696] [log] [info] CUT_OFF = 300
[2017-10-03 16:57:32.696] [log] [info] THETA = 300
[2017-10-03 16:57:32.696] [log] [info] N_ITER = 3
[2017-10-03 16:57:32.696] [log] [info] THETA2 = 0
[2017-10-03 16:57:32.696] [log] [info] N_PROC = 12
[2017-10-03 16:57:32.696] [log] [info] HINGE_SLACK = 1000
[2017-10-03 16:57:32.696] [log] [info] HINGE_TOLERANCE = 150
[2017-10-03 16:57:32.696] [log] [info] KILL_HINGE_OVERLAP_ALLOWANCE = 300
[2017-10-03 16:57:32.696] [log] [info] KILL_HINGE_INTERNAL_ALLOWANCE = 40
[2017-10-03 16:57:32.696] [log] [info] MATCHING_HINGE_SLACK = 200
[2017-10-03 16:57:32.696] [log] [info] MIN_CONNECTED_COMPONENT_SIZE = 8
[2017-10-03 16:57:32.696] [log] [info] USE_TWO_MATCHES = true
[2017-10-03 16:57:32.696] [log] [info] del_telomeres = false
[2017-10-03 16:57:32.696] [log] [info] read mask finished
[2017-10-03 16:57:32.696] [log] [info] read marked repeats
[2017-10-03 16:57:32.696] [log] [info] killed 0 reads with many repeats
[2017-10-03 16:57:32.696] [log] [info] read marked hinges
[2017-10-03 16:57:32.696] [log] [info] active reads: 88
[2017-10-03 16:57:32.696] [log] [info] active reads: 80
[2017-10-03 16:57:32.696] [log] [info] Multiple las files: false
[2017-10-03 16:57:32.696] [log] [info] Las files: 10coverage.las
[2017-10-03 16:57:32.696] [log] [info] number of las files: 1
[2017-10-03 16:57:32.696] [log] [info] Total number of active reads: 15/88
[2017-10-03 16:57:32.697] [log] [info] name of las: 10coverage.las
[2017-10-03 16:57:32.697] [log] [info] Load alignments from 10coverage.las
[2017-10-03 16:57:32.697] [log] [info] # Alignments: 3920
[2017-10-03 16:57:32.699] [log] [info] # reads: 88
[2017-10-03 16:57:32.699] [log] [info] # active reads: 15/88
[2017-10-03 16:57:32.699] [log] [info] Input data finished, part 1/1
[2017-10-03 16:57:32.699] [log] [info] kept 136/3920 overlaps, 84/2058 rev_overlaps in part 1/1
[2017-10-03 16:57:32.699] [log] [info] index finished
[2017-10-03 16:57:32.701] [log] [info] kept 136/3920 overlaps, 84/2058 rev_overlaps in 1 part(s)
[2017-10-03 16:57:32.701] [log] [info] 45 overlaps
[2017-10-03 16:57:32.701] [log] [info] 26 rev overlaps
[2017-10-03 16:57:32.701] [log] [info] removed contained reads, active reads: 15
[2017-10-03 16:57:32.701] [log] [info] active reads: 15
[2017-10-03 16:57:32.702] [log] [info] 0 killed hinges
[2017-10-03 16:57:32.702] [log] [info] 0 hinges
[2017-10-03 16:57:32.702] [log] [info] 0 active hinges
[2017-10-03 16:57:32.702] [log] [info] Building hinge graph
[2017-10-03 16:57:32.702] [log] [info] num hinges 0
[2017-10-03 16:57:32.702] [log] [info] Hinge graph built
Total number of components: 0
[2017-10-03 16:57:32.702] [log] [info] after filter 0 active hinges
[2017-10-03 16:57:32.702] [log] [info] Starting to build assembly graph.
[2017-10-03 16:57:32.702] [log] [info] sort and output finished
[2017-10-03 16:57:32.702] [log] [info] version 0.0.3
clip-nanopore
draft assembly
Number of contigs: 0
[2017-10-03 16:57:34.231] [log] [info] draft consensus
[2017-10-03 16:57:34.231] [log] [info] name of db: 10coverage, name of .las file 10coverage.las
[2017-10-03 16:57:34.231] [log] [info] name of fasta: , name of .paf file
[2017-10-03 16:57:34.231] [log] [info] filter files prefix: 10coverage
[2017-10-03 16:57:34.231] [log] [info] output prefix: 10coverage.draft
[2017-10-03 16:57:34.231] [log] [info] Parameters passed in

[filter]
length_threshold = 500;
quality_threshold = 0.23;
n_iter = 3; // filter iteration
aln_threshold = 500;
min_cov = 1;
cut_off = 300;
theta = 300;
use_qv = true;

[running]
n_proc = 12;

[draft]
min_cov = 1;
trim = 200;
edge_safe = 100;
tspace = 900;
step = 50;

[consensus]
min_length = 1000;
trim_end = 200;
best_n = 1;
quality_threshold = 0.23;

[layout]
hinge_slack = 1000
min_connected_component_size = 8

[2017-10-03 16:57:34.232] [log] [info] # Reads: 88
[2017-10-03 16:57:34.240] [log] [info] Total number of active reads: 15/88
[2017-10-03 16:57:34.240] [log] [info] part:0
[2017-10-03 16:57:34.240] [log] [info] name of las 10coverage.las
[2017-10-03 16:57:34.240] [log] [info] Load alignment from 10coverage.las
[2017-10-03 16:57:34.240] [log] [info] # Alignments: 3920
[2017-10-03 16:57:34.242] [log] [info] Input data finished
[2017-10-03 16:57:34.242] [log] [info] LENGTH_THRESHOLD = 500
[2017-10-03 16:57:34.242] [log] [info] QUALITY_THRESHOLD = 0.23
[2017-10-03 16:57:34.242] [log] [info] ALN_THRESHOLD = 500
[2017-10-03 16:57:34.242] [log] [info] MIN_COV = 1
[2017-10-03 16:57:34.242] [log] [info] CUT_OFF = 300
[2017-10-03 16:57:34.242] [log] [info] THETA = 300
[2017-10-03 16:57:34.242] [log] [info] N_ITER = 3
[2017-10-03 16:57:34.242] [log] [info] THETA2 = 0
[2017-10-03 16:57:34.242] [log] [info] N_PROC = 12
[2017-10-03 16:57:34.242] [log] [info] HINGE_SLACK = 1000
[2017-10-03 16:57:34.242] [log] [info] HINGE_TOLERANCE = 150
[2017-10-03 16:57:34.242] [log] [info] KILL_HINGE_OVERLAP_ALLOWANCE = 300
[2017-10-03 16:57:34.242] [log] [info] KILL_HINGE_INTERNAL_ALLOWANCE = 40
[2017-10-03 16:57:34.242] [log] [info] MATCHING_HINGE_SLACK = 200
[2017-10-03 16:57:34.242] [log] [info] MIN_CONNECTED_COMPONENT_SIZE = 8
add data
add data

list size:0 get consensus assembly

govinda-kamath commented 6 years ago

What is the length of the genome you're trying to assemble?

10x coverage is not a use case we have tested much, and when assembling with only 15 maximal reads and 45 overlaps, some thresholds picked elsewhere may come into play. However, setting use_qv_mask and use_coverage_mask to false might help.
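
The figures quoted here (15 active reads, 45 overlaps, mean coverage 10) can be pulled out of a HINGE log with a small script. The following is only a sketch: the function name `summarize_log` and the choice of which lines to extract are assumptions, and the patterns are based solely on the log lines pasted in this thread.

```python
import re

# Patterns modeled on log lines quoted in this thread.
PATTERNS = {
    'mean_coverage': re.compile(r'Estimated mean coverage: (\d+)'),
    'active_reads': re.compile(r'\[info\] active reads: (\d+)'),
    'overlaps': re.compile(r'\[info\] (\d+) overlaps'),
}


def summarize_log(log_text):
    """Return the last reported value for each pattern found in the log."""
    summary = {}
    for name, pattern in PATTERNS.items():
        matches = pattern.findall(log_text)
        if matches:
            summary[name] = int(matches[-1])
    return summary


sample = (
    "[2017-10-03 16:57:32.605] [log] [info] Estimated mean coverage: 10\n"
    "[2017-10-03 16:57:32.701] [log] [info] 45 overlaps\n"
    "[2017-10-03 16:57:32.701] [log] [info] 26 rev overlaps\n"
    "[2017-10-03 16:57:32.701] [log] [info] active reads: 15\n"
)
print(summarize_log(sample))
```

Very small counts at this stage, as in the log above, suggest the layout step has little to work with before the draft is even attempted.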

asdcid commented 6 years ago

The genome is 160 kb long.

asdcid commented 6 years ago

Anyway, HINGE is good. At the same coverage and with default settings, it assembles a more complete genome than Canu.

mictadlo commented 6 years ago

By any chance, do you know which networkx version is supported?

asdcid commented 6 years ago

v1.9
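
A quick way to check which networkx is installed before running hinge clip is sketched below. It assumes that any 1.x release behaves like the v1.9 reported as working here, which the thread does not confirm for versions other than 1.9. The helper name `is_networkx_1x` is made up for the example.

```python
def is_networkx_1x(version_string):
    """Return True if the major version is 1 (e.g. '1.9' or '1.11')."""
    return version_string.split('.')[0] == '1'


try:
    import networkx
    installed = networkx.__version__
except ImportError:
    installed = None

if installed is None:
    print('networkx is not installed')
elif is_networkx_1x(installed):
    print('networkx %s: G.edge should be available' % installed)
else:
    print('networkx %s: the AttributeError from this thread is likely' % installed)
```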