bcgsc / ntJoin

Genome assembly scaffolder using minimizer graphs
GNU General Public License v3.0

Viral contigs #92

Closed: ohan-Bioinfo closed this issue 1 year ago

ohan-Bioinfo commented 2 years ago

Dear Developer,

Thank you for the great tool.

I'm trying to scaffold a viral genome containing only three contigs, and the error below occurred.

Could you please advise?

ntjoin-1.1.1-0/bin/ntjoin_assemble.py -p out.k32.w1000.n1 -n 1 -s Minmumlength1000.fa.k32.w1000.tsv -l 1 \
-r "1" -k 32 -g 20 -G 0 -t 1 --overlap --overlap_gap 20 --btllib_t 4 --overlap_k 15 --overlap_w 10 ../../../../../../Reff-Index/96Campoxrefrence.fa.k32.w1000.tsv
Running ntJoin v1.1.1 ...

Parameters:
    Reference TSV files:  ['../../../../../../Reff-Index/96Campoxrefrence.fa.k32.w1000.tsv']
    -s  Minmumlength1000.fa.k32.w1000.tsv
    -l  1.0
    -r  1
    -p  out.k32.w1000.n1
    -n  1
    -k  32
    -g  20
    -G  0
    -t  1
Orienting contigs using increasing/decreasing minimizer positions

    --overlap
    --overlap_gap 20
    --overlap_k 15
    --overlap_w 10
    --btllib_t 4
2022-08-14 14:10:45.976616 : Reading minimizers ../../../../../../Reff-Index/96Campoxrefrence.fa.k32.w1000.tsv
2022-08-14 14:10:45.977256 : Reading minimizers Minmumlength1000.fa.k32.w1000.tsv

Weights of assemblies:
../../../../../../Reff-Index/96Campoxrefrence.fa.k32.w1000.tsv: 1.0
Minmumlength1000.fa.k32.w1000.tsv: 1.0

2022-08-14 14:10:45.977782 : Filtering minimizers
2022-08-14 14:10:45.977975 : Building graph
2022-08-14 14:10:45.978519 : Adding vertices
2022-08-14 14:10:45.978608 : Adding edges
2022-08-14 14:10:45.978713 : Adding attributes
2022-08-14 14:10:45.979399 : Printing graph out.k32.w1000.n1.mx.dot

file_name   number  colour
../../../../../../Reff-Index/96Campoxrefrence.fa.k32.w1000.tsv  0   red
Minmumlength1000.fa.k32.w1000.tsv   1   green

2022-08-14 14:10:45.981088 : Filtering the graph
2022-08-14 14:10:45.982124 : Reading fasta file Minmumlength1000.fa
2022-08-14 14:10:45.993526 : Finding paths

Total number of components in graph: 1

2022-08-14 14:10:46.014378 : Printing output scaffolds
Traceback (most recent call last):
  File "/home/miniconda3/bin/share/ntjoin-1.1.1-0/bin/ntjoin_assemble.py", line 1115, in <module>
    main()
  File "/home/miniconda3/bin/share/ntjoin-1.1.1-0/bin/ntjoin_assemble.py", line 1112, in main
    Ntjoin().main()
  File "/home/miniconda3/bin/share/ntjoin-1.1.1-0/bin/ntjoin_assemble.py", line 1102, in main
    self.print_scaffolds(paths, intersecting_regions)
  File "/home/miniconda3/bin/share/ntjoin-1.1.1-0/bin/ntjoin_assemble.py", line 830, in print_scaffolds
    self.adjust_for_trimming(self.args.p + ".segments.fa", filtered_paths)
  File "/home/miniconda3/bin/share/ntjoin-1.1.1-0/bin/ntjoin_assemble.py", line 727, in adjust_for_trimming
    for node in paths[cur_path_index]}
IndexError: list index out of range
make: *** [/home/miniconda3/bin/share/ntjoin-1.1.1-0/ntJoin:227: Minmumlength1000.fa.k32.w1000.n1.assigned.scaffolds.fa] Error 1
make: *** Deleting file 'Minmumlength1000.fa.k32.w1000.n1.assigned.scaffolds.fa'
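
For a very small assembly like this one, a useful first check is how many minimizers each sequence actually contributes at the chosen k and w. The sketch below is not part of ntJoin and does not diagnose the IndexError itself. It assumes each line of the minimizer TSV has the layout `name<TAB>space-separated minimizer tokens`; the function name `count_minimizers` and the sample data are hypothetical.

```python
import io


def count_minimizers(lines):
    """Return {sequence_name: token_count} for an iterable of TSV lines.

    Assumed layout (not confirmed against ntJoin's code):
    name<TAB>space-separated minimizer tokens.
    """
    counts = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Everything before the first tab is the sequence name.
        name, _, rest = line.partition("\t")
        # Tokens are counted without interpreting their internal fields.
        counts[name] = len(rest.split())
    return counts


# Made-up example data standing in for a real minimizer TSV file.
example = io.StringIO(
    "contig1\t111:5:+ 222:1040:+\n"
    "contig2\t\n"
    "contig3\t333:77:-\n"
)
counts = count_minimizers(example)
for name, n in counts.items():
    print(name, n)
```

To run it on a real file, pass an open file handle instead of the `io.StringIO` object, for example `count_minimizers(open("Minmumlength1000.fa.k32.w1000.tsv"))`. Sequences with zero or very few tokens may point to a window size that is large relative to the contig lengths.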
lcoombe commented 2 years ago

Hello @ohan-Bioinfo,

Apologies for the delay in getting back to you - I was on vacation last week.

A couple of initial questions to help with troubleshooting:

Thank you for your interest in ntJoin! Lauren

github-actions[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your interest in ntJoin!

Jimaz commented 2 years ago

Hello, @lcoombe,

Thanks for this great tool, it's awesome. I get the same error when trying to scaffold an assembly using a bacterial NCBI reference genome containing a chromosome and a plasmid sequence.

Here is the command I used:

$ for assembly in *fasta; do ../ntJoin-1.1.1/ntJoin assemble target=${assembly} target_weight=1 reference_config=config_file.csv k=32 w=500; done

Thanks in advance.

Cheers,

Jaime

EDIT:

I was previously able to run ntJoin with another reference without trouble.

EDIT2:

I found a solution based on the parameter considerations, which recommend setting the reference weight(s) higher than the target weight unless you trust each of the input references and the target equally. At least for me, the weights had to be changed. I modified the config file, changing the weight of the reference from 2 to 1, and the command I used was:

$ for assembly in *fasta; do ../ntJoin-1.1.1/ntJoin assemble target=${assembly} target_weight=2 reference_config=config_file.csv k=32 w=500; done

That is, I set target_weight to 2 instead of 1.

Now it works. Thanks a lot!
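
The config file change described above can be sketched as follows. This is only an illustration: it assumes the file passed as `reference_config` holds one `reference_path,weight` row per reference, and the file name `reference.fa` and the helper `set_reference_weights` are hypothetical.

```python
import csv
import io


def set_reference_weights(config_text, new_weight):
    """Return config text with every row's weight replaced by new_weight.

    Assumes each non-empty row is `reference_path,weight`.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(config_text)):
        if not row:
            continue  # ignore blank lines
        # Keep the reference path, overwrite the weight column.
        writer.writerow([row[0], new_weight])
    return out.getvalue()


# Hypothetical config with one reference weighted 2, lowered to 1.
original = "reference.fa,2\n"
updated = set_reference_weights(original, 1)
print(updated, end="")
```

With the reference weight lowered to 1 in the config file, the target weight is raised separately on the command line with `target_weight=2`, as in the command above.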

lcoombe commented 1 year ago

Hi @Jimaz,

Thanks for your interest in ntJoin and documenting your solution here! I'm glad to hear you got it working.

Thanks! Lauren