maickrau / ribotin


Selection of rDNA tangle nodes #10

Closed (jiadong324 closed 1 week ago)

jiadong324 commented 9 months ago

Hi,

Thanks for providing the nice tool for rDNA analysis!

I got the verkko assembly and the assembly.homopolymer-compressed.noseq.gfa. I am wondering how to manually select rDNA tangles from that file.

Looking forward to your reply! Thanks!

maickrau commented 9 months ago

Hi, you can first open the assembly graph in Bandage (https://github.com/rrwick/Bandage) and find the rDNA tangles. Once you have found them, select the nodes of each tangle in Bandage and copy-paste them from "Selected nodes" on the right side of Bandage into a file, with one file per tangle. Then give the tangle files to ribotin-verkko with the parameter -c. For example, if you have two tangles in files tangle1.txt and tangle2.txt, add the parameters -c tangle1.txt -c tangle2.txt.
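To make the repeated -c parameter concrete, here is a minimal sketch that assembles such a command from a list of tangle files. The helper name build_tangle_cmd and the directory names in the example call are hypothetical, and combining -c with -x human, -i and -o is an assumption based on the commands used elsewhere in this thread. The function only prints the command so it can be inspected before running it.

```shell
#!/bin/sh
# Print (do not run) a ribotin-verkko command with one -c per tangle file.
# Usage: build_tangle_cmd <verkko_dir> <output_prefix> <tangle_file>...
# build_tangle_cmd is a hypothetical helper, not part of ribotin.
build_tangle_cmd() {
    asm_dir=$1
    out_prefix=$2
    shift 2
    cmd="ribotin-verkko -x human -i $asm_dir -o $out_prefix"
    # Each tangle file gets its own -c parameter.
    for tangle in "$@"; do
        cmd="$cmd -c $tangle"
    done
    printf '%s\n' "$cmd"
}

# Example with two tangle files (paths are placeholders).
build_tangle_cmd ./verkko_asm ./verkko_asm/ribotin tangle1.txt tangle2.txt
```

Each tangle file holds the node names copied from Bandage for one tangle, so a third tangle would simply add a third file name to the call.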

jiadong324 commented 7 months ago

Hi Mikko,

I am using the latest version of ribotin-verkko. Here is the command: ribotin-verkko -x human -i $asm_dir -o $asm_dir/ribotin. I got the error below:

terminate called after throwing an instance of 'std::logic_error'
what():  basic_string: construction from null is not valid

Looking forward to your reply! Thanks!

maickrau commented 7 months ago

Could you share the full command and log output?

jiadong324 commented 7 months ago

Here is the full command

module load verkko/2.0

asm_dir=./assembler_acros/verkko_v2/NA12877 

verkko -d $asm_dir --hifi $( cat $asm_dir/fastq.fofn ) --nano $( cat $asm_dir/ont_read.fofn ) --hap-kmers ./meryl/NA12877/MATERNAL_all.compress.k30.meryl ./meryl/NA12877/PATERNAL_all.compress.k30.meryl trio --snakeopts "-j 50 -k --restart-times 1" --screen human

module load ribotin/1.3
ribotin-verkko -x human -i $asm_dir -o $asm_dir/ribotin

Verkko's outputs are good. Here is the log for ribotin-verkko:

ribotin-verkko version bioconda 1.3
checking for MBG
/software/modules-sw/ribotin/1.3/Linux/CentOS7/x86_64/bin/MBG
checking for GraphAligner
/software/modules-sw/ribotin/1.3/Linux/CentOS7/x86_64/bin/GraphAligner
checking for liftoff
/software/modules-sw/ribotin/1.3/Linux/CentOS7/x86_64/bin/liftoff
do UL analysis: yes
output prefix: /assembler_acros/verkko_v2/NA12877/ribotin
guessing tangles
terminate called after throwing an instance of 'std::logic_error'
  what():  basic_string: construction from null is not valid
Aborted

maickrau commented 7 months ago

Looks like there might be some issue with ribotin trying to load the template sequences when ribotin is in a module. Could you try downloading the template sequences folder https://github.com/maickrau/ribotin/tree/master/template_seqs somewhere and then adding the parameters

--guess-tangles-using-reference <templatefolder>/chm13_rDNAs.fa --orient-by-reference <templatefolder>/rDNA_one_unit.fasta --annotation-reference-fasta <templatefolder>/rDNA_one_unit.fasta --annotation-gff3 <templatefolder>/rDNA_annotation.gff3

with <templatefolder> replaced by where you downloaded the template sequences
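Since the suggestion depends on the template files being readable at the paths given, a small check before launching ribotin can rule out a wrong path. The sketch below is only an illustration: template_params is a hypothetical helper, and it assumes the template folder path contains no spaces. It verifies that the three files named above exist and then prints the four parameters.

```shell
#!/bin/sh
# Verify the template files exist, then print the matching parameters.
# Usage: template_params <templatefolder>
# template_params is a hypothetical helper, not part of ribotin.
template_params() {
    dir=$1
    for f in chm13_rDNAs.fa rDNA_one_unit.fasta rDNA_annotation.gff3; do
        if [ ! -r "$dir/$f" ]; then
            echo "missing template file: $dir/$f" >&2
            return 1
        fi
    done
    printf '%s\n' "--guess-tangles-using-reference $dir/chm13_rDNAs.fa --orient-by-reference $dir/rDNA_one_unit.fasta --annotation-reference-fasta $dir/rDNA_one_unit.fasta --annotation-gff3 $dir/rDNA_annotation.gff3"
}

# Possible use (left commented out so the sketch does not launch ribotin):
# ribotin-verkko -x human -i "$asm_dir" -o "$asm_dir/ribotin" $(template_params ./ribotin/template_seqs)
```

If any file is missing, the function reports it on stderr and returns a non-zero status instead of printing the parameters.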

jiadong324 commented 7 months ago

Here is my command:

ribotin-verkko -x human -i $asm_dir -o $asm_dir/ribotin --guess-tangles-using-reference ./ribotin/template_seqs/chm13_rDNAs.fa --orient-by-reference ./ribotin/template_seqs/rDNA_one_unit.fasta --annotation-reference-fasta ./ribotin/template_seqs/rDNA_one_unit.fasta --annotation-gff3 ./ribotin/template_seqs/rDNA_annotation.gff3

I got the same error:

ribotin-verkko version bioconda 1.3
checking for MBG
/software/modules-sw/ribotin/1.3/Linux/CentOS7/x86_64/bin/MBG
checking for GraphAligner
/software/modules-sw/ribotin/1.3/Linux/CentOS7/x86_64/bin/GraphAligner
checking for liftoff
/software/modules-sw/ribotin/1.3/Linux/CentOS7/x86_64/bin/liftoff
do UL analysis: yes
output prefix: /assembler_acros/verkko_v2/NA12877/ribotin
terminate called after throwing an instance of 'std::logic_error'
  what():  basic_string: construction from null is not valid
Aborted

It looks like ribotin cannot find the file for construction. Here are the files under $asm_dir:

drwxrwsr-x 6 jdlin eichlerlab        4096 Mar 26 17:33 0-correction
drwxrwsr-x 2 jdlin eichlerlab        4096 Mar 27 02:38 1-buildGraph
drwxrwsr-x 2 jdlin eichlerlab        4096 Mar 27 15:14 2-processGraph
drwxrwsr-x 3 jdlin eichlerlab        8192 Mar 27 13:04 3-align
drwxrwsr-x 3 jdlin eichlerlab        4096 Mar 27 13:56 3-alignTips
drwxrwsr-x 2 jdlin eichlerlab        8192 Mar 27 15:14 4-processONT
drwxrwsr-x 2 jdlin eichlerlab        4096 Mar 27 15:15 5-untip
drwxrwsr-x 2 jdlin eichlerlab        4096 Mar 27 16:19 6-layoutContigs
drwxrwsr-x 2 jdlin eichlerlab        4096 Mar 27 16:06 6-rukki
drwxrwsr-x 3 jdlin eichlerlab        4096 Mar 28 09:17 7-consensus
-rw-rw-r-- 1 jdlin eichlerlab     1529131 Mar 27 16:06 assembly.colors.csv
-rw-rw-r-- 1 jdlin eichlerlab    19140523 Mar 28 09:16 assembly.disconnected.fasta
-rw-rw-r-- 1 jdlin eichlerlab           0 Mar 28 09:16 assembly.ebv.exemplar.fasta
-rw-rw-r-- 1 jdlin eichlerlab           0 Mar 28 09:16 assembly.ebv.fasta
-rw-rw-r-- 1 jdlin eichlerlab  6804259509 Mar 28 09:18 assembly.fasta
-rw-rw-r-- 1 jdlin eichlerlab  2499260672 Mar 28 09:18 assembly.haplotype1.fasta
-rw-rw-r-- 1 jdlin eichlerlab  2402573516 Mar 28 09:18 assembly.haplotype2.fasta
-rw-rw-r-- 1 jdlin eichlerlab      817662 Mar 28 09:18 assembly.hifi-coverage.csv
-rw-rw-r-- 1 jdlin eichlerlab  3909042110 Mar 28 09:18 assembly.homopolymer-compressed.gfa
-rw-rw-r-- 1 jdlin eichlerlab   607435189 Mar 28 09:18 assembly.homopolymer-compressed.layout
-rw-rw-r-- 1 jdlin eichlerlab     4896500 Mar 28 09:18 assembly.homopolymer-compressed.noseq.gfa
-rw-rw-r-- 1 jdlin eichlerlab       16871 Mar 28 09:16 assembly.mito.exemplar.fasta
-rw-rw-r-- 1 jdlin eichlerlab       88917 Mar 28 09:16 assembly.mito.fasta
-rw-rw-r-- 1 jdlin eichlerlab      807000 Mar 28 09:18 assembly.ont-coverage.csv
-rw-rw-r-- 1 jdlin eichlerlab      964923 Mar 27 16:06 assembly.paths.tsv
-rw-rw-r-- 1 jdlin eichlerlab       46211 Mar 28 09:16 assembly.rdna.exemplar.fasta
-rw-rw-r-- 1 jdlin eichlerlab    11331119 Mar 28 09:16 assembly.rdna.fasta
-rw-rw-r-- 1 jdlin eichlerlab      793086 Mar 28 09:18 assembly.scfmap
-rw-rw-r-- 1 jdlin eichlerlab  1902425321 Mar 28 09:18 assembly.unassigned.fasta
-rw-rw-r-- 1 jdlin eichlerlab 39763943318 Mar 26 17:33 hifi-corrected.fasta.gz
-rw-rw-r-- 1 jdlin eichlerlab        1574 Mar 21 09:40 ont_read.fofn
-rwxrwxr-x 1 jdlin eichlerlab         462 Mar 28 11:32 snakemake.sh
-rw-rw-r-- 1 jdlin eichlerlab        7153 Mar 28 11:32 verkko.yml

jiadong324 commented 1 week ago

The problem was solved by adding -x human -x -c and the parameters for the template sequences.