maickrau / ribotin

MIT License

Split cluster into chromosome-specific #4

Closed baozg closed 1 year ago

baozg commented 1 year ago

Hi, @maickrau

I am using ribotin on a diploid plant genome. After running it on the verkko assembly, the rDNA clusters from three chromosomes were mixed together. Since we didn't know the specific rDNA unit (we had no prior information), I used srf to assemble high-frequency k-mers and got a single 9 kb unit, which is distributed on only 3 chromosomes and has a reasonable size.

However, ribotin only gives one big cluster. Could you give me some clues on how to find the chromosome-specific rDNA? I think the 5 rDNA loci in human should have similar issues, right?


~/software/ribotin/bin/ribotin-verkko -i ../test --mbg ~/miniconda3/envs/verkko1.3/bin/MBG -o ./test_ribo --guess-clusters-using-reference test.rDNA.srf.fa
[image]

maickrau commented 1 year ago

The separation in the automatic verkko mode depends on how well the verkko graph has already separated the rDNA tangles. This graph has a couple of nodes connecting the tangles, which makes the automatic tangle detection think they are all one big tangle. You can try manually picking the clusters and then using the verkko based manual mode: https://github.com/maickrau/ribotin#verkko-based-manual
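One possible way to shortlist the connecting nodes before picking clusters by hand is to look for nodes whose removal splits the graph into two or more sizeable pieces. The sketch below is only an illustration and not part of ribotin: it assumes the graph is available as a GFA file with tab-separated `S` and `L` lines, it ignores strand orientation, and the function names and the `min_size` threshold are hypothetical. It will miss cases where two tangles are joined by several parallel nodes, so the Bandage plot remains the reference.

```python
from collections import defaultdict

def read_gfa_adjacency(lines):
    """Build an undirected node adjacency map from GFA S and L lines."""
    adj = defaultdict(set)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S" and len(fields) >= 2:
            adj[fields[1]]  # register the node even if it has no links
        elif fields[0] == "L" and len(fields) >= 4:
            a, b = fields[1], fields[3]
            adj[a].add(b)
            adj[b].add(a)
    return dict(adj)

def components(adj, removed=frozenset()):
    """Connected components of the graph after dropping the removed nodes."""
    seen = set(removed)
    comps = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        comp = set()
        while stack:
            node = stack.pop()
            comp.add(node)
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        comps.append(comp)
    return comps

def candidate_connectors(adj, min_size=3):
    """Nodes whose removal increases the number of components with >= min_size nodes."""
    def count_big(removed):
        return sum(1 for c in components(adj, removed) if len(c) >= min_size)

    baseline = count_big(frozenset())
    return sorted(node for node in adj if count_big({node}) > baseline)
```

Calling `candidate_connectors(read_gfa_adjacency(open("graph.gfa")))` (with `graph.gfa` standing in for your own graph file) returns a list of node names to inspect in Bandage. The brute-force removal test is quadratic in graph size, which should be acceptable for an rDNA subgraph but not for a whole-genome graph.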

baozg commented 1 year ago

Since the rDNA nodes are very similar to each other, how can we know which node belongs to which chromosome? I think this is exactly the information that the manual mode requires.

maickrau commented 1 year ago

We can't know for sure. The semi-separated parts might be separate chromosomes, but it's not certain since they could also be heterozygosity within a chromosome. From the bandage plot it looks like there might be 3-5 tangles. I've circled the parts which might be either separate chromosomes or variants within chromosomes. It's also a bit unclear and subjective where the nodes between the tangles should be assigned. If you only include the nodes which are clearly inside one tangle and ignore the nodes between the tangles, you'll probably get reasonable consensuses of some of the major rDNA units, but it's not certain whether they really are chromosome specific.

[image: maybe_rDNA_tangles]
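The "keep the clear nodes, ignore the in-between ones" idea can be sketched as a small script. This is a hedged illustration, not ribotin's own procedure: it assumes you have already decided by eye which nodes sit between the tangles, that the graph links are available as plain pairs of node names, and that one node name per line is an acceptable list format (check the README section linked above for the format the manual mode actually expects). The function names, node names, and file prefix are hypothetical.

```python
from pathlib import Path

def tangle_node_lists(links, ignored):
    """Group nodes into tangles after dropping the ignored in-between nodes.

    Nodes whose only links go to ignored nodes are left out of the result.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for a, b in links:
        if a in ignored or b in ignored:
            continue  # skip every link that touches an ignored node
        parent[find(a)] = find(b)

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), []).append(node)
    return sorted(sorted(group) for group in groups.values())

def write_node_lists(groups, prefix):
    """Write one text file per tangle, one node name per line (assumed format)."""
    paths = []
    for i, nodes in enumerate(groups, start=1):
        path = Path(f"{prefix}{i}.txt")
        path.write_text("\n".join(nodes) + "\n")
        paths.append(path)
    return paths
```

For example, `write_node_lists(tangle_node_lists(links, ignored={"x"}), "tangle")` would produce `tangle1.txt`, `tangle2.txt`, and so on. Which nodes go into `ignored` is the subjective part described above, so the resulting lists are only as chromosome-specific as that choice.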

baozg commented 1 year ago

Thanks for your kind and clear explanation, it helped me a lot!