ComparativeGenomicsToolkit / cactus

Official home of genome aligner based upon notion of Cactus graphs

How to visualize the graph generated by cactus? #105

Open tobsecret opened 4 years ago

tobsecret commented 4 years ago

I took a look at https://github.com/vgteam/sequenceTubeMap but it appears it's not possible to just look at the graph itself; it only lets us visualize aligned reads.

joelarmstrong commented 4 years ago

We usually use snake tracks on the UCSC browser to visualize the resulting alignment. If you're interested in the graph structure specifically, the Cactus graph data structure itself is only produced for a single subproblem (a single ancestral inference) and isn't present in the final HAL file.

It would be possible (though not easy) to build the graph for the full multiple alignment after the fact and visualize it somehow, but it would require quite a lot of memory. The structure of the graph for large alignments would also likely be pretty tangled.

tobsecret commented 4 years ago

Hmm, I did not know that. I read the vg structural variant paper, and much like they built the yeast graph genome from multiple reference genome assemblies using cactus, I built one from reference genomes of a different organism.

Before I proceed, I want to understand which structural variants (SVs) my HAL file captures and what the alignment looks like in the neighborhood of those SVs, so I can judge whether each SV is the result of spurious alignment. So far I have tried converting to vg format and then following their guide on visualization. I have also tried just looking at the MAF (from hal2maf) in IGV, but I am not sure how that displays insertions, inversions, and other non-deletion SVs.

If this is out of the scope of this repo, I can go ask in the repo for that paper.
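
For anyone who wants to check the MAF from hal2maf without relying on how a viewer draws it, here is a minimal Python sketch that reads the MAF text directly. It assumes the standard MAF `s` line layout (`s src start size strand srcSize text`) and treats the first row of each block as the reference. The names `parse_maf_blocks`, `gap_runs`, `summarize_block`, and the `min_gap` threshold are made up for illustration and are not part of cactus or the HAL tools. It only flags rows whose strand differs from the reference row (candidate inversions) and long gap runs (candidate insertions or deletions), so it is a rough first pass and not an SV caller.

```python
from collections import namedtuple

# One aligned sequence row ('s' line) of a MAF block.
MafRow = namedtuple("MafRow", "src start size strand src_size text")


def parse_maf_blocks(lines):
    """Yield each alignment block as a list of MafRow, one per 's' line."""
    block = []
    for line in lines:
        line = line.strip()
        if line == "a" or line.startswith("a "):
            # An 'a' line opens a new block; emit the previous one first.
            if block:
                yield block
            block = []
        elif line.startswith("s "):
            _, src, start, size, strand, src_size, text = line.split()
            block.append(
                MafRow(src, int(start), int(size), strand, int(src_size), text)
            )
        # Comment, blank, and other line types ('i', 'e', 'q') are ignored.
    if block:
        yield block


def gap_runs(text):
    """Return the length of every maximal run of '-' in an aligned row."""
    runs, current = [], 0
    for ch in text:
        if ch == "-":
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def summarize_block(block, min_gap=50):
    """Compare every row of a block against its first (reference) row.

    min_gap is a hypothetical threshold: gap runs shorter than this are
    not reported.
    """
    ref = block[0]
    report = []
    for row in block[1:]:
        report.append(
            {
                "src": row.src,
                # Differing strands hint at an inversion relative to ref.
                "inverted_vs_ref": row.strand != ref.strand,
                # Long gaps in this row: sequence in ref missing here.
                "long_gaps": [g for g in gap_runs(row.text) if g >= min_gap],
            }
        )
    return ref, report
```

To use it, open the MAF file, pass the file object to `parse_maf_blocks`, and call `summarize_block` on each block. Long gap runs in the reference row itself (via `gap_runs(ref.text)`) would point the other way, to sequence inserted relative to the reference.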