vgteam / sequenceTubeMap

displays multiple genomic sequences in the form of a tube map
MIT License

Visualizing a graph from short sequences #446

Open pbreeder opened 2 months ago

pbreeder commented 2 months ago

Hi all, I want to export an image from sequenceTubeMap. I used the commands below to prepare my data for sequenceTubeMap. Please help me get the image.

My sequences (my_seq) are below:

Pper_ch7_3864816-3866050_INV TTCACCTGTTTTGAGTTTATTCCTGAGATGGAACGGCAAAGTTCCCCACTGCCTTGGAAACCATTGATTGGTCAACAAGTCAGCTTTCTTTGGTGTATGAGCGTGCCACTACTAGCAATAAAGATTATAGCAGACTTCTTTA
Pav_ch6_32611111-32611754_INV TTCACCCGTTTTGATTTGTCTTCTGTGATGGTAGGCAAAGTTCCCCATTGCCTTGGAAACCATTGACTGGTCAACAAGTCAACTTTCTTTAGTGTGCGAGCGTGCCACTGCTAGCAATGAAAAATCCTAGCAGACTTCTTTA
Pdul_ch7_3673417-3674657_INV TTCACCTGTTTTGAGTTTATTCCTTAGATGGAAGGGCAAAGTTCCCCACTGCCTTGGAAACCATTGATTGGTCAACAAGTCAGCTTTCTTTGGTGTATGAGCGTGCCACTACTAGCAATAAAGATTCTAGCAGACTTCTTTA
Pmume_ch7_554429-556046_INV TTCACCCGTTTTGCGTTTACTCCTGTGGTGGAAGGGCAAAGTTCCCCACTGCCTTGTGAACCATTGACTGGTCAACAAGTCAGCTCTCTTTGGTGTGCGAGCGTGCCACTGCTAGCAATGAAGATTCTAGCAGACTTAGTTT
Par_chr1_7788452-7790085 TGTATTTGTTCACCCGTTTTGTGATCACTCCTGTGGTGGAAGGGCAAAGTTCCCCACTGCCTTGAAAACCATTGACTGGTCAACAAATCAGCTTTTTTTGGTGTATGAGCGTGTCACTGCTAGCAATGAAGATTCTAGCAGA

vg construct -r reference.fa > graph.vg
vg index -x graph.xg graph.vg
vg index -g graph.gam graph.vg
vg view -vg graph.vg > graph.gfa
vg gbwt -o graph.gbwt -G graph.gfa

[Screenshot: Resim1]

Please help me with the image.

adamnovak commented 2 months ago

It sounds like you want to take those sequences, align them together, produce a graph from that, and visualize it.

You have a FASTA like this:

>Pper_ch7_3864816-3866050_INV
TTCACCTGTTTTGAGTTTATTCCTGAGATGGAACGGCAAAGTTCCCCACTGCCTTGGAAACCATTGATTGGTCAACAAGTCAGCTTTCTTTGGTGTATGAGCGTGCCACTACTAGCAATAAAGATTATAGCAGACTTCTTTA
>Pav_ch6_32611111-32611754_INV
TTCACCCGTTTTGATTTGTCTTCTGTGATGGTAGGCAAAGTTCCCCATTGCCTTGGAAACCATTGACTGGTCAACAAGTCAACTTTCTTTAGTGTGCGAGCGTGCCACTGCTAGCAATGAAAAATCCTAGCAGACTTCTTTA
>Pdul_ch7_3673417-3674657_INV
TTCACCTGTTTTGAGTTTATTCCTTAGATGGAAGGGCAAAGTTCCCCACTGCCTTGGAAACCATTGATTGGTCAACAAGTCAGCTTTCTTTGGTGTATGAGCGTGCCACTACTAGCAATAAAGATTCTAGCAGACTTCTTTA
>Pmume_ch7_554429-556046_INV
TTCACCCGTTTTGCGTTTACTCCTGTGGTGGAAGGGCAAAGTTCCCCACTGCCTTGTGAACCATTGACTGGTCAACAAGTCAGCTCTCTTTGGTGTGCGAGCGTGCCACTGCTAGCAATGAAGATTCTAGCAGACTTAGTTT
>Par_chr1_7788452-7790085
TGTATTTGTTCACCCGTTTTGTGATCACTCCTGTGGTGGAAGGGCAAAGTTCCCCACTGCCTTGAAAACCATTGACTGGTCAACAAATCAGCTTTTTTTGGTGTATGAGCGTGTCACTGCTAGCAATGAAGATTCTAGCAGA
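If the sequences start out as alternating name and sequence tokens separated by whitespace, as they appear in the original post, a small shell sketch like the following could turn them into FASTA. The function name `pairs_to_fasta` and the input file name `my_seq.txt` are hypothetical, and the sketch assumes that no name or sequence contains whitespace.

```shell
# Convert whitespace-separated "name sequence name sequence ..." tokens
# read from stdin into FASTA records on stdout.
# Assumption: tokens strictly alternate between a name and its sequence.
pairs_to_fasta() {
  awk '{
    for (i = 1; i <= NF; i++) {
      n++
      if (n % 2 == 1) print ">" $i   # odd tokens are record names
      else            print $i       # even tokens are sequences
    }
  }'
}

# Example usage (my_seq.txt is a hypothetical input file):
#   pairs_to_fasta < my_seq.txt > sequences.fa
```

The output file name `sequences.fa` is chosen only to match the `vg msga` command used later in this thread.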

To build the graph you want, you can't just use vg construct; it is meant for when you have a linear reference and then variation expressed as a VCF. It won't align any of the sequences in your FASTA against each other.

I usually recommend building a graph from sequences with the minigraph-cactus pipeline, but that's kind of heavyweight and meant for full assemblies, not tiny sequences.

Since your sequences are so short and can't make too much of a hairball, you might have good luck with the pggb tool for squashing them together into a graph. Or, you could make a multiple alignment MAF of them (maybe making the names be of the form assembly.contig first) and then take that through the maf2hal tool and then the hal2vg tool to turn it into a vg graph.
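For the renaming step, one possible sketch is to replace the first underscore in each FASTA header with a dot, so that the species prefix becomes the "assembly" part and the rest becomes the "contig" part. Treating the prefix as the assembly name is an assumption about these particular headers, and the function name `to_assembly_contig` is hypothetical; check what your alignment and MAF tools actually expect before relying on it.

```shell
# Rewrite FASTA header lines so the first "_" becomes ".", e.g.
# ">Par_chr1_7788452-7790085" becomes ">Par.chr1_7788452-7790085".
# Lines that do not start with ">" are passed through unchanged.
to_assembly_contig() {
  sed '/^>/ s/_/./'
}

# Example usage (renamed.fa is a hypothetical output file):
#   to_assembly_contig < sequences.fa > renamed.fa
```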

You can also try building the graph with vg's old msga subcommand. It's old and unmaintained, but these sequences are short, so it shouldn't get into trouble.

vg msga -f sequences.fa >graph.vg

I got this graph from vg msga: graph.vg.zip

(You definitely can't vg index -g graph.gam graph.vg; that makes a .gcsa file. You get a .gam file by aligning sequencing reads with vg map or vg giraffe or GraphAligner.)

Then you can visualize that with vg's built-in tools and GraphViz:

vg view -dp graph.vg | dot -Tpng -o test.png

[Image: test.png]

That looks kind of plausible.
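The two commands above (vg msga, then vg view piped into dot) can be wrapped into one small shell function, sketched here. The function name `msga_to_png` is hypothetical; the vg and dot invocations are the same ones shown above, and the sketch only adds a check that both tools are available.

```shell
# Build a graph from a FASTA with vg msga and render it with GraphViz.
# Usage: msga_to_png sequences.fa graph.vg test.png
msga_to_png() {
  fasta=$1
  graph=$2
  png=$3

  # Stop early if either tool is not available.
  for tool in vg dot; do
    command -v "$tool" >/dev/null 2>&1 || {
      echo "missing tool: $tool" >&2
      return 1
    }
  done

  # Align the sequences against each other into a graph.
  vg msga -f "$fasta" > "$graph" || return 1

  # Emit GraphViz dot with paths shown, then render it to PNG.
  vg view -dp "$graph" | dot -Tpng -o "$png"
}
```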

Then you need to get the tube map working. You've posted a screenshot of what looks like quite an old version of the tube map, maybe the old Docker version. I'd recommend using the current version by following the installation instructions, or, since your file is small, you can upload it to the online demo server, which we updated a couple of weeks ago.

If you want to use your file via upload you don't need to use the import scripts in the README. You can instead:

  1. Set "Data" to "custom"
  2. Click "Configure Tracks"
  3. Click "+"
  4. Change from "mounted" to "upload"
  5. Hit the "Browse" button and pick your .vg file
  6. Close the track configuration dialog by hitting the "X" button in the upper right corner
  7. Click in the "Region" box and type coordinates on one of your paths to look at, like Par_chr1_7788452-7790085:1-141.
  8. Click the "Go" button
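When choosing a region for step 7, it can help to know how long each path is. Assuming each path in the graph corresponds to one record of the input FASTA, a sketch like this lists the record names with their sequence lengths. The function name `fasta_lengths` is hypothetical.

```shell
# Print "name<TAB>length" for every record in a FASTA read from stdin.
# Sequence lines belonging to one record are summed, so wrapped FASTA works.
fasta_lengths() {
  awk '
    /^>/ {
      if (name != "") print name "\t" len   # flush the previous record
      name = substr($0, 2)                  # drop the leading ">"
      len = 0
      next
    }
    { len += length($0) }
    END { if (name != "") print name "\t" len }
  '
}

# Example usage:
#   fasta_lengths < sequences.fa
```

A region of the form name:1-length would then cover the whole path, in the same style as the example region in step 7.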

If you want to download the image, you can click "Download Image". But you might run into #448 and need to fix the resulting image in Inkscape.