thackl / gggenomes

A grammar of graphics for comparative genomics
https://thackl.github.io/gggenomes/

Generate blast links with NCBI blast, easy to use row.names() #188

Open yzhong005 opened 3 months ago

yzhong005 commented 3 months ago

This is for newcomers who are not familiar with the shell, or with other BLAST tools and packages. If you want to generate pairwise BLAST links with the online NCBI BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi), here are some tips and code that can make your life easier. After running a pairwise BLAST on the NCBI site, download the hit table as .csv and read it into R. Then rename the columns using blast_name<-c("seq_id","seq_id2","identity","length","mismatches","gaps","start","end","start2","end2","evalue","bitscores"). Double-check that seq_id and seq_id2 are correct and match the ids in your seqs data. After that, you can use the table to draw your links.
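The steps described above could look roughly like the following in base R. This is only a sketch: the two-row hit table written to a temporary file is made-up stand-in data (in practice you would point `read.csv()` at your downloaded file), it assumes the downloaded hit table has no header row and exactly twelve columns, and the `seq_ids` vector is a hypothetical list of the ids used in your seqs data.

```r
# Made-up stand-in for the hit table downloaded from NCBI BLAST
# (assumption: no header row, twelve comma-separated columns)
hit_csv <- tempfile(fileext = ".csv")
writeLines(c(
  "BVI_023A,Cflag_131,95.5,1200,50,4,100,1299,200,1399,0.0,1900",
  "BVI_023A,Cflag_131,88.0,800,90,6,2000,2799,2500,3299,1e-150,950"
), hit_csv)

# Column names from the post above
blast_name <- c("seq_id", "seq_id2", "identity", "length", "mismatches", "gaps",
                "start", "end", "start2", "end2", "evalue", "bitscores")

# Read the table and assign the column names in one step
links <- read.csv(hit_csv, header = FALSE, col.names = blast_name,
                  stringsAsFactors = FALSE)

# Double-check that every id in the hit table also occurs in the seqs data
seq_ids <- c("BVI_023A", "Cflag_131")  # hypothetical ids from your seqs table
stopifnot(all(links$seq_id %in% seq_ids),
          all(links$seq_id2 %in% seq_ids))
```

If the `stopifnot()` check fails, the ids in the hit table (for example accession numbers with version suffixes) probably differ from the ids in your seqs data and need to be harmonized before plotting.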

thackl commented 3 months ago

Thanks! As an alternative, you can also use the gggenomes default column names for blast tables, def_names("blast"), when reading the file. And if you need to swap your seq_id and seq_id2, you can use swap_query().

library(tidyverse)
library(gggenomes)

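# sequence table: ids must match those in the hit table;
# both sequences are given a length of 5000 here
s0 <- tibble(
  seq_id = c("BVI_023A", "Cflag_131"),
  length = 5000)

# hit table downloaded from NCBI BLAST, read with the default blast column names
l0 <- read_csv("~/Downloads/73595JMU114-Alignment-HitTable.csv", col_names=def_names("blast"))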

gggenomes(seqs=s0, links=l0) +
  geom_seq() + geom_link()
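To make concrete what swapping seq_id and seq_id2 means for a link table, here is a small base R illustration. It is not the gggenomes implementation of swap_query(): the helper `swap_pairs()` and the one-row example table are hypothetical, and the sketch only assumes that the query columns (seq_id, start, end) and the subject columns (seq_id2, start2, end2) trade places.

```r
# Hypothetical one-row link table with the paired query/subject columns
links <- data.frame(
  seq_id = "BVI_023A", seq_id2 = "Cflag_131",
  start = 100, end = 1299, start2 = 200, end2 = 1399,
  stringsAsFactors = FALSE
)

# Hypothetical helper: exchange each query column with its subject partner
swap_pairs <- function(links) {
  swapped <- links
  swapped$seq_id  <- links$seq_id2
  swapped$seq_id2 <- links$seq_id
  swapped$start   <- links$start2
  swapped$start2  <- links$start
  swapped$end     <- links$end2
  swapped$end2    <- links$end
  swapped
}

flipped <- swap_pairs(links)
```

After the swap, the row describes the same alignment with "Cflag_131" as seq_id and "BVI_023A" as seq_id2, and the coordinates follow their sequences.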