Closed LemoAlex closed 10 months ago
Hi Alexandre,
you are on the right track. Two ways to do this.
1) You can add columns start
and end
to your all_seqs
.
all_seqs <- mutate(start=c(50000,500000,1000000, ...), end=c(110000,560000,1060000, ...)
2) You can provide a table with locus coordinates for all your sequences (contigs) to focus.
You can use focus for this like this:
loci <- tribble(
~seq_id, ~start, ~end,
"genome1", 50000, 110000, # probably more realistic would be genomeA_contig1 or so
"genome2", 500000, 560000,
# ...
)
gggenomes(...) |> focus(.loci=loci)
Hope that helps
Thank you for the answer!
I did try the first method, because the second one, with many genes in the region, might be too tedious.
So:
all_seqs <- mutate(start=c(50000,500000,1000000, ...), end=c(110000,560000,1060000, ...)
But I get the error message
Error in UseMethod("mutate") :
no applicable method for 'mutate' applied to an object of class "c('double', 'numeric')"
When I look at my all_seqs file, it looks like this:
file_id seq_id seq_desc length
Hello again,
Sorry for the previous post, I fixed the issue by simply changing the line to:
all_seqs %>% mutate(start=c(50000,500000,1000000, ...), end=c(110000,560000,1060000, ...)
and it worked as I now have two extra columns in the all_seqs object.
However, how can I tell gggenomes to now only plot those regions rather than the entire genome ?
Cheers,
Alexandre
You need to give all_seqs
with added start,end
as seqs to gggenomes.
all_seqs_<-read_seqs('6genomes.fa',.id='file_id')
all_genomes_genes <- read_gff("genomes.gff")
plot1 <- gggenomes(seq=A118_seqs, A118_genes)
plot1
all_loci <- all_seqs %>% mutate(start=c(50000,500000,1000000, ...), end=c(110000,560000,1060000, ...)
plot2 <- gggenomes(seq=all_loci, all_genes)
plot2
Dear gggenomes users,
I have 6 bacterial genomes of ~3MB each, and I am interested in plotting genes which are mapped into different regions of each genome. I have my fasta file containing the 6 genomes and the gff files containg the 6 annotations. Up until here, it seems to be working :
This works. However, as my gff files (obtained with prodigal) contain too many CDS, the plot is unreadable.
Anyhow, I would like to map different equivalent regions of ~60 KB , as, for instance:
genome 1 : 50000 to 110000 position genome2 : 500000 to 560000 position genome3 : 1000000 to 1060000 position etc...
These correspond to conserved regions which normally should show similar gene content.
How can this be done with the package ? I think it might be with the focus () option, but I am unsure about the procedure here.
Thank you for your help and for this very useful package,
Alexandre