thackl / gggenomes

A grammar of graphics for comparative genomics
https://thackl.github.io/gggenomes/

geom_link() is not working properly #195

Open tahsinkhann opened 1 week ago

tahsinkhann commented 1 week ago

Hi, I am trying to create synteny between phages.

upload files

s0 <- read_seqs("combined_contigs.fasta")
g0 <- read_feats("combined.gff")
colnames(g0)[18] <- "func"

gggenomes(genes=g0, seqs=s0) +
  geom_seq() +
  geom_gene() +
  geom_bin_label() +
  geom_gene(aes(fill=func)) +
  scale_fill_brewer("Function", palette="Set3")

plot

I ran an all-vs-all nucleotide BLAST and loaded the result into R:

l0 <- read.csv("homologs.txt", sep="\t", col.names = def_names("blast"))

drawing links

gggenomes(seqs=s0, genes=g0, links=l0) +
  geom_seq() +
  geom_gene() +
  geom_seq_label() +
  geom_link()

Now receiving the following error

Error in geom_link():
! Problem while computing aesthetics.
ℹ Error occurred in the 5th layer.
Caused by error in .data$x:
! Column x not found in .data.
Run rlang::last_trace() to see where the error occurred.
Warning message:
In layout_links(x, seqs, ...) :
  No links found between adjacent genomes in provided order of genomes, consider reordering genomes

I also can't use add_links() or add_sublinks(); I receive the following error:

Error in UseMethod("add_sublinks") : no applicable method for 'add_sublinks' applied to an object of class "data.frame"
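For readers unfamiliar with this kind of message: it is R's generic S3 dispatch failure, raised when a generic function receives an object for whose class no method is defined. The sketch below reproduces the same failure in base R with a made-up generic, `add_thing()`, and a made-up class, `"plotlike"`; both are hypothetical and are not part of gggenomes. It rests on the assumption that add_sublinks() expects something other than a plain data.frame as its first argument.

```r
# Hypothetical S3 generic, standing in for a function such as add_sublinks().
add_thing <- function(x, ...) UseMethod("add_thing")

# A method exists only for the made-up class "plotlike".
add_thing.plotlike <- function(x, ...) {
  x$things <- list(...)
  x
}

# A plain data.frame has no matching method, so dispatch fails.
links <- data.frame(seq_id = "a", seq_id2 = "b")
msg <- tryCatch(
  add_thing(links),
  error = function(e) conditionMessage(e)
)
print(msg)

# An object of the expected class dispatches without error.
p <- structure(list(), class = "plotlike")
p2 <- add_thing(p, links)
```

If that assumption holds, the data.frame of links would need to be passed as a later argument, with the plot object first.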

I have 10 other phages whose synteny I need to compare, so a solution would save my life (I mean it). Thanks.

data.zip

Rikkiff commented 6 days ago

Hi. It seems to me that the seq id (e.g. VCI28MH2-S_0078) in your alignment is in a different format from the seq id in your fasta file (e.g. VCI28MH2-S) and gff file (e.g. VCI28MH2-S). I guess the seq id needs to match in order to infer any links.
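A quick way to check for such a mismatch is to compare the IDs in the links table against the sequence IDs. The following is a minimal sketch in base R on toy data: the column names `seq_id` and `seq_id2` are an assumption about what `def_names("blast")` assigns, `phageB` is an invented second genome, and `strip_gene_suffix()` is a hypothetical helper that assumes gene IDs are the sequence ID plus a trailing `_<digits>`.

```r
# Toy stand-ins for the real tables (values are illustrative only).
seqs <- data.frame(seq_id = c("VCI28MH2-S", "phageB"))
links <- data.frame(
  seq_id  = c("VCI28MH2-S_0078", "VCI28MH2-S_0012"),
  seq_id2 = c("phageB_0003", "phageB_0040")
)

# IDs used in the links table that do not occur among the sequence IDs.
link_ids <- unique(c(links$seq_id, links$seq_id2))
unmatched <- setdiff(link_ids, seqs$seq_id)
print(unmatched)

# Hypothetical fix: drop a trailing "_<digits>" gene suffix so that
# the link IDs match the sequence IDs (an assumption about the ID scheme).
strip_gene_suffix <- function(id) sub("_[0-9]+$", "", id)
links$seq_id  <- strip_gene_suffix(links$seq_id)
links$seq_id2 <- strip_gene_suffix(links$seq_id2)

# Re-check: an empty result means every link ID now matches a sequence.
still_unmatched <- setdiff(unique(c(links$seq_id, links$seq_id2)), seqs$seq_id)
print(still_unmatched)
```

Note that renaming IDs alone may not be enough: if the BLAST was run gene against gene, the hit coordinates are relative to each gene rather than to the genome, so the positions would still need to be handled separately.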

tahsinkhann commented 5 days ago

Dear Colleagues,

Thanks for your response. I somehow managed to get it working by following the easyfig/mauve tutorial (https://github.com/thackl/gggenomes/issues/82). gggenomes is an excellent and powerful tool for comparing genomes. Unfortunately, I am a novice in R, so I would like to request and recommend enriching the documentation and tutorials.

But I really appreciate your efforts in solving queries case by case from naive people like me.

Thanks.

-- Tahsin Khan
MPhil Student
University of Cambridge & Wellcome Sanger Institute
Cambridge, United Kingdom

Rikkiff commented 3 days ago

Great that you made it work. But can you confirm whether or not the issue was the seq ids of your alignment?