Closed WWz33 closed 1 month ago
Hi, @WWz33
Thanks a lot for your feedback! :)
What do you mean by 'add the Phytozome API'?
Hi, Fabricio
With biomartr, it is possible to download genomic data from Ensembl and NCBI, but not from Phytozome.
Have you considered downloading the data directly from these sites and converting it to 'pdata'? This would simplify the frustrating data preparation process.
Also, I don't quite understand the logic in classify_gene_pairs(, scheme = "extended"). If I have 5 target species and one outgroup, how do I build blast_inter?
I would appreciate it if you could answer these questions!
Best!
Hi, @WWz33
That's a very good idea. I thought about it when writing the vignette for doubletrouble, but at that time {biomartr} didn't have the option to download data from Ensembl Genomes instances. Now this functionality seems to be stable, so I will see if this is possible. I have to check how long it takes to download the data in the vignette, because there's a time limit for vignettes to run.
Regarding the blast_inter parameter of classify_gene_pairs(), you would do exactly as documented here, but your data frame of comparisons (named comparisons in the vignette) would have multiple rows indicating which species are the query species and which are the outgroup species. For example, if you have species spA, spB, spC, and spD, which share the same outgroup spX, you would build your data frame with:
comparisons <- data.frame(
    species = c("spA", "spB", "spC", "spD"),
    outgroup = "spX"
)
Then, you can run the interspecies DIAMOND searches with:
diamond_inter <- run_diamond(
    seq = pdata$seq,
    compare = comparisons,
    outdir = file.path(tempdir(), "diamond_inter"),
    ... = "--sensitive"
)
You just have to make sure that species names in the comparisons data frame match species names in pdata.
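For the case asked about above (five target species and one outgroup), a minimal sketch of building the comparisons data frame and checking the names before running the searches could look like the following. It uses base R only. The species names, the mock pdata object, and the missing_species() helper are hypothetical and not part of any package; the sketch also assumes that pdata$seq is a list named by species, as in the vignette.

```r
# Hypothetical species names: five targets sharing one outgroup
targets <- c("spA", "spB", "spC", "spD", "spE")
outgroup <- "spX"

# One row per query species; the single outgroup value is recycled
comparisons <- data.frame(
    species = targets,
    outgroup = outgroup
)

# Mock stand-in for the processed data (assumption: `pdata$seq` is a list
# named by species); real data would hold the protein sequences
pdata <- list(
    seq = setNames(vector("list", 6), c(targets, outgroup))
)

# Hypothetical helper: return names used in `comparisons` that are
# absent from the names of `seq_list`
missing_species <- function(comparisons, seq_list) {
    used <- unique(c(
        as.character(comparisons$species),
        as.character(comparisons$outgroup)
    ))
    setdiff(used, names(seq_list))
}

# Stop early if any species in `comparisons` is not present in `pdata$seq`
missing <- missing_species(comparisons, pdata$seq)
if (length(missing) > 0) {
    stop("Species not found in pdata$seq: ", paste(missing, collapse = ", "))
}
```

With real data, the same check can be run on the actual pdata object before calling run_diamond(), so that a name mismatch is caught before the searches start.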
Does that answer your question?
Best, Fabricio
Hi, Fabricio
Thank you very much for your patient explanation. I think I've understood.
Best regards!
Hi, @WWz33
Based on your feedback, I just pushed a new version of {doubletrouble} with an expanded vignette that includes more info on the input data and how to obtain data using {biomartr} (see here).
I'll close this issue now.
All the best, Fabricio
Hi, Fabricio. I have learned a lot about comparative genomics by studying your packages and articles, thanks. Would you like to add the Phytozome API?
Best!