Closed charlesfoster closed 1 year ago
Hi Charles-
I'm not the right one to speak to the details of RIPPLES or the implications of your downsampling procedure on its results, but I can say that there are probably simpler options than four matUtils calls. If I understand correctly, you're interested in examining all the clades descended from your putative recombinant? You may want to try `matUtils extract -C`, which will yield the set of all clades that are in your subtree (along with the mutations defining them), to avoid calling `summary`.
Alternatively, if you're willing to do some scripting in Python, you can avoid repeated matUtils calls altogether by using BTE, our Python API. It would be straightforward to write a script that finds the matching node for each entry in your recombinant output table and computes the downstream clade membership, while only requiring loading the tree once overall.
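To illustrate the shape of such a script, here is a minimal sketch in plain Python. It deliberately does not use BTE calls: the toy tree, the node IDs, the clade labels, and the function names are all hypothetical stand-ins, and with BTE you would load the real tree from the protobuf once in place of the hard-coded dictionaries.

```python
from collections import defaultdict

# Hypothetical toy tree as child -> parent; None marks the root.
PARENT = {
    "node_1": None,
    "node_2": "node_1",
    "node_3": "node_1",
    "node_4": "node_3",
    "sampleA": "node_2",
    "sampleB": "node_4",
    "sampleC": "node_4",
    "sampleD": "node_3",
}

# Hypothetical clade annotations attached to internal nodes.
CLADE = {"node_2": "B.1", "node_3": "XA", "node_4": "XA.1"}


def build_children(parent):
    """Invert the child -> parent map into parent -> list of children."""
    children = defaultdict(list)
    for child, par in parent.items():
        if par is not None:
            children[par].append(child)
    return children


def descendants_summary(node, children, clade):
    """Return (sorted leaves, sorted clade labels) found at or below node."""
    leaves, clades = [], set()
    stack = [node]
    while stack:
        cur = stack.pop()
        if cur in clade:
            clades.add(clade[cur])
        kids = children.get(cur, [])
        if not kids:
            leaves.append(cur)  # no children, so this is a sample
        stack.extend(kids)
    return sorted(leaves), sorted(clades)


def summarize_recombinants(recomb_nodes, parent, clade):
    """Summarize every putative recombinant node against one in-memory tree."""
    children = build_children(parent)  # built once, reused for every node
    return {n: descendants_summary(n, children, clade) for n in recomb_nodes}


if __name__ == "__main__":
    # In practice the node IDs would come from the recombinant output table.
    for node, (leaves, clades) in summarize_recombinants(
        ["node_3"], PARENT, CLADE
    ).items():
        print(node, leaves, clades)
```

The point of the design is that the tree structure is built a single time and every recombinant node is then resolved against it, rather than paying the tree-loading cost once per query as repeated command-line calls would.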
In addition to what Jakob mentioned, I would suggest you use `ripples-fast`, which produces the same results as `ripples` but is a lot faster. Also, are you aware of RIVET (http://rivet.ucsd.edu/)? It can be used to automate your pipeline and also visualize your recombinants, following the instructions provided at https://github.com/TurakhiaLab/rivet.
Thank you both for your replies. I'll check out `matUtils extract -C`, `ripples-fast`, and `rivet`. BTE also sounds useful, thanks.
Hi,
I've been trying out `ripples` for exploratory tasks. Basically, I was hoping that if I come across strange samples I suspect of being recombinant, I could place them on the global tree using `usher`, downsample the tree to a representative set of samples per clade using `matUtils` (to save on computational time, while also keeping the new sequences), then run `ripples`. I would then parse the results to find clades with solid evidence of being recombinant and see if the new samples fall within those clades. Finally, I would check the lineages of the putative donor and acceptor clades to try and narrow down the donor/acceptor lineages.
Does this seem reasonable, or am I misusing/misunderstanding the purpose of these tools? Is there a smarter way to extract the information I want at the end instead of 2x `extract` and 2x `summary` commands?
Apologies for these basic questions!
Thanks, Charles