Closed corneliusroemer closed 1 year ago
Thanks for the feedback, @corneliusroemer.
@AngieHinrichs are we using usher-sampled-server for the web version now? That should significantly speed up the tool by preloading the MAT and placing multiple samples in parallel.
It's a pity that Usher is so slow even when it just needs to extract a subtree, for example when I upload two EPI_ISLs that should already be on the main tree.
Naively, I'd expect that all that's required is: a) finding where that sequence is on the tree (which shouldn't be too hard with a lookup in an index) and b) preparing some stats and Nextstrain output.
A fast version of Usher could work on a prebuilt tree only, without doing any placement, basically just querying. That way a lot of the steps could be prepared in advance. When a request comes in, all you have to do is a lookup and output a simple Nextstrain JSON.
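To make the lookup-only idea concrete, here is a rough Python sketch. Everything in it is hypothetical: the toy tree, the `PARENT` map, and the function names are made up for illustration, and this is not how Usher or its MAT format actually works. The output is only a nested name/children dict, not a full Nextstrain JSON.

```python
import json
from collections import defaultdict

# Hypothetical toy tree stored as child -> parent; the root has parent None.
PARENT = {
    "root": None,
    "n1": "root",
    "n2": "root",
    "EPI_ISL_1": "n1",
    "EPI_ISL_2": "n1",
    "EPI_ISL_3": "n2",
    "EPI_ISL_4": "n2",
}


def build_children(parent):
    """Invert the parent map once, ahead of any request."""
    children = defaultdict(list)
    for node, par in parent.items():
        if par is not None:
            children[par].append(node)
    return children


CHILDREN = build_children(PARENT)  # precomputed index, reused by every query


def leaves_under(node, children):
    """Return all leaf names below node (or node itself if it is a leaf)."""
    kids = children.get(node)
    if not kids:
        return [node]
    out = []
    for kid in kids:
        out.extend(leaves_under(kid, children))
    return out


def smallest_clade(names, parent, children, min_leaves):
    """Walk up from the first sample until the clade contains every
    requested sample and at least min_leaves leaves (or the root is hit)."""
    for name in names:
        if name not in parent or children.get(name):
            raise KeyError(f"{name} is not a sample on the prebuilt tree")
    node = parent[names[0]] or names[0]
    while True:
        leaves = leaves_under(node, children)
        enough = len(leaves) >= min_leaves or parent[node] is None
        if set(names) <= set(leaves) and enough:
            return node
        node = parent[node]


def to_nested(node, children):
    """Serialize a clade as nested name/children dicts."""
    entry = {"name": node}
    kids = children.get(node)
    if kids:
        entry["children"] = [to_nested(kid, children) for kid in kids]
    return entry


def query(names, min_leaves=2):
    """Pure lookup: no placement, just find the clade and serialize it."""
    clade = smallest_clade(names, PARENT, CHILDREN, min_leaves)
    return to_nested(clade, CHILDREN)


if __name__ == "__main__":
    print(json.dumps(query(["EPI_ISL_1", "EPI_ISL_2"]), indent=2))
```

The point of the sketch is only that, once the tree and its index are built, answering a query for samples already on the tree is a dictionary lookup plus a short walk up the ancestors. A real tree with millions of samples would need cached leaf counts rather than recomputing `leaves_under` at every step.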
That would make Usher so much more useful. Maybe I'm just particularly impatient compared to average users, but waiting 5 minutes to see where a single pre-existing sequence in the tree lies hurts 😬
Getting from here: [screenshot]
to here should take seconds, not minutes: [screenshot]
Since both fasta sequence placement and name lookup have been sped up, can we close this now? 😄
Yup. This is resolved.
Usher is such an amazing tool. Unfortunately, the run times of analyses on the phyloplace website are quite long, so long that I don't use Usher as much as I should.
Placing, say, 50 sequences with 1000 context samples takes ~5 min or more.
Does the number of context samples make it run much longer? Are there ways I can get it to be faster?
I'm curious what the bottleneck for the analyses is. I should probably try running Usher locally to see whether that is faster and feasible for my use case.
If tree size is a problem: I'm only really interested in BA.2/4/5/.75 right now. Have you considered making a version that ignores older lineages that are no longer relevant?
I do have the feeling that runtimes used to be quite a bit faster in the past.