viralemergence / virion

The Global Virome in One Network
https://viralemergence.github.io/virion

would vdict() and hdict() be faster if you passed names directly to classification()? #19

Closed by cjcarlson 3 years ago

cjcarlson commented 3 years ago
> library(taxize)
> library(microbenchmark)
> m1 <- microbenchmark(classification("Thamnophis", db = "ncbi"))
> m2 <- microbenchmark(classification(get_uid("Thamnophis"), db = "ncbi"))
> m1
Unit: milliseconds
                                      expr      min       lq     mean   median       uq      max neval
 classification("Thamnophis", db = "ncbi") 358.7079 414.3232 662.1527 467.1497 565.5245 7656.445   100
> m2
Unit: milliseconds
                                               expr      min       lq     mean   median       uq      max neval
 classification(get_uid("Thamnophis"), db = "ncbi") 356.7234 402.9739 593.5077 453.5109 599.0573 2089.294   100

Doesn't seem like it from this: the medians are basically the same (about 467 ms by name vs. 454 ms via get_uid()). BUT maybe it would be if you're passing the huge lists?
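A rough sketch of how the "huge lists" case could be checked, assuming taxize and microbenchmark are installed and NCBI is reachable. The `hosts` vector is a hypothetical stand-in for the real name lists, not something from the pipeline. As far as I understand, classification() resolves names to UIDs internally when given a name, so the sketch resolves the UIDs once up front and times the by-UID call without the lookup step:

```r
library(taxize)
library(microbenchmark)

# Hypothetical stand-in for the real host list; swap in the actual names.
hosts <- c("Thamnophis", "Myotis", "Rattus")

# Resolve UIDs once, outside the timed code.
# rows = 1 takes the first match so ambiguous names do not trigger a prompt.
uids <- get_uid(hosts, rows = 1)

bench <- microbenchmark(
  # Names in: the name-to-UID lookup happens inside classification().
  by_name = classification(hosts, db = "ncbi", rows = 1),
  # UIDs in: only the classification request itself is timed.
  by_uid  = classification(uids, db = "ncbi"),
  # Few repetitions, to stay under NCBI rate limits without an API key.
  times = 5
)
print(bench)
```

If `by_uid` comes out clearly faster here, the difference is probably the UID lookup itself, which would only pay off when the UIDs are cached and reused across calls.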

cjcarlson commented 3 years ago

I'm happy with this as part of a working pipeline. In theory, I think the point is that it'll be faster to switch to jncbi long term anyway.