googlegenomics / variant-annotation

Use cloud technology to annotate human sequence variants in parallel.
Apache License 2.0

Use gcsfuse to deliver VEP cache #6

Open Jessime opened 6 years ago

Jessime commented 6 years ago

We currently download the entire VEP cache of interest onto every VM instance, which takes a significant amount of time. For smaller test sets, downloading and unzipping the cache likely accounts for most of the "annotation" time. gcsfuse might be able to alleviate this problem.

On the other hand, reading the cache through gcsfuse may add some latency per variant, which could accumulate significantly on larger data sets. Performance testing would be needed to find out.
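
To frame what that performance testing would need to measure, here is a rough sketch of the trade-off as a break-even calculation. It assumes a simple linear model (one-time setup cost plus a constant cost per variant), and every number in it is a made-up placeholder, not a measurement. The function names are hypothetical and not part of this repo.

```python
import math


def total_time(n_variants: int, setup_seconds: float, seconds_per_variant: float) -> float:
    """Total wall time under a linear model: one-time setup plus per-variant cost."""
    return setup_seconds + n_variants * seconds_per_variant


def break_even_variants(setup_saved_seconds: float, extra_seconds_per_variant: float) -> float:
    """Variant count at which the added per-variant latency cancels the setup time saved.

    Below this count the mounted cache should be faster; above it, the local copy wins.
    Returns infinity if mounting adds no per-variant latency.
    """
    if extra_seconds_per_variant <= 0:
        return math.inf
    return setup_saved_seconds / extra_seconds_per_variant


# Placeholder inputs (assumptions, to be replaced with measured values):
download_and_unzip_s = 600.0   # time to download + unzip the cache on one VM
local_per_variant_s = 0.010    # per-variant annotation time with a local cache
mounted_per_variant_s = 0.015  # per-variant annotation time through a mounted cache

n_break_even = break_even_variants(
    download_and_unzip_s, mounted_per_variant_s - local_per_variant_s
)
print(f"break-even at roughly {n_break_even:.0f} variants per VM")
```

If the measured break-even point is well above the number of variants a single VM typically handles, mounting is probably worth it; if it is below, the per-variant latency dominates and the local copy is the better choice. A real mount would likely also have a small setup cost of its own, which this sketch ignores.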