Open andreaswallberg opened 21 hours ago
Hi Andreas, our Academy has some content about using annotations in this manner: https://cloud.tiledb.com/academy/structure/life-sciences/population-genomics/tutorials/advanced/annotations/
We have some solutions for annotating TileDB-VCF datasets in TileDB Cloud in an N+1 manner, so that only newly encountered variants are annotated. These produce external arrays.
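To make the N+1 idea and the annotation-based export concrete, here is a minimal sketch in plain pandas. It is not the TileDB Cloud implementation and does not use the TileDB-VCF API. It assumes the variants and the external annotations have already been read into DataFrames, and the column names (`contig`, `pos`, `ref`, `alt`, `gene`, `consequence`) and both helper functions are hypothetical.

```python
import pandas as pd

# Hypothetical key that identifies a variant in both tables.
KEY = ["contig", "pos", "ref", "alt"]


def unannotated_variants(variants: pd.DataFrame, annotations: pd.DataFrame) -> pd.DataFrame:
    """Return the distinct variants that have no row in the annotation table yet."""
    merged = variants[KEY].drop_duplicates().merge(
        annotations[KEY].drop_duplicates(), on=KEY, how="left", indicator=True
    )
    return merged.loc[merged["_merge"] == "left_only", KEY].reset_index(drop=True)


def select_by_annotation(
    variants: pd.DataFrame, annotations: pd.DataFrame, gene: str, consequence: str
) -> pd.DataFrame:
    """Return the variant records whose annotation matches the gene and consequence."""
    hits = annotations[
        (annotations["gene"] == gene) & (annotations["consequence"] == consequence)
    ]
    return variants.merge(hits[KEY].drop_duplicates(), on=KEY, how="inner")


# Toy data standing in for records exported from the dataset (assumed layout).
variants = pd.DataFrame(
    {
        "contig": ["chr1", "chr1", "chr1", "chr2"],
        "pos": [100, 100, 200, 50],
        "ref": ["A", "A", "C", "G"],
        "alt": ["G", "G", "T", "A"],
        "sample": ["s1", "s2", "s1", "s2"],
    }
)

# Toy external annotation table, keyed by variant rather than by sample.
annotations = pd.DataFrame(
    {
        "contig": ["chr1", "chr1"],
        "pos": [100, 200],
        "ref": ["A", "C"],
        "alt": ["G", "T"],
        "gene": ["geneX", "geneX"],
        "consequence": ["missense_variant", "synonymous_variant"],
    }
)

to_annotate = unannotated_variants(variants, annotations)
missense_x = select_by_annotation(variants, annotations, "geneX", "missense_variant")
```

Because the annotation table is keyed by variant and kept separate from the sample data, it can in principle be replaced or extended (for example, moving from annotations v1.0 to v2.0) without re-ingesting any samples.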
Dear developers,
I wonder if it is possible to export SNPs for loci or regions based on annotations derived from the original VCF.
For example, exporting all SNPs with missense variants for gene X.
If not, I would like to request that feature, which could be very useful in day-to-day work.
A connected issue is: how does SNP data actually relate to annotations?
Let's say I ingest a VCF that is annotated with gene annotations v1.0 in a first batch. At a later stage, I ingest a VCF containing new samples but with refined annotations, i.e. v2.0, in a second batch. What happens to the annotations at this point? How do they relate to each batch, and how can I interact with them?
Can I re-annotate SNPs that are shared between the two batches? Moreover, can I update annotations without ingesting new samples?