Open dotinspace opened 7 months ago
Hi @dotinspace,
This error looks like a data-dependent edge case related to the AlleleCount and VariantStats stats having 0 records.
To work around the issue, please create the dataset with AlleleCount and VariantStats disabled:
tiledbvcf create --disable-allele-count --disable-variant-stats ...
If you can share the VCF file ingested, it would help us debug the issue (I know that is not always possible). Otherwise, we will try to reproduce the condition that causes this error.
Hi, thanks for the swift response.
The multisample VCF, ALL.chr1.phase3_shapeit2_mvncall_integrated_v5b.20130502.genotypes.vcf.gz, was taken from 1000 Genomes. It was then split with vcf-split, block-compressed (bgzip), and indexed (tabix) before being ingested into a TileDB-VCF dataset. I wouldn't be surprised if the VCF files, or the process of splitting them, caused some issue with those two stats arrays. Unfortunately, our current testing makes use of variant_stats.
Anyway, nice to know what is going on for future reference.
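Since the suspected trigger involves stats with 0 records, one quick sanity check on the split files is to count the data records in each per-sample VCF. The sketch below is only a first check under stated assumptions: the function names `count_vcf_records` and `report_empty_vcfs` are made up for illustration, the one-`*.vcf.gz`-per-sample directory layout is assumed, and it has not been confirmed in this thread that a VCF with no records is what causes the error.

```shell
#!/bin/sh
# Hypothetical helpers, not part of TileDB-VCF or any other tool.

# Count data records (lines not starting with '#') in a compressed VCF.
# bgzip output is valid multi-member gzip, so `gzip -dc` can read it.
count_vcf_records() {
    # grep exits non-zero when the count is 0; `|| true` keeps that
    # from being treated as a failure while still printing "0".
    gzip -dc "$1" | grep -vc '^#' || true
}

# Print the path of every *.vcf.gz in a directory that has zero records.
# Assumption: the split step wrote one compressed VCF per sample there.
report_empty_vcfs() {
    for f in "$1"/*.vcf.gz; do
        [ -e "$f" ] || continue   # directory had no matching files
        n=$(count_vcf_records "$f")
        if [ "$n" -eq 0 ]; then
            echo "$f"
        fi
    done
}
```

Running `report_empty_vcfs` on the directory holding the split files lists any header-only VCFs. A file that does contain records could still contribute nothing to the stats, so an empty result does not rule out the edge case.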
Hi! I am trying to ingest samples, but run into the following problem:
As far as I can tell, one of the following steps fails. Is there something I can try tweaking to make this work, or does anything else stand out as the obvious culprit here?