googlegenomics / gcp-variant-transforms

GCP Variant Transforms
Apache License 2.0

Sharding config with "Partition Range Start" #641

Open samanvp opened 4 years ago

samanvp commented 4 years ago

Currently our sharding config file only contains "Partition Range End". This means we always partition BigQuery tables using 0 as the start point. This is not an issue for our current default sharding config, where the variants of each chromosome are assigned to a separate table. However, if the config file includes multiple shards per chromosome (such as this one), we can improve the performance of partitioning by including the start position.
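
To illustrate the potential gain, here is a minimal sketch of how the partition width changes when a start position is available. The function name `partition_interval` and the cap of 4000 partitions per table are assumptions made for this example; they are not taken from the Variant Transforms code or from the config format.

```python
def partition_interval(range_start, range_end, max_partitions):
    """Return the smallest integer interval width such that
    [range_start, range_end) is covered by at most max_partitions partitions.

    Hypothetical helper, for illustration only.
    """
    if range_end <= range_start:
        raise ValueError("range_end must be greater than range_start")
    if max_partitions <= 0:
        raise ValueError("max_partitions must be positive")
    span = range_end - range_start
    # Integer ceiling division, avoids floating point rounding.
    return -(-span // max_partitions)


# Assumed example: a shard holding positions 100,000,000 to 200,000,000
# of one chromosome, with an assumed cap of 4000 partitions.
MAX_PARTITIONS = 4000

# Only "Partition Range End" is known, so the range starts at 0.
interval_without_start = partition_interval(0, 200_000_000, MAX_PARTITIONS)

# "Partition Range Start" is also known.
interval_with_start = partition_interval(100_000_000, 200_000_000, MAX_PARTITIONS)

print(interval_without_start, interval_with_start)
```

In this example, starting at 0 leaves the partitions covering positions below 100,000,000 empty, so only half of them hold data. With the start position each partition covers half as many positions, which should let position-filtered queries prune more data.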