Next time the pipeline needs updates, we should probably convert all steps that follow the .xml parsing to a Spark-based Hail pipeline.
Currently, the steps that generate the ClinVar x gnomAD tables take hours to run, so I skipped them for the latest release. Hail would be able to perform these joins much more efficiently.
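
As a rough illustration of what the join step might look like in Hail, here is a minimal sketch. It assumes both the parsed ClinVar variants and the gnomAD sites have already been written as Hail Tables keyed by (locus, alleles). The function name and all three paths are hypothetical and not part of the current pipeline.

```python
def annotate_clinvar_with_gnomad(clinvar_path: str, gnomad_path: str, out_path: str) -> None:
    """Left-join ClinVar variants to gnomAD by table key and write the result.

    Sketch only: assumes both inputs are Hail Tables keyed by (locus, alleles).
    """
    import hail as hl  # imported lazily so this module loads without Hail installed

    # Hypothetical inputs: Hail Tables sharing the same key fields.
    clinvar = hl.read_table(clinvar_path)
    gnomad = hl.read_table(gnomad_path)

    # Keyed lookup: every ClinVar row is kept, and rows with no gnomAD
    # match get a missing `gnomad` struct (left-join semantics).
    joined = clinvar.annotate(gnomad=gnomad[clinvar.key])

    joined.write(out_path, overwrite=True)
```

Because the lookup is keyed, Hail can distribute the join across Spark workers instead of processing the tables in a single process.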