opentargets / orchestration

Open Targets pipeline orchestration layer
Apache License 2.0

fix(variant_to_vcf): increase batch job size #46

Closed project-defiant closed 1 month ago

project-defiant commented 1 month ago

The run at gs://ot_orchestration/releases/24.10_freeze2/ produced a VariantIndex dataset with ~2 million variants, which was around half as many as expected - see variant to vcf batch job.


The credible_set input failed the variant_to_vcf step because the Spark session threw an Out Of Memory exception - see OOM Error

Fix for the VM heap size that caused the incomplete variant_index - variant_to_vcf was failing because it ran out of heap space.

After increasing the driver memory, the run was successful - variant_to_vcf
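
For illustration only, here is a minimal sketch of how a driver heap size could be derived from the batch machine's memory and passed to Spark. `spark.driver.memory` and the `--conf` flag of `spark-submit` are standard Spark settings. The helper names, the 64 GiB machine size, and the 20% headroom are assumptions made for this example and are not taken from this PR.

```python
from typing import Dict, List


def driver_memory_conf(machine_memory_gib: int, headroom_fraction: float = 0.2) -> Dict[str, str]:
    """Build Spark properties that size the driver heap for a single-machine job.

    Hypothetical helper: leaves `headroom_fraction` of the machine memory
    for the OS and non-heap usage, and gives the rest to the driver.
    """
    if machine_memory_gib <= 0:
        raise ValueError("machine_memory_gib must be positive")
    if not 0 <= headroom_fraction < 1:
        raise ValueError("headroom_fraction must be in [0, 1)")
    # Round down to whole GiB, but never go below 1 GiB.
    heap_gib = max(1, int(machine_memory_gib * (1 - headroom_fraction)))
    return {"spark.driver.memory": f"{heap_gib}g"}


def spark_submit_args(conf: Dict[str, str]) -> List[str]:
    """Turn a property dict into `--conf key=value` arguments for spark-submit."""
    args: List[str] = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Assumed machine size; the actual batch job size is not stated here.
    conf = driver_memory_conf(machine_memory_gib=64)
    print(spark_submit_args(conf))
```

The driver heap is fixed when the JVM starts, so a setting like this has to be supplied at launch (through `spark-submit` or before the first session is created) rather than changed on a running session.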

Things implemented: