We want to run the harmonisation and SuSiE finemapping batch job within the orchestration for the ukb_ppp_eur data.
This PR summarizes the work on processing the ukb_ppp_eur data, in two parts:
ukb_ppp_eur_ingestion (harmonisation + locus breaker), run via Dataproc, which produces the study_locus dataset partitioned by studyLocusId.
ukb_ppp_eur_finemapping (finemapping manifest generation + SuSiE finemapping), run via Google Batch, which produces the credible_set dataset partitioned by studyLocusId.
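To illustrate the manifest generation idea, here is a minimal sketch, assuming the manifest simply pairs each studyLocusId partition of study_locus with the matching output location under credible_set. The function names, column names, and bucket paths are hypothetical and are not taken from this PR; the actual manifest format may differ.

```python
import csv
import io

# Hypothetical column names, chosen only for this illustration.
MANIFEST_COLUMNS = ["study_locus_input", "study_locus_output"]


def build_manifest(study_locus_root, credible_set_root, study_locus_ids):
    """Return one row per locus: the partition to read and the partition to write."""
    rows = []
    for locus_id in sorted(study_locus_ids):
        # Hive-style partition directory, matching "partitioned by studyLocusId".
        partition = f"studyLocusId={locus_id}"
        rows.append(
            {
                "study_locus_input": f"{study_locus_root}/{partition}",
                "study_locus_output": f"{credible_set_root}/{partition}",
            }
        )
    return rows


def manifest_to_csv(rows):
    """Serialise manifest rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=MANIFEST_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder bucket and locus ids; replace with real values.
    manifest = build_manifest(
        "gs://example-bucket/study_locus",
        "gs://example-bucket/credible_set",
        ["locus_b", "locus_a"],
    )
    print(manifest_to_csv(manifest))
```

Under this assumption each manifest row can be handed to one batch task, so a single locus is fine-mapped independently of the others.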
Before the harmonisation can run, some steps need to be executed first. I have described these steps in the docs, along with the data structure.
Context
The overall batch job hit some limits over the