populationgenomics / saige-tenk10k

Hail batch pipeline to run SAIGE on CPG's GCP
MIT License
0 stars 0 forks source link

saige_assoc.py review #74

Open katiedelange opened 6 months ago

katiedelange commented 6 months ago

Some points to consider in saige_assoc.py

annacuomo commented 6 months ago

Not all of the SAIGE parameters are self-explanatory, and I can't find explanations for everything in the docs yet. Could you add some details to the README for anything that we might expect to have to vary (e.g. what is a reasonable value range for tol and how might I go about selecting that, or am I never going to want to change it?)

This is an excellent point, I'll add some of these in the SAIGE-QTL docs but then also in the readme here (PR: https://github.com/populationgenomics/saige-tenk10k/pull/75)!

Also as a related aside I just realised that most of these are listed in the code but not actually customisable! I added a couple for debugging purposes recently (https://github.com/populationgenomics/saige-tenk10k/blob/real-association/saige_assoc.py#L355-L356), but I think this is not ideal.

I think lint might complain if I add seven million arguments / paraments with click, so maybe I should add some sort of config file instead? I've added this to my TO DO list.

annacuomo commented 6 months ago

I assume this subsetting needs to be removed eventually

@katiedelange yup this is just for testing :)

annacuomo commented 4 hours ago

@katiedelange back to checking these issues you left a while back! I think this one can be closed now as your main points have been addressed?

let me know if you have any other comments