Open tristanpwdennis opened 3 months ago
Hey Tristan. Nice work!
I'll save comments for now, but FYI - when you add notebooks to malariagen_data, make sure you have cleared all outputs. Otherwise the files can get quite hefty and the repo balloons over time, since every version is stored in git history.
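In case it's useful, here is a minimal sketch of clearing outputs with only the Python standard library. It assumes the notebook is a standard nbformat v4 JSON file; the helper names clear_outputs and clear_notebook_file are hypothetical and not part of malariagen_data.

```python
import json
from pathlib import Path


def clear_outputs(notebook: dict) -> dict:
    """Strip outputs and execution counts from every code cell, in place."""
    for cell in notebook.get("cells", []):
        # Only code cells carry outputs; markdown and raw cells are left alone.
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def clear_notebook_file(path: str) -> None:
    """Rewrite the notebook at `path` with its outputs cleared."""
    nb_path = Path(path)
    notebook = json.loads(nb_path.read_text(encoding="utf-8"))
    clear_outputs(notebook)
    nb_path.write_text(json.dumps(notebook, indent=1) + "\n", encoding="utf-8")
```

Calling clear_notebook_file on each notebook before committing should keep the diffs small. The same result should be available from the command line via jupyter nbconvert --clear-output --inplace, or from the Jupyter menu.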
I've found the source of the AssertionError (also see issue #516) - something to do with how dask.array.map_blocks computes variant_allele at line 1629 of snp_data.py.
I haven't managed to get to the bottom of it yet, but this PR includes a temporary fix that just applies apply_allele_mapping to an in-memory NumPy array of variant_allele. I've also added biallelic_snp_calls to to_plink.py instead of calling snp_calls and thinning them manually.