malariagen / malariagen-data-python

Analyse MalariaGEN data from Python
https://malariagen.github.io/malariagen-data-python/latest/
MIT License

Biallelic SNP calls and diplotypes #468

Closed alimanfoo closed 8 months ago

alimanfoo commented 8 months ago

Adds a new method biallelic_snp_calls(), which builds a dataset of SNP calls at sites that are biallelic within the selected samples. Note that these sites do not have to include the reference allele, i.e., a site is included if exactly two alleles are observed, regardless of which alleles they are.
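To illustrate the site selection rule described above, here is a minimal NumPy sketch. It is not the library's implementation: the helper names `count_alleles` and `locate_biallelic`, the array layout (sites, samples, ploidy), and the use of -1 for missing calls are all assumptions made for the example.

```python
import numpy as np


def count_alleles(gt, max_allele=3):
    """Count observed alleles per site (hypothetical helper).

    gt has shape (n_sites, n_samples, ploidy); allele indices are
    0 (reference) to max_allele, and -1 marks a missing call (assumed).
    """
    ac = np.zeros((gt.shape[0], max_allele + 1), dtype=np.int64)
    for allele in range(max_allele + 1):
        ac[:, allele] = np.sum(gt == allele, axis=(1, 2))
    return ac


def locate_biallelic(ac):
    """True where exactly two distinct alleles are observed at a site."""
    return np.sum(ac > 0, axis=1) == 2


gt = np.array([
    [[0, 0], [0, 1], [1, 1]],   # ref + alt 1: biallelic
    [[1, 1], [1, 2], [2, 2]],   # alt 1 + alt 2, no ref: still biallelic
    [[0, 0], [0, 0], [0, -1]],  # ref only: not biallelic
    [[0, 1], [2, 2], [0, 0]],   # three alleles: not biallelic
])

ac = count_alleles(gt)
is_biallelic = locate_biallelic(ac)
gt_biallelic = gt[is_biallelic]  # keep only the biallelic sites
```

The second site shows the point made in the description: the reference allele is absent, but the site is still selected because only two alleles are observed among the samples.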

Adds a new method biallelic_diplotypes() which computes alternate allele counts per genotype call, generating a 2-dimensional array suitable for use with PCA, neighbour-joining trees, admixture, etc.
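As a rough sketch of what "alternate allele counts per genotype call" could look like, the example below reduces a 3-dimensional genotype array to a 2-dimensional array of counts. The function name `alt_allele_counts`, the choice to treat the higher-indexed observed allele as the alternate, and the use of -1 for calls with any missing allele are assumptions for illustration, not confirmed behaviour of biallelic_diplotypes().

```python
import numpy as np


def alt_allele_counts(gt, ac):
    """Count alternate alleles per genotype call (hypothetical helper).

    gt: (n_sites, n_samples, ploidy) allele indices, -1 = missing (assumed).
    ac: (n_sites, n_alleles) allele counts with exactly two non-zero
        entries per site, i.e. biallelic sites only.
    Returns an int8 array of shape (n_sites, n_samples).
    """
    n_alleles = ac.shape[1]
    observed = ac > 0
    # Assumption: the higher-indexed of the two observed alleles is "alternate".
    alt = n_alleles - 1 - np.argmax(observed[:, ::-1], axis=1)
    counts = np.sum(gt == alt[:, None, None], axis=2).astype(np.int8)
    # Assumption: a call with any missing allele is marked as -1.
    counts[np.any(gt < 0, axis=2)] = -1
    return counts


gt = np.array([
    [[0, 0], [0, 1], [1, 1]],    # alleles 0 and 1 observed
    [[1, 1], [1, 2], [2, -1]],   # alleles 1 and 2 observed, one missing call
])
ac = np.array([
    [3, 3, 0, 0],
    [0, 3, 2, 0],
])

gn = alt_allele_counts(gt, ac)
```

Each row of `gn` is a site and each column a sample, which is the 2-dimensional shape that downstream methods such as PCA or distance-based trees typically expect.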

Refactors pca() to use biallelic_diplotypes() internally.
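For readers unfamiliar with how a diplotype array feeds into PCA, here is a minimal sketch using a plain SVD in NumPy. It is not the code path used by pca(), which may apply different scaling, site thinning, or a different solver; the function name `pca_from_diplotypes` is hypothetical, and the input is assumed to contain no missing calls.

```python
import numpy as np


def pca_from_diplotypes(gn, n_components=2):
    """PCA on a (n_sites, n_samples) array of alt allele counts (sketch).

    Assumes no missing values. Returns sample coordinates of shape
    (n_samples, n_components) and the explained variance ratio.
    """
    x = gn.astype(np.float64)
    x = x[np.var(x, axis=1) > 0]           # drop sites that do not vary
    x = x - x.mean(axis=1, keepdims=True)  # centre each site
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    coords = vt[:n_components].T * s[:n_components]
    evr = s**2 / np.sum(s**2)
    return coords, evr[:n_components]


gn = np.array([
    [0, 0, 2, 2],
    [0, 1, 2, 2],
    [2, 2, 0, 0],
    [1, 1, 1, 1],   # invariant site, dropped before the SVD
])

coords, evr = pca_from_diplotypes(gn)
```

In this toy input the first two samples and the last two samples carry opposite genotypes at the varying sites, so they fall on opposite sides of the first principal component.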

N.B., because this change slightly alters the set of input SNPs to PCA, I have bumped the results cache version number for the pca() function. I would also suggest that the next release of malariagen_data be a major version bump: the API has not changed, so existing code remains backwards compatible, but analysis results could change slightly, and a major version bump would help communicate that.

codecov[bot] commented 8 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (356c1e7) 97.51% compared to head (c83aa89) 98.06%. Report is 2 commits behind head on master.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master     #468      +/-   ##
==========================================
+ Coverage   97.51%   98.06%   +0.55%
==========================================
  Files          26       26
  Lines        2092     2170      +78
==========================================
+ Hits         2040     2128      +88
+ Misses         52       42      -10
```


alimanfoo commented 8 months ago

Going to merge but happy to follow up if any questions or suggestions.