PGScatalog / pygscatalog

Python applications and libraries for working with PGS data and the PGS Catalog
https://pygscatalog.readthedocs.io/en/latest/
Apache License 2.0

Fix variant deduplication #38

Closed nebfield closed 2 months ago

nebfield commented 2 months ago

Looking at the coverage reports, the deduplication function hasn't been called correctly since the refactor from the old pgscatalog-utils Python package (it was accidentally overlooked).

Without this patch, duplicate variant IDs are written to scoring files. When newer versions of plink2 use these variants, a warning is emitted:

n --score file entry was skipped due to a missing variant ID

Duplicate variant IDs used to trigger an error 🤔

For context, this problem occurs when combining many scores in parallel that share the same variant ID but have different effect alleles. The correct behaviour is to split these variants across different scoring files, so that no single scoring file contains the same variant ID twice.
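
To illustrate the intended behaviour, here is a minimal sketch in plain Python. It is not the code in `plink.py`: the function name `split_duplicate_variants`, the row layout `(variant_id, effect_allele, weight)`, and the example IDs are all hypothetical. The idea is only that the n-th occurrence of a variant ID goes to the n-th split, so no split contains the same ID twice.

```python
from collections import defaultdict


def split_duplicate_variants(rows):
    """Assign rows to splits so that no variant ID repeats within a split.

    Hypothetical helper for illustration only. Each row is assumed to be a
    (variant_id, effect_allele, weight) tuple.
    """
    seen = defaultdict(int)     # variant ID -> number of occurrences so far
    splits = defaultdict(list)  # split index -> rows assigned to that split
    for row in rows:
        variant_id = row[0]
        # The n-th occurrence of an ID (counting from 0) goes to split n
        splits[seen[variant_id]].append(row)
        seen[variant_id] += 1
    return [splits[i] for i in sorted(splits)]


# Made-up example: one variant ID appears twice with different effect alleles
rows = [
    ("1:100:A:G", "A", 0.12),
    ("1:100:A:G", "G", -0.05),
    ("2:200:C:T", "T", 0.30),
]
files = split_duplicate_variants(rows)
```

Here `files` holds two splits: the first keeps one row per variant ID, and the second receives the remaining row for the duplicated ID. The sketch does not collapse exact duplicates (same ID and same effect allele); it only guarantees that IDs are unique within each split.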

codecov[bot] commented 2 months ago

Codecov Report

Attention: Patch coverage is 94.44444% with 1 line in your changes missing coverage. Please review.

Project coverage is 87.89%. Comparing base (8fb9c7f) to head (d9c083e).

| Files | Patch % | Lines |
|---|---|---|
| ...log.match/src/pgscatalog/match/lib/_match/plink.py | 94.44% | 1 Missing :warning: |
Additional details and impacted files

```diff
@@              Coverage Diff              @@
##           fix-ambig      #38      +/-   ##
=============================================
+ Coverage      86.60%   87.89%   +1.28%
=============================================
  Files             20       20
  Lines           1038     1041       +3
=============================================
+ Hits             899      915      +16
+ Misses           139      126      -13
```

| [Flag](https://app.codecov.io/gh/PGScatalog/pygscatalog/pull/38/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=PGScatalog) | Coverage Δ | |
|---|---|---|
| [pgscatalog.match](https://app.codecov.io/gh/PGScatalog/pygscatalog/pull/38/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=PGScatalog) | `87.89% <94.44%> (+1.28%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=PGScatalog#carryforward-flags-in-the-pull-request-comment) to find out more.
