PGScatalog / pygscatalog

Python applications and libraries for working with PGS data and the PGS Catalog
https://pygscatalog.readthedocs.io/en/latest/
Apache License 2.0

Fix combine CLI producing empty output with invalid data #56

Closed nebfield closed 5 days ago

nebfield commented 1 week ago

When processing a single file, the combine CLI would silently write out no variants at all if it encountered an invalid variant. The only sign of a problem was a misleading log statement. Some investigation notes:

Closes #55
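
For readers unfamiliar with the failure mode, here is a minimal sketch of the "fail loudly" behaviour described above. It is not the code in this PR: `InvalidVariantError`, `validate`, `write_variants`, and the validation rule are all hypothetical names chosen for illustration. The point is only that a validation error propagates to the caller instead of being swallowed and leaving an empty output.

```python
import csv
import io


class InvalidVariantError(ValueError):
    """Hypothetical error type for a row that fails validation."""


def validate(row: dict) -> dict:
    # Hypothetical rule, for illustration only: every row needs an effect allele
    if not row.get("effect_allele"):
        raise InvalidVariantError(f"invalid variant: {row}")
    return row


def write_variants(rows, out: io.TextIOBase) -> int:
    """Write validated rows as CSV and return how many were written.

    A validation error is not caught here, so the caller sees the failure
    rather than a quietly empty file.
    """
    writer = csv.DictWriter(out, fieldnames=["rsID", "effect_allele"])
    writer.writeheader()
    n_written = 0
    for row in rows:
        writer.writerow(validate(row))
        n_written += 1
    return n_written
```

A caller such as a CLI entry point could catch the error, log it at CRITICAL level, and exit with a non-zero status, which would match the log output shown under "Test results".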

Test results

22 (older) scoring files contain invalid rsIDs:

```
pgscatalog.core.cli.combine_cli: 2024-10-18 14:43:12 CRITICAL PGS000019 contains invalid data, stopping and exploding
pgscatalog.core.cli.combine_cli: 2024-10-18 14:52:23 CRITICAL PGS000042 contains invalid data, stopping and exploding
pgscatalog.core.cli.combine_cli: 2024-10-18 15:09:03 CRITICAL PGS000212 contains invalid data, stopping and exploding
pgscatalog.core.cli.combine_cli: 2024-10-18 15:09:03 CRITICAL PGS000213 contains invalid data, stopping and exploding
pgscatalog.core.cli.combine_cli: 2024-10-18 15:09:03 CRITICAL PGS000214 contains invalid data, stopping and exploding
pgscatalog.core.cli.combine_cli: 2024-10-18 15:09:03 CRITICAL PGS000215 contains invalid data, stopping and exploding
pgscatalog.core.cli.combine_cli: 2024-10-18 15:09:03 CRITICAL PGS000216 contains invalid data, stopping and exploding
pgscatalog.core.cli.combine_cli: 2024-10-18 16:26:04 CRITICAL PGS000310 contains invalid data, stopping and exploding
pgscatalog.core.cli.combine_cli: 2024-10-18 16:26:04 CRITICAL PGS000311 contains invalid data, stopping and exploding
pgscatalog.core.cli.combine_cli: 2024-10-18 16:26:06 CRITICAL PGS000317 contains invalid data, stopping and exploding
pgscatalog.core.cli.combine_cli: 2024-10-18 16:36:01 CRITICAL PGS000330 contains invalid data, stopping and exploding
pgscatalog.core.cli.combine_cli: 2024-10-18 16:46:39 CRITICAL PGS000332 contains invalid data, stopping and exploding
pgscatalog.core.cli.combine_cli: 2024-10-18 16:46:39 CRITICAL PGS000333 contains invalid data, stopping and exploding
pgscatalog.core.cli.combine_cli: 2024-10-18 16:48:31 CRITICAL PGS000344 contains invalid data, stopping and exploding
pgscatalog.core.cli.combine_cli: 2024-10-18 16:48:31 CRITICAL PGS000345 contains invalid data, stopping and exploding
pgscatalog.core.cli.combine_cli: 2024-10-18 16:48:31 CRITICAL PGS000346 contains invalid data, stopping and exploding
pgscatalog.core.cli.combine_cli: 2024-10-18 16:48:31 CRITICAL PGS000347 contains invalid data, stopping and exploding
pgscatalog.core.cli.combine_cli: 2024-10-18 19:14:11 CRITICAL PGS000727 contains invalid data, stopping and exploding
pgscatalog.core.cli.combine_cli: 2024-10-18 19:15:24 CRITICAL PGS000728 contains invalid data, stopping and exploding
pgscatalog.core.cli.combine_cli: 2024-10-18 19:15:24 CRITICAL PGS000729 contains invalid data, stopping and exploding
pgscatalog.core.cli.combine_cli: 2024-10-18 19:22:39 CRITICAL PGS000754 contains invalid data, stopping and exploding
pgscatalog.core.cli.combine_cli: 2024-10-18 19:33:12 CRITICAL PGS000867 contains invalid data, stopping and exploding
```

The fix is to relax the rsID check when harmonisation goes wrong. No other ValidationErrors are raised.
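
One way to picture a relaxed rsID check is sketched below. This is an assumption-laden illustration, not the change made to `models.py`: the `Variant` dataclass, the `harmonisation_failed` flag, and the `rs<digits>` pattern are hypothetical stand-ins, and the real model may use different fields and rules.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical pattern: "rs" followed by digits. The real check may differ.
RSID_PATTERN = re.compile(r"^rs\d+$")


@dataclass
class Variant:
    """Illustrative stand-in for a scoring file variant (not the real model)."""

    rsid: Optional[str]
    # Hypothetical flag: True when harmonisation could not resolve the variant
    harmonisation_failed: bool = False

    def __post_init__(self) -> None:
        if self.rsid is None or RSID_PATTERN.match(self.rsid):
            return  # missing or well-formed rsID: nothing to check
        if self.harmonisation_failed:
            return  # relaxed path: tolerate a malformed rsID on this row
        raise ValueError(f"Invalid rsID: {self.rsid!r}")
```

Under these assumptions, `Variant("chr1:123", harmonisation_failed=True)` is accepted, while `Variant("chr1:123")` still raises, so the strict check keeps applying to rows where harmonisation succeeded.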

codecov[bot] commented 1 week ago

Codecov Report

Attention: Patch coverage is 98.30508% with 1 line in your changes missing coverage. Please review.

Project coverage is 89.61%. Comparing base (15f7095) to head (1566a89). Report is 4 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pgscatalog.core/src/pgscatalog/core/lib/models.py | 94.73% | 1 Missing :warning: |
Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main      #56      +/-   ##
==========================================
+ Coverage   89.37%   89.61%   +0.24%
==========================================
  Files          54       54
  Lines        3434     3475      +41
==========================================
+ Hits         3069     3114      +45
+ Misses        365      361       -4
```

| [Flag](https://app.codecov.io/gh/PGScatalog/pygscatalog/pull/56/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=PGScatalog) | Coverage Δ | |
|---|---|---|
| [pgscatalog.core](https://app.codecov.io/gh/PGScatalog/pygscatalog/pull/56/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=PGScatalog) | `92.57% <98.30%> (+0.45%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=PGScatalog#carryforward-flags-in-the-pull-request-comment) to find out more.
