populationgenomics / production-pipelines

Genomics workflows for CPG using Hail Batch
MIT License
2 stars 0 forks source link

Enable CPG ID pre-commit hook #741

Closed vivbak closed 1 month ago

vivbak commented 1 month ago

The majority of the changes are trivial, existing only in test files and documentation.

There is one functional change required however that I have not yet addressed.

This line is a functional screen so I can't just switch it.

@MattWellie, how often do you use this script? Do you have thoughts on either A) Pulling out the borked SG IDs into an argument for this script B) Creating a cohort that excludes these two SG IDs.

MattWellie commented 1 month ago

Thrilling Sunday evening you're having 🤓

vivbak commented 1 month ago

Pending this https://github.com/populationgenomics/production-pipelines/pull/742

vivbak commented 1 month ago

TIL LCL is an abbreviation for local 😂

In my head that was the conversation in the metamist repo, not prod pipes. The tests there functionally rely on the CPG Id being a valid output -- for example validating the lunh checksum so CPGAAA doesn't work. But I don't believe it's an issue here because we assume CPG IDs are valid in prod pipes.

Though I don't have any issue keeping in consistent. Merging for now but happy to open up another PR later if we want to keep the same convention here.

jmarshall commented 1 month ago

That's what I assume it stands for 🤷

Yep, fair enough; and no biggie. There's some in _test/testcohort.py which may or may not have their luhns enforced.