monarch-initiative / pyphetools

Python Phenopacket Tools
https://monarch-initiative.github.io/pyphetools/
MIT License
9 stars 1 forks source link

Setup CI #85

Open ielis opened 4 months ago

ielis commented 4 months ago

Setup CI to ensure we don't break generation of the existing phenopacket cohorts.

The CI should pull the phenopacket store and generate the phenopackets from the inputs

cmungall commented 1 month ago

Looks like currently the CI runs the main unit tests but not yet integration testing against the cohorts

Let us know if you want help with this, my group has a lot of experience with these kind of sometimes-awkward inter-repo CI jobs

justaddcoffee commented 1 month ago

+1 - it would be doable to knock up a Jenkins pipeline for this and run it on build.berkeleybop.io to generate phenopackets

ielis commented 1 month ago

I would love to see this CI. However, I anticipate one hurdle that we would have to overcome.

pyphetools and phenopacket-store have been developed hand in hand, and the notebooks for generating some cohorts may not be compatible with the latest pyphetools anymore. I suspect that many notebooks will fail, unless run with a particular pyphetools version.

Please note that there is nothing wrong with the actual phenopackets - they meet all Q/C requirements. However, we did not plan to run the notebooks with any new pyphetools feature, mostly to simplify creating the repo.

If this does not sound like a deal breaker, I'd be happy to assist with setting up the CI.


One more note - adding this CI also sounds a little bit dangerous. I would love to see the notebooks pass before all feature PRs, but running dozens notebooks takes a lot of time, because it includes a lot of REST API overhead (e.g. to verify variant coordinates by Variant Validator), which may be impractical. What if one or two notebooks time out? :smirk_cat: :roll_eyes:

cmungall commented 1 month ago

One more note - adding this CI also sounds a little bit dangerous. I would love to see the notebooks pass before all feature PRs, but running dozens notebooks takes a lot of time, because it includes a lot of REST API overhead (e.g. to verify variant coordinates by Variant Validator), which may be impractical. What if one or two notebooks time out?

What about purely structural validation as part of GitHub CI actions, for anything that involves external calls this could be done via something like our Jenkins server, as Justin suggests (but I appreciate this is more moving pieces than purely GitHub based checks).

On Mon, May 27, 2024 at 1:38 PM Daniel Danis @.***> wrote:

I would love to see this CI. However, I anticipate one hurdle that we would have to overcome.

pyphetools and phenopacket-store have been developed hand in hand, and the notebooks for generating some cohorts may not be compatible with the latest pyphetools anymore. I suspect that many notebooks will fail, unless run with a particular pyphetools version.

Please note that there is nothing wrong with the actual phenopackets - they meet all Q/C requirements. However, we did not plan to run the notebooks with any new pyphetools feature, mostly to simplify creating the repo.

If this does not sound like a deal breaker, I'd be happy to assist with setting up the CI.

One more note - adding this CI also sounds a little bit dangerous. I would love to see the notebooks pass before all feature PRs, but running dozens notebooks takes a lot of time, because it includes a lot of REST API overhead (e.g. to verify variant coordinates by Variant Validator), which may be impractical. What if one or two notebooks time out? 😼 🙄

— Reply to this email directly, view it on GitHub https://github.com/monarch-initiative/pyphetools/issues/85#issuecomment-2134026273, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAMMOMGOF267P5JQ2A7GTTZEOKVLAVCNFSM6AAAAABEHONY5GVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCMZUGAZDMMRXGM . You are receiving this because you commented.Message ID: @.***>

ielis commented 1 month ago

What do you mean by "purely structural validation"? Is this checking that the notebook cells compile without running them?

Or do you want to validate the phenopacket JSON files? This is actually not too hard with phenopacket-tools validate command.