tweag / chainsail

Replica Exchange sampling as-a-service
MIT License
11 stars 1 forks source link

Add poetry2nix application for scheduler #446

Closed steshaw closed 1 year ago

steshaw commented 1 year ago

Here we find the necessary poetry2nix overrides necessary to build a Poetry application for the scheduler.

Also included a fix for https://github.com/tweag/chainsail/issues/455 so that I can test the scheduler in Minikube.

steshaw commented 1 year ago

In principle looks good to me, but I'm wondering whether you tested the scheduler application to some extent?

No, I didn't know how and so just relied on getting a green light from CI. I'll try to get a full k8s environment going so that I can get an end-to-end test working. It's good to be cautious because, with dynamically-typed languages, dependency upgrades like this are much more likely to break things.

steshaw commented 1 year ago

Note that I've rebased on changes that occurred in https://github.com/tweag/chainsail/pull/443.

simeoncarstens commented 1 year ago

No, I didn't know how and so just relied on getting a green light from CI. I'll try to get a full k8s environment going so that I can get an end-to-end test working. It's good to be cautious because, with dynamically-typed languages, dependency upgrades like this are much more likely to break things.

Nice, thanks! Yes, we have to be extra-cautious with the controller, because the Flask app and the Celery tasks unfortunately don't have tests at the moment. Furthermore, the whole interplay between Celery / Redis / Flask app and the uwsgi web server setup are not covered by tests either, and I noticed #437 only when the Docker container wouldn't start.

steshaw commented 1 year ago

@simeoncarstens, now that I've included a fix for #455, I've been able to test that the scheduler starts in Minikube. This is ready to merge if you're still happy with it.

simeoncarstens commented 1 year ago

That's excellent news, @steshaw! I would feel more at ease, though, if we were sure that also, e.g. Celery tasks work. I don't see a reason why they shouldn't, though, but just to be safe, we could wait until we fixed #449 (or temporarily hack a way around it), which should be all that's required for testing the Nix-packaged apps within the full service deployment.

steshaw commented 1 year ago

I can confirm that I was able to test an end-to-end workflow locally with Minikube (which I assume exercises Celery tasks). To start the frontend dev server for the E2E workflow, I temporarily downgraded the nixpkgs pin to nixpkgs/21.11. I was able to create, start, and finish a job using the "mixture.zip" in the UI. I'm still experimenting with fixes to #449.

If you're happy to merge this on that basis. Perhaps you could then look at https://github.com/tweag/chainsail/pull/447 which is stacked on this PR.