LLNL / merlin

Machine Learning for HPC Workflows
MIT License

bugfix/monitor-shutdown #452

Closed bgunnar5 closed 8 months ago

bgunnar5 commented 10 months ago

This branch fixes the merlin monitor command so that it no longer shuts down an allocation while workers are still processing tasks for the given spec. The fix also required repairing the current merlin status command so that it no longer has inconsistencies when Redis is the broker.
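
The shutdown condition described above can be sketched roughly as follows. This is only an illustration of the idea, not the code in this branch: the function name `spec_is_still_active` and its parameters are hypothetical, and it assumes the active-task data has the shape Celery's `app.control.inspect().active()` returns (worker name mapped to a list of task dicts, or `None` when no worker replies), with the queue name available under `delivery_info["routing_key"]`.

```python
from typing import Dict, List, Optional, Set


def spec_is_still_active(
    spec_queues: Set[str],
    queued_counts: Dict[str, int],
    active_by_worker: Optional[Dict[str, List[dict]]],
) -> bool:
    """Return True if the allocation should stay up for this spec.

    Hypothetical helper: every name here is illustrative, not Merlin's API.
    """
    # Case 1: tasks are still waiting in one of the spec's queues.
    if any(queued_counts.get(queue, 0) > 0 for queue in spec_queues):
        return True

    # Case 2: the queues are empty, but a worker is still running a task
    # that came from one of the spec's queues. `None` (no worker replied)
    # is treated the same as "no active tasks".
    for tasks in (active_by_worker or {}).values():
        for task in tasks:
            # Assumption: the queue name is stored as the routing key.
            routing_key = task.get("delivery_info", {}).get("routing_key")
            if routing_key in spec_queues:
                return True

    return False
```

A monitor loop built on this idea would keep sleeping while the function returns `True`, and only let the allocation end once both the queues and the workers are idle. Checking queue lengths alone misses the second case, which is the premature-shutdown situation this PR describes.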

bgunnar5 commented 10 months ago

This branch is pretty much ready for review now, but I think it needs tests. I'm refactoring the integration test suite at the moment, so I'll likely leave this as a draft until that refactor is merged. The refactor will (hopefully) make writing tests much easier, especially for things like this.

koning commented 10 months ago

Great, I'll look at this.

lucpeterson commented 10 months ago

Are there some tests we could make for this?

bgunnar5 commented 10 months ago

@lucpeterson I'm working on that at the moment. On a different branch, I'm adding pytest fixtures to our integration test suite for things like starting a Redis server, creating a Celery app, and launching workers, which will make writing integration tests more modular.

bgunnar5 commented 8 months ago

@lucpeterson @koning @doutriaux1 Does this look OK to merge now?