LLNL / merlin

Machine Learning for HPC Workflows
MIT License

feature/queue info #461

Closed bgunnar5 closed 7 months ago

bgunnar5 commented 7 months ago

This is the third PR related to the new status commands. It adds the queue-info command, which provides the functionality of the status command as it existed before this refactor, plus some additional features.

The queue-info command will query all active queues by default. Almost all of the options that you can provide this command revolve around refining this query.

The next PR for this will be the final one and will just be documentation for all 3 commands. I'll likely open that one next week while this one is being reviewed.

bgunnar5 commented 7 months ago

hmmm not sure why that's failing. I'll investigate more on Monday

bgunnar5 commented 7 months ago

@lucpeterson @koning @doutriaux1 This one is ready for review now. I was seeing some weird lint errors, but it turns out they were due to GitHub running Python 3.12 while I was running 3.10 locally.

koning commented 7 months ago

Do you have doc updates for this? For example, merlin status yaml is now merlin queue-info --spec yaml.

bgunnar5 commented 7 months ago

@koning working on them at the moment. I'll open a new PR with all of the docs for status, detailed-status, and queue-info when I'm finished with them

koning commented 7 months ago

OK, please add a definition of "active queue" in those docs.

bgunnar5 commented 7 months ago

I was thinking "active queues" would mean any queues with workers watching them. However, I think it should also incorporate any queues that have tasks in them, even if no workers are watching them. I may need to add some additional functionality to this PR to make that possible.
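As a rough illustration of that broader definition, here is a minimal sketch. It is not Merlin's implementation: the function name, its inputs, and the queue names are all hypothetical, and it assumes the watched queue names and per-queue task counts have already been gathered from the broker.

```python
from typing import Dict, Iterable, Set


def active_queues_broad(watched: Iterable[str], task_counts: Dict[str, int]) -> Set[str]:
    """Return queues that are watched by a worker OR currently hold tasks.

    Hypothetical helper for illustration only:
    - watched: names of queues that at least one worker is consuming from
    - task_counts: mapping of queue name -> number of tasks waiting in it
    """
    # Queues with at least one task count as active even if nobody is watching them
    with_tasks = {name for name, count in task_counts.items() if count > 0}
    return set(watched) | with_tasks


# Made-up data: "post_queue" has tasks but no worker, "idle_queue" has neither
example = active_queues_broad(
    watched=["sim_queue"],
    task_counts={"sim_queue": 0, "post_queue": 12, "idle_queue": 0},
)
print(sorted(example))
```

Under this definition a queue stays visible while work is waiting in it, even before any worker has been started for it.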

bgunnar5 commented 7 months ago

@koning I did some more digging on this and currently Celery's active_queues method (which is what we're using to get "active queue" information) defines active queues as queues that workers are watching. To stay consistent with Celery, do you think it would be best for our definition of "active queues" to match theirs?
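To make the Celery definition concrete: app.control.inspect().active_queues() replies with a mapping from each responding worker to the queues it consumes from, or None when no worker responds. The sketch below inverts a reply of that shape into a queue-to-workers view. The helper name, worker names, and queue names are hypothetical, and the sample reply is trimmed to the "name" field only; this is not the code in this PR.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def queues_to_workers(reply: Optional[Dict[str, List[dict]]]) -> Dict[str, List[str]]:
    """Invert an active_queues()-style reply into {queue name: [worker names]}.

    Hypothetical helper for illustration. `reply` is expected to look like the
    result of `app.control.inspect().active_queues()` on a Celery app.
    """
    watched = defaultdict(list)
    # Treat a None reply (no workers responded) as an empty mapping
    for worker, queues in (reply or {}).items():
        for queue in queues:
            watched[queue["name"]].append(worker)
    # Sort for stable, readable output
    return {name: sorted(workers) for name, workers in sorted(watched.items())}


# Made-up reply: two workers, one queue shared between them
sample_reply = {
    "celery@worker1": [{"name": "sim_queue"}, {"name": "post_queue"}],
    "celery@worker2": [{"name": "sim_queue"}],
}
print(queues_to_workers(sample_reply))
```

Because the reply is keyed by worker, a queue that has tasks but no worker watching it never appears in this view, which is why matching Celery's definition excludes such queues.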

koning commented 7 months ago

Yes, that would be good. We just need a definition for the users.