flux-framework / flux-sched

Fluxion Graph-based Scheduler
GNU Lesser General Public License v3.0
84 stars 39 forks source link

Update match stats and implement stats-clear #1168

Closed milroy closed 1 month ago

milroy commented 1 month ago

Gathering match data is important for diagnosing performance problems in Fluxion. This PR updates the stat RPC topic to respond to flux-core as expected and implements clear () functionality.

Resolves issue #1166.

garlick commented 1 month ago

Might want to include a test that the RPC works with the flux-core tooling (if it's already there and I missed it sorry!)

$ flux module stats sched-fluxion-resource
$ flux module stats --clear sched-fluxion-resource
milroy commented 1 month ago

Might want to include a test that the RPC works with the flux-core tooling (if it's already there and I missed it sorry!)

There are tests that run flux module stats sched-fluxion-resource (they don't check the output, though), but nothing runs --clear. This functionality doesn't seem to warrant creating a new sharness test, but it's not immediately obvious which test I could add it to. Maybe t4000-match-params.t?

garlick commented 1 month ago

Ah OK, probably fine then! For clear, the topic string either matches or it doesn't so ti's a pretty low risk thing for a regression!

milroy commented 1 month ago

Also, does anyone have insight into how to deal with the pylint test failure?

src/cmd/flux-ion-resource.py:436:0: R0915: Too many statements (51/50) (too-many-statements)

It suggests that I need to rewrite the main function to split it up, but it's not obvious how to do that given it's dealing with a large number of arguments. Also, I can't reproduce the error in the bookworm CI image.

grondo commented 1 month ago

I'd just disable the check for that function:

# pylint: disable=too-many-statements

I think that comment goes right above the affected function.

milroy commented 1 month ago

Ah OK, probably fine then! For clear, the topic string either matches or it doesn't so ti's a pretty low risk thing for a regression!

I thought it was best to add a test_expect_success test to check for both.

milroy commented 1 month ago

Oh one more thing - stats is usually made available to guests without restriction, unless there is something exposed there that could be a problem. It is just one less thing that requires you to sudo to access.

I made your suggested change, but I didn't need sudo to run stats before.

garlick commented 1 month ago

I made your suggested change, but I didn't need sudo to run stats before.

Thanks! Would only be needed when running multi-user (where the user running flux is not you).

milroy commented 1 month ago

Thanks for the feedback @garlick! Setting MWP.