Open htejun opened 2 months ago
Is this still available? If so, I would love to help.
The draft PR is ready, but I've run into a problem and could use some help. My machine has an AMD Ryzen 7 5700X3D 8-Core Processor, and under rusty it has only 1 NUMA node and 1 domain, so the load balancing step stops before it can actually do anything. As a result, I can't test whether my change improves anything. Would someone be kind enough to test the PR for me?
If that works, I'll send a draft PR first and see what the test results are; otherwise I'll try to think of other ways to test it.
You can use the -C option to define arbitrary LB domains, which should be sufficient for testing.
Thank you! I can test it now and have found some problems. I'll figure them out and send a PR later.
Right now, after the userland load balancer makes migration decisions (move task X to domain N), each decision is recorded in the lb_data map, which maps a pid_t to a destination domain number. Then, on the enqueue path, rusty_enqueue() checks whether the task has a matching entry in lb_data and, if so, executes the requested migration. This means that the application of LB decisions isn't reliable: it depends on the task being migrated running in the following period. Otherwise, the decision is ignored.

While this works okay in practice, since the LB just keeps retrying until the domains are balanced, it makes the behavior less predictable. It'd be great to have load balancing decisions executed reliably. Maybe test_run can be used to execute migrations immediately. See set_power_profile() in scx_lavd for an example.
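
To make the failure mode concrete, here is a minimal toy model in plain Rust of the lazy path described above. It is only a sketch, not the scx_rusty code: `ToySched`, `balance()`, `enqueue()`, `domain_of()` and `demo()` are hypothetical names, the `lb_data` field merely stands in for the real BPF map, and dropping stale decisions at the start of each period is an assumption of the model.

```rust
use std::collections::HashMap;

type Pid = i32;
type DomId = u32;

/// Toy model of the scheduler state; not the real scx_rusty data structures.
struct ToySched {
    /// Current domain of each task.
    task_dom: HashMap<Pid, DomId>,
    /// Pending migration decisions, standing in for the lb_data map.
    lb_data: HashMap<Pid, DomId>,
}

impl ToySched {
    fn new(tasks: &[(Pid, DomId)]) -> Self {
        ToySched {
            task_dom: tasks.iter().copied().collect(),
            lb_data: HashMap::new(),
        }
    }

    /// Userland balancer: start a new period by dropping stale decisions
    /// (an assumption of this model) and recording fresh ones.
    fn balance(&mut self, decisions: &[(Pid, DomId)]) {
        self.lb_data.clear();
        self.lb_data.extend(decisions.iter().copied());
    }

    /// Enqueue path: a pending decision is applied only when the task runs.
    fn enqueue(&mut self, pid: Pid) {
        if let Some(dst) = self.lb_data.remove(&pid) {
            self.task_dom.insert(pid, dst);
        }
    }

    fn domain_of(&self, pid: Pid) -> Option<DomId> {
        self.task_dom.get(&pid).copied()
    }
}

/// Two tasks start in domain 0 and are both told to move to domain 1,
/// but only task 100 is enqueued before the next balancing period.
fn demo() -> (Option<DomId>, Option<DomId>) {
    let mut s = ToySched::new(&[(100, 0), (200, 0)]);
    s.balance(&[(100, 1), (200, 1)]);
    s.enqueue(100); // task 100 runs this period and is migrated
    s.balance(&[]); // next period: the decision for the idle task 200 is gone
    s.enqueue(200); // too late, there is nothing left to apply
    (s.domain_of(100), s.domain_of(200))
}
```

In this model, `demo()` leaves task 100 in domain 1 and task 200 still in domain 0, which is the "decision is ignored" case. Applying decisions immediately would amount to carrying out the migration at `balance()` time instead of waiting for `enqueue()`.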