Open btravouillon opened 5 years ago
I think running lnetctl lnet configure + lnetctl import <configured file> after module load, and running lnetctl lnet unconfigure before module unload, might make more sense.
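The ordering suggested here can be sketched as follows. This is only an illustration, not Shine code: the helper names `lnet_start_commands`, `lnet_stop_commands` and `run_all` are hypothetical, the default path `/etc/lnet.conf` stands in for the configured file, and using `modprobe` / `modprobe -r` for module load and unload is an assumption. The `lnetctl` subcommands are the ones named in the comment.

```python
import subprocess


def lnet_start_commands(conf_path="/etc/lnet.conf"):
    """Commands to bring LNet up, in order (hypothetical helper)."""
    return [
        ["modprobe", "lnet"],              # assumption: module is loaded with modprobe
        ["lnetctl", "lnet", "configure"],  # initialise LNet once the module is loaded
        ["lnetctl", "import", conf_path],  # apply the configuration file
    ]


def lnet_stop_commands():
    """Commands to tear LNet down, in order (hypothetical helper)."""
    return [
        ["lnetctl", "lnet", "unconfigure"],  # must happen before the module is removed
        ["modprobe", "-r", "lnet"],          # assumption: module is removed with modprobe -r
    ]


def run_all(commands, dry_run=True):
    """Print the commands, or execute them and stop at the first failure."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling `run_all(lnet_start_commands())` with the default `dry_run=True` only prints the commands, which makes it easy to review the order before running anything as root.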
The lnet.service really is too far from how shine expects the system to be configured, but having an /etc/lnet.conf would be much more flexible than kernel module parameters.
I think both are doable.
Being able to run lnetctl import /etc/lnet.conf is definitely something useful that Shine should support.
Delegating the module/router support to external scripts is fine with me, as an optional step. Relying on the module_unload=false feature should be able to achieve that? We need to update the current patch to disable StartRouter/StopRouter or add additional flags.
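One way to picture the optional behaviour discussed here is a small planner that decides which teardown steps Shine performs itself and which are left to external tooling. This is a sketch under stated assumptions, not how the patch is implemented: the function name `plan_stop_steps`, the `manage_router` flag and the `UnloadModules` label are hypothetical. Only `module_unload` and StopRouter come from the discussion.

```python
def plan_stop_steps(module_unload=True, manage_router=True):
    """Return the teardown steps Shine would run itself (hypothetical planner).

    module_unload: mirrors the module_unload setting mentioned in the thread.
    manage_router: hypothetical extra flag to skip router handling.
    """
    steps = []
    if manage_router:
        steps.append("StopRouter")     # step name taken from the discussion
    if module_unload:
        steps.append("UnloadModules")  # hypothetical label for the module removal step
    return steps
```

With both flags set to false the planner returns an empty list, meaning module and router handling is left entirely to something external such as lnet.service.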
pushed https://review.gerrithub.io/c/cea-hpc/shine/+/468899 as a draft, 100% untested code - will work on that tomorrow morning if life allows, but comments on overall architecture are welcome earlier (EDIT: didn't go for external script but that'd work for me too, happy to change what I started with in that direction)
I'm using the lnet.service and /etc/lnet.conf to configure the LNet on my servers and clients:
This service loads the lnet module, configures LNet, then imports /etc/lnet.conf.
shine stop reports an error while trying to remove the Lustre modules from the kernel:
It would need to unconfigure the lnet before trying to remove the lnet module from the kernel.
The simpler solution would be to stop unloading the modules when running shine stop. :-) I can rebase and enhance https://review.gerrithub.io/c/cea-hpc/shine/+/367989
Then we could plan to add support for the lnet.service if you believe this is worthwhile.