Closed NucciTheBoss closed 4 months ago
Hello @NucciTheBoss! Thank you for that! I see no problem with taking the approach you suggested. What do you think @jamesbeedy?
@NucciTheBoss thank you for providing this thorough explanation. I vote for option #2 due to the fact that the version will be available immediately after install and will not depend on whether the service can start up or not.
Thanks for getting back to me folks! Sounds good James - I prefer option 2 as well. Currently I am working on the migration of slurmdbd to MySQL Innodb and MySQL charms, but I should be able to get to this starting tomorrow and potentially have something to submit Wednesday before our meeting.
Hey, @NucciTheBoss and @dvdgomez 👋
We are trying to get these charms released to edge with a version file for now (as they are currently broken in edge). I will advocate using the dnf/apt libraries to get the slurm version and set that as the workload version when the dnf lib is ready.
That is fine by me for the moment. I saw that you added an extra step to the pipeline anyways :smile:
We should hopefully have something more concrete after I do some more work into supporting the different bases. Control plane we can get away with just using Ubuntu, but with the new DNF library I proposed in https://github.com/canonical/operator-libs-linux/pull/75 we should be able to get it where using apt/dnf will work for operators like slurmd.
Enhancement Proposal
Overview
Hello there folks! I am writing this enhancement proposal because I would like to propose a change to how the slurm charms set their workload version given the issues that we encountered last week. Rather than using
git describe ...
to create a version file and then using that version file to set the workload version during install, I propose that we set the workload version of the charm to the version of the SLURM service running on the Juju unit.Reason 1
Per John Meinel (Director of Engineer for Juju) guidance, the workload version of the charm should be used to tell the human operator what version of the service you are running, not the version of the charm's source code:
The version file seems to be mostly for leaving a "signature" of what version of the charm that you are running. Best practice seems to be that the workload version is the version of the service that has been charmed. In the case of SLURM, setting the workload version to the version of the underlying SLURM service makes it more clear to the human operator which version of SLURM they are deploying.
Reason 2
Since the version file is not automatically created by
charmcraft
when packing the SLURM charms and it is required to successfully install SLURM duringInstallEvent
handling, the version file can cause the SLURM charms to break for "self-packing" administrators who are not familiar with the internals of the SLURM charms. If we set the workload version to version of the SLURM service running on the unit, we do not have worry about whether or not the version file was packed too.Implementation
There are two possible ways we could implement setting the workload version of the units using the SLURM service version rather than the version file:
{slurmctld,slurmd,slurmdbd,slurmrestd} -V
to retrieve the version of the SLURM service.version
attribute provided by the apt charm library from operator_libs_linux.One thing to note about option 1 is that the workload version would have to be set right before we start the SLURM service.
{slurmctld,slurmd,slurmdbd,slurmrestd} -V
seems to only work if everything is in order with the respective configuration file.Closing comments
Let me know what your thoughts are on changing how we set the workload version of the Juju units for the SLURM charms. I think this will help resolve issues with forgetting to pack the version file and prevent it from being the reason why SLURM fails to install on freshly deployed units.