openhpc / ohpc

OpenHPC Integration, Packaging, and Test Repo
http://openhpc.community
Apache License 2.0

rocky 9 + warewulf + slurm fails on pristine system #1943

Closed luciandf closed 4 months ago

luciandf commented 4 months ago

Hello all,

I am trying to use the v3.x recipe for Rocky 9 + Warewulf + Slurm, but I am having trouble with the compute node setup section. Everything goes smoothly up until wwsh file import /etc/passwd, where I get these errors:

Complete!
WARNING:  Caught error on module load:  Can't locate sys/ioctl.ph in @INC (did you run h2ph?) (@INC contains: /usr/local/lib64/perl5/5.32 /usr/local/share/perl5/5.32 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5) at /usr/share/perl5/vendor_perl/Warewulf/Network.pm line 48.
Compilation failed in require at /usr/share/perl5/vendor_perl/Warewulf/Ipmi.pm line 16.
BEGIN failed--compilation aborted at /usr/share/perl5/vendor_perl/Warewulf/Ipmi.pm line 16.
Compilation failed in require at /usr/share/perl5/vendor_perl/Warewulf/Module/Cli/Ipmi.pm line 20.
BEGIN failed--compilation aborted at /usr/share/perl5/vendor_perl/Warewulf/Module/Cli/Ipmi.pm line 20.
Compilation failed in require at /usr/share/perl5/vendor_perl/Warewulf/ModuleLoader.pm line 80.

WARNING:  Module load error:  Could not invoke Warewulf::Module::Cli::Ipmi->new():  Can't locate object method "new" via package "Warewulf::Module::Cli::Ipmi" at (eval 18) line 1.

WARNING:  Caught error on module load:  Attempt to reload Warewulf/Network.pm aborted.
Compilation failed in require at /usr/share/perl5/vendor_perl/Warewulf/Module/Cli/Node.pm line 18.
BEGIN failed--compilation aborted at /usr/share/perl5/vendor_perl/Warewulf/Module/Cli/Node.pm line 18.
Compilation failed in require at /usr/share/perl5/vendor_perl/Warewulf/ModuleLoader.pm line 80.

If I ignore these and move on (wwsh still asks whether I want to import the file), I eventually hit the next major error when I try to provision the first node:

ERROR:  Warewulf command 'node' not available

This happens when I issue the wwsh -y node new.... command.

The only difference from the recipe is that I installed Rocky 9.3. Everything else follows the recipe exactly. Has anyone experienced this? What could be the problem?

Cheers

adrianreber commented 4 months ago

Just tested it on AlmaLinux: the missing file (from the "Can't locate sys/ioctl.ph" error) is part of the perl-ph package, which is pulled in by the perl package.

But, on our CI for the upcoming release 3.1 I do not see the package listed in the output: http://repos.ohpc.io/stats/results/3/3.1/0-LATEST-OHPC-3.1-almalinux9.2-x86_64/

Ah, our kickstart files for the initial installation are pulling in perl-ph. So installing perl-ph should fix it for you and we should add a dependency on perl-ph for the warewulf-ipmi-ohpc package.
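
A minimal sketch of how one might confirm this diagnosis before and after installing the package. The function name have_ioctl_ph is made up for illustration; the check simply asks Perl to load the same header that Warewulf/Network.pm fails on, and the dnf command assumes a Rocky or AlmaLinux 9 host with root access.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if Perl can locate sys/ioctl.ph in @INC.
have_ioctl_ph() {
    perl -e 'require "sys/ioctl.ph";' >/dev/null 2>&1
}

if have_ioctl_ph; then
    echo "sys/ioctl.ph found"
else
    # Assumption: dnf is the package manager (Rocky/AlmaLinux 9); run as root.
    echo "sys/ioctl.ph missing; try: dnf -y install perl-ph"
fi
```

If the check still reports the header as missing after the install, the problem is likely something other than the missing perl-ph package.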

luciandf commented 4 months ago

I'll test it out now!

adrianreber commented 4 months ago

Please re-open if installing perl-ph does not help. For 3.1 it should be pulled in automatically.

luciandf commented 4 months ago

I still have the following error:

 ERROR:  Warewulf command 'node' not available

when the wwsh -y node new... command is issued. It is followed by this error:

readline() on closed filehandle GENDERS at /usr/share/perl5/vendor_perl/Warewulf/Provision/Genders.pm line 92.
Use of uninitialized value $genders in scalar chomp at /usr/share/perl5/vendor_perl/Warewulf/Provision/Genders.pm line 101.
Are you sure you want to make the following changes to 2 node(s):

I have not installed genders!

Also, now that fewer errors appear, I see:

⚠️ /proc/ is not mounted. This is not a supported mode of operation. Please fix
your invocation environment to mount /proc/ and /sys/ properly. Proceeding anyway.
Your mileage may vary.

adrianreber commented 4 months ago

Looks like you need to install genders-ohpc.

The /proc message needs to be ignored.
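
As a rough summary of the two fixes discussed in this thread, the sketch below maps a wwsh error line to the package that resolved it here. The function name suggest_package is hypothetical, and the mapping covers only the two cases seen above (perl-ph and genders-ohpc); any other error yields no suggestion.

```shell
#!/bin/sh
# Hypothetical helper: print the package this thread identified for a given
# wwsh error line. Unknown errors print nothing.
suggest_package() {
    case "$1" in
        *"sys/ioctl.ph"*) echo "perl-ph" ;;
        *"Warewulf/Provision/Genders.pm"*) echo "genders-ohpc" ;;
    esac
}

# Example with the error line reported above:
suggest_package "readline() on closed filehandle GENDERS at /usr/share/perl5/vendor_perl/Warewulf/Provision/Genders.pm line 92."
# Assumption: the suggested package is then installed as root, e.g.
#   dnf -y install genders-ohpc
```

After installing, re-running a read-only command such as wwsh node list is one way to see whether the node module now loads.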

luciandf commented 4 months ago

> Looks like you need to install genders-ohpc.
>
> The /proc message needs to be ignored.

Worked! Thank you!