rackslab / Slurm-web

Open source web interface for Slurm HPC clusters
https://slurm-web.com
GNU General Public License v3.0

slurm-web-agent not working on rhel 8 #419

Open jonoharms opened 1 day ago

jonoharms commented 1 day ago

I have been trying to get Slurm-web working on RHEL 8. I installed everything from the Rackslab repo without problems, and RacksDB and slurm-web-gateway both seem to be working fine. However, slurm-web-agent will not start. The installed python3-werkzeug package (version 0.12.2) appears to be too old, because it does not contain werkzeug.middleware.

werkzeug.middleware was added in 0.15
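To illustrate the version dependency, here is a minimal Python sketch of how the situation could be detected and worked around. It is not necessarily how Slurm-web handles it: the helper names `parse_version`, `has_middleware_package`, and `load_dispatcher_middleware` are hypothetical, and the fallback assumes that Werkzeug releases older than 0.15 expose `DispatcherMiddleware` from `werkzeug.wsgi`.

```python
import importlib


def parse_version(text):
    """Turn a version string such as '0.12.2' into a tuple like (0, 12, 2).

    Parsing stops at the first component that does not start with a digit,
    so suffixes such as 'rc1' or 'dev0' are ignored.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def has_middleware_package(version):
    """Return True if this Werkzeug version should ship werkzeug.middleware."""
    return parse_version(version) >= (0, 15)


def load_dispatcher_middleware():
    """Return the DispatcherMiddleware class, or None if it cannot be found.

    The new location (werkzeug.middleware.dispatcher) is tried first, then
    the assumed pre-0.15 location (werkzeug.wsgi).
    """
    for module_name in ("werkzeug.middleware.dispatcher", "werkzeug.wsgi"):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # Werkzeug is missing, or this module does not exist in the
            # installed release; try the next candidate.
            continue
        cls = getattr(module, "DispatcherMiddleware", None)
        if cls is not None:
            return cls
    return None


if __name__ == "__main__":
    print(has_middleware_package("0.12.2"))  # the version reported here
    print(load_dispatcher_middleware())
```

With the version reported in this issue, `has_middleware_package("0.12.2")` returns `False`, which matches the `ModuleNotFoundError` in the log below.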

This is the output from sudo journalctl -u slurm-web-agent:

Dec 03 16:19:57 hostname systemd[1]: Started Slurm-web HPC dashboard agent.
Dec 03 16:19:57 hostname python3.6[267350]: detected unhandled Python exception in '/usr/libexec/slurm-web/slurm-web-agent'
Dec 03 16:19:57 hostname slurm-web-agent[267350]: Traceback (most recent call last):
Dec 03 16:19:57 hostname slurm-web-agent[267350]:   File "/usr/libexec/slurm-web/slurm-web-agent", line 11, in <module>
Dec 03 16:19:57 hostname slurm-web-agent[267350]:     load_entry_point('Slurm-web==4.0.0', 'console_scripts', 'slurm-web-agent')()
Dec 03 16:19:57 hostname slurm-web-agent[267350]:   File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 476, in load_entry_point
Dec 03 16:19:57 hostname slurm-web-agent[267350]:     return get_distribution(dist).load_entry_point(group, name)
Dec 03 16:19:57 hostname slurm-web-agent[267350]:   File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2700, in load_entry_point
Dec 03 16:19:57 hostname slurm-web-agent[267350]:     return ep.load()
Dec 03 16:19:57 hostname slurm-web-agent[267350]:   File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2318, in load
Dec 03 16:19:57 hostname slurm-web-agent[267350]:     return self.resolve()
Dec 03 16:19:57 hostname slurm-web-agent[267350]:   File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2324, in resolve
Dec 03 16:19:57 hostname slurm-web-agent[267350]:     module = __import__(self.module_name, fromlist=['__name__'], level=0)
Dec 03 16:19:57 hostname slurm-web-agent[267350]:   File "/usr/lib/python3.6/site-packages/slurmweb/exec/agent.py", line 14, in <module>
Dec 03 16:19:57 hostname slurm-web-agent[267350]:     from ..apps.agent import SlurmwebAppAgent
Dec 03 16:19:57 hostname slurm-web-agent[267350]:   File "/usr/lib/python3.6/site-packages/slurmweb/apps/agent.py", line 13, in <module>
Dec 03 16:19:57 hostname slurm-web-agent[267350]:     from werkzeug.middleware import dispatcher
Dec 03 16:19:57 hostname slurm-web-agent[267350]: ModuleNotFoundError: No module named 'werkzeug.middleware'
Dec 03 16:19:57 hostname systemd[1]: slurm-web-agent.service: Main process exited, code=exited, status=1/FAILURE
Dec 03 16:19:57 hostname systemd[1]: slurm-web-agent.service: Failed with result 'exit-code'.
Dec 03 16:19:57 hostname systemd[1]: slurm-web-agent.service: Service RestartSec=100ms expired, scheduling restart.
Dec 03 16:19:57 hostname systemd[1]: slurm-web-agent.service: Scheduled restart job, restart counter is at 1.
Dec 03 16:19:57 hostname systemd[1]: Stopped Slurm-web HPC dashboard agent.
Dec 03 16:19:57 hostname systemd[1]: Started Slurm-web HPC dashboard agent.

rezib commented 15 hours ago

Hello @jonoharms, thank you very much for reporting! I consider this a serious bug.

I will work on a fix ASAP.

rezib commented 15 hours ago

I managed to reproduce it in CI.

rezib commented 13 hours ago

@jonoharms, I just published RPM packages 4.0.0-2 for el8 in the repo, including the patch developed in #420. Could you please update and confirm that it works for you?

jonoharms commented 12 hours ago

Thanks rezib, I will try when I'm back at work tomorrow morning (Australian time).

jonoharms commented 1 minute ago

It is working now! Thank you very much for the quick turnaround.