rackslab / Slurm-web

Open source web dashboard for Slurm HPC clusters
https://slurm-web.com
GNU General Public License v3.0

Error in running slurm web on localhost #312

Closed Talavig closed 1 month ago

Talavig commented 1 month ago

I am trying to run Slurm-web on a single localhost cluster as a test. When I go to localhost:5011 in my browser, I get the following message:

Server error: The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.

When I look at the logs for slurm-web-agent, I see the following:

Thread-858 (process_request_thread): [INFO] 127.0.0.1 - - [14/Jul/2024 13:58:32] "GET /v3.1.0/stats HTTP/1.1" 500 -
Jul 14 13:58:32 standalone slurm-web-agent[7637]: simplejson.errors.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Jul 14 13:58:32 standalone slurm-web-agent[7637]: return self.scan_once(s, idx=_w(s, idx).end())
Jul 14 13:58:32 standalone slurm-web-agent[7637]: File "/usr/lib/python3/dist-packages/simplejson/decoder.py", line 400, in raw_decode
Jul 14 13:58:32 standalone slurm-web-agent[7637]: obj, end = self.raw_decode(s)
Jul 14 13:58:32 standalone slurm-web-agent[7637]: File "/usr/lib/python3/dist-packages/simplejson/decoder.py", line 370, in decode
Jul 14 13:58:32 standalone slurm-web-agent[7637]: return _default_decoder.decode(s)
Jul 14 13:58:32 standalone slurm-web-agent[7637]: File "/usr/lib/python3/dist-packages/simplejson/__init__.py", line 525, in loads
Jul 14 13:58:32 standalone slurm-web-agent[7637]: return complexjson.loads(self.text, **kwargs)
Jul 14 13:58:32 standalone slurm-web-agent[7637]: File "/usr/lib/python3/dist-packages/requests/models.py", line 900, in json
Jul 14 13:58:32 standalone slurm-web-agent[7637]: result = response.json()
Jul 14 13:58:32 standalone slurm-web-agent[7637]: File "/usr/lib/python3/dist-packages/slurmweb/views/agent.py", line 51, in slurmrest
Jul 14 13:58:32 standalone slurm-web-agent[7637]: items = func(*args)
Jul 14 13:58:32 standalone slurm-web-agent[7637]: File "/usr/lib/python3/dist-packages/slurmweb/views/agent.py", line 76, in filter_fields
Jul 14 13:58:32 standalone slurm-web-agent[7637]: return func(*args)
Jul 14 13:58:32 standalone slurm-web-agent[7637]: File "/usr/lib/python3/dist-packages/slurmweb/views/agent.py", line 88, in _cached_data
Jul 14 13:58:32 standalone slurm-web-agent[7637]: return _cached_data(
Jul 14 13:58:32 standalone slurm-web-agent[7637]: File "/usr/lib/python3/dist-packages/slurmweb/views/agent.py", line 101, in _cached_jobs
Jul 14 13:58:32 standalone slurm-web-agent[7637]: for job in _cached_jobs():
Jul 14 13:58:32 standalone slurm-web-agent[7637]: File "/usr/lib/python3/dist-packages/slurmweb/views/agent.py", line 236, in stats
Jul 14 13:58:32 standalone slurm-web-agent[7637]: return view(*args, **kwargs)
Jul 14 13:58:32 standalone slurm-web-agent[7637]: File "/usr/lib/python3/dist-packages/rfl/web/tokens.py", line 93, in wrapped
Jul 14 13:58:32 standalone slurm-web-agent[7637]: return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
Jul 14 13:58:32 standalone slurm-web-agent[7637]: File "/usr/lib/python3/dist-packages/flask/app.py", line 1499, in dispatch_request
Jul 14 13:58:32 standalone slurm-web-agent[7637]: rv = self.dispatch_request()
Jul 14 13:58:32 standalone slurm-web-agent[7637]: File "/usr/lib/python3/dist-packages/flask/app.py", line 1513, in full_dispatch_request
Jul 14 13:58:32 standalone slurm-web-agent[7637]: rv = self.handle_user_exception(e)
Jul 14 13:58:32 standalone slurm-web-agent[7637]: File "/usr/lib/python3/dist-packages/flask/app.py", line 1515, in full_dispatch_request
Jul 14 13:58:32 standalone slurm-web-agent[7637]: response = self.full_dispatch_request()
Jul 14 13:58:32 standalone slurm-web-agent[7637]: File "/usr/lib/python3/dist-packages/flask/app.py", line 2070, in wsgi_app

It's important to mention that I can only reach slurmrestd by using sudo in my curl command. Maybe that's related? Why is this error thrown?

In addition, can slurmrestd only be reached through a Unix socket? Is it not possible to open it up to HTTP communication on localhost? Right now, the only way I can access the Slurm socket through the UI without running it with sudo is to chmod 777 slurmrestd.socket. Before I did that, I got an error saying Permission denied.

Thanks in advance :)
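
A note on the error message itself: "Expecting value: line 1 column 1 (char 0)" is what a JSON decoder reports when the text it receives does not start with a JSON value, for example when the body is empty or is a plain-text error. The traceback shows this happening inside response.json(), so the reply the agent received was probably not JSON. The sketch below reproduces the message with Python's standard-library json module rather than simplejson. The helper name decode_body and the sample bodies are hypothetical and are not taken from the Slurm-web code.

```python
import json


def decode_body(text):
    """Decode a response body as JSON and return (data, error_message).

    Hypothetical helper for illustration only; not part of Slurm-web.
    """
    try:
        return json.loads(text), None
    except json.JSONDecodeError as err:
        # Show the start of the body so the real cause is visible.
        return None, f"{err.msg} at char {err.pos}; body starts with {text[:40]!r}"


# Assumed sample bodies: an empty reply, a plain-text error, and valid JSON.
for body in ("", "Permission denied", '{"jobs": []}'):
    data, error = decode_body(body)
    print(repr(body), "->", data if error is None else error)
```

Logging the first few characters of the undecodable body, as this sketch does, usually makes it clearer whether the underlying problem is an empty reply, a permission error, or something else.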