aquarist-labs / aquarium

Project Aquarium is a SUSE-sponsored open source project aiming at becoming an easy to use, rock solid storage appliance based on Ceph.
https://aquarist-labs.io/

Second node join succeeds, but GUI remains stuck on join page, shows 500 internal server error #632

tserong closed this issue 1 year ago

tserong commented 3 years ago

Describe the bug
I used aqrdev to create a new deployment from the head of the main branch today, then went to http://localhost:1337/, created a new cluster, then went to http://localhost:1338/ to join the second node. This seems to have succeeded (the GUI on node1 lists both hosts), but http://localhost:1338/ remains stuck on the Join screen, with a 500 Internal Server Error visible. journalctl -fu aquarium on node2 gives:

Aug 18 06:56:18 node2 uvicorn[1497]: ERROR:    2021-08-18 06:56:18 -- h11_impl -- Exception in ASGI application
Aug 18 06:56:18 node2 uvicorn[1497]: Traceback (most recent call last):
Aug 18 06:56:18 node2 uvicorn[1497]:   File "/usr/local/lib/python3.8/site-packages/uvicorn/protocols/http/h11_impl.py", line 394, in run_asgi
Aug 18 06:56:18 node2 uvicorn[1497]:     result = await app(self.scope, self.receive, self.send)
Aug 18 06:56:18 node2 uvicorn[1497]:   File "/usr/local/lib/python3.8/site-packages/uvicorn/middleware/proxy_headers.py", line 45, in __call__
Aug 18 06:56:18 node2 uvicorn[1497]:     return await self.app(scope, receive, send)
Aug 18 06:56:18 node2 uvicorn[1497]:   File "/usr/local/lib/python3.8/site-packages/fastapi/applications.py", line 199, in __call__
Aug 18 06:56:18 node2 uvicorn[1497]:     await super().__call__(scope, receive, send)
Aug 18 06:56:18 node2 uvicorn[1497]:   File "/usr/local/lib/python3.8/site-packages/starlette/applications.py", line 111, in __call__
Aug 18 06:56:18 node2 uvicorn[1497]:     await self.middleware_stack(scope, receive, send)
Aug 18 06:56:18 node2 uvicorn[1497]:   File "/usr/local/lib/python3.8/site-packages/starlette/middleware/errors.py", line 181, in __call__
Aug 18 06:56:18 node2 uvicorn[1497]:     raise exc from None
Aug 18 06:56:18 node2 uvicorn[1497]:   File "/usr/local/lib/python3.8/site-packages/starlette/middleware/errors.py", line 159, in __call__
Aug 18 06:56:18 node2 uvicorn[1497]:     await self.app(scope, receive, _send)
Aug 18 06:56:18 node2 uvicorn[1497]:   File "/usr/local/lib/python3.8/site-packages/starlette/exceptions.py", line 82, in __call__
Aug 18 06:56:18 node2 uvicorn[1497]:     raise exc from None
Aug 18 06:56:18 node2 uvicorn[1497]:   File "/usr/local/lib/python3.8/site-packages/starlette/exceptions.py", line 71, in __call__
Aug 18 06:56:18 node2 uvicorn[1497]:     await self.app(scope, receive, sender)
Aug 18 06:56:18 node2 uvicorn[1497]:   File "/usr/local/lib/python3.8/site-packages/starlette/routing.py", line 566, in __call__
Aug 18 06:56:18 node2 uvicorn[1497]:     await route.handle(scope, receive, send)
Aug 18 06:56:18 node2 uvicorn[1497]:   File "/usr/local/lib/python3.8/site-packages/starlette/routing.py", line 376, in handle
Aug 18 06:56:18 node2 uvicorn[1497]:     await self.app(scope, receive, send)
Aug 18 06:56:18 node2 uvicorn[1497]:   File "/usr/local/lib/python3.8/site-packages/fastapi/applications.py", line 199, in __call__
Aug 18 06:56:18 node2 uvicorn[1497]:     await super().__call__(scope, receive, send)
Aug 18 06:56:18 node2 uvicorn[1497]:   File "/usr/local/lib/python3.8/site-packages/starlette/applications.py", line 111, in __call__
Aug 18 06:56:18 node2 uvicorn[1497]:     await self.middleware_stack(scope, receive, send)
Aug 18 06:56:18 node2 uvicorn[1497]:   File "/usr/local/lib/python3.8/site-packages/starlette/middleware/errors.py", line 181, in __call__
Aug 18 06:56:18 node2 uvicorn[1497]:     raise exc from None
Aug 18 06:56:18 node2 uvicorn[1497]:   File "/usr/local/lib/python3.8/site-packages/starlette/middleware/errors.py", line 159, in __call__
Aug 18 06:56:18 node2 uvicorn[1497]:     await self.app(scope, receive, _send)
Aug 18 06:56:18 node2 uvicorn[1497]:   File "/usr/local/lib/python3.8/site-packages/starlette/exceptions.py", line 82, in __call__
Aug 18 06:56:18 node2 uvicorn[1497]:     raise exc from None
Aug 18 06:56:18 node2 uvicorn[1497]:   File "/usr/local/lib/python3.8/site-packages/starlette/exceptions.py", line 71, in __call__
Aug 18 06:56:18 node2 uvicorn[1497]:     await self.app(scope, receive, sender)
Aug 18 06:56:18 node2 uvicorn[1497]:   File "/usr/local/lib/python3.8/site-packages/starlette/routing.py", line 566, in __call__
Aug 18 06:56:18 node2 uvicorn[1497]:     await route.handle(scope, receive, send)
Aug 18 06:56:18 node2 uvicorn[1497]:   File "/usr/local/lib/python3.8/site-packages/starlette/routing.py", line 227, in handle
Aug 18 06:56:18 node2 uvicorn[1497]:     await self.app(scope, receive, send)
Aug 18 06:56:18 node2 uvicorn[1497]:   File "/usr/local/lib/python3.8/site-packages/starlette/routing.py", line 41, in app
Aug 18 06:56:18 node2 uvicorn[1497]:     response = await func(request)
Aug 18 06:56:18 node2 uvicorn[1497]:   File "/usr/local/lib/python3.8/site-packages/fastapi/routing.py", line 191, in app
Aug 18 06:56:18 node2 uvicorn[1497]:     solved_result = await solve_dependencies(
Aug 18 06:56:18 node2 uvicorn[1497]:   File "/usr/local/lib/python3.8/site-packages/fastapi/dependencies/utils.py", line 548, in solve_dependencies
Aug 18 06:56:18 node2 uvicorn[1497]:     solved = await call(**sub_values)
Aug 18 06:56:18 node2 uvicorn[1497]:   File "/usr/share/aquarium/./gravel/api/__init__.py", line 37, in __call__
Aug 18 06:56:18 node2 uvicorn[1497]:     raw_token: JWT = jwt_mgr.get_raw_access_token(token)
Aug 18 06:56:18 node2 uvicorn[1497]:   File "/usr/share/aquarium/./gravel/controllers/auth.py", line 119, in get_raw_access_token
Aug 18 06:56:18 node2 uvicorn[1497]:     raw_token = jwt.decode(
Aug 18 06:56:18 node2 uvicorn[1497]:   File "/usr/local/lib/python3.8/site-packages/jwt/api_jwt.py", line 119, in decode
Aug 18 06:56:18 node2 uvicorn[1497]:     decoded = self.decode_complete(jwt, key, algorithms, options, **kwargs)
Aug 18 06:56:18 node2 uvicorn[1497]:   File "/usr/local/lib/python3.8/site-packages/jwt/api_jwt.py", line 90, in decode_complete
Aug 18 06:56:18 node2 uvicorn[1497]:     decoded = api_jws.decode_complete(
Aug 18 06:56:18 node2 uvicorn[1497]:   File "/usr/local/lib/python3.8/site-packages/jwt/api_jws.py", line 149, in decode_complete
Aug 18 06:56:18 node2 uvicorn[1497]:     self._verify_signature(signing_input, header, signature, key, algorithms)
Aug 18 06:56:18 node2 uvicorn[1497]:   File "/usr/local/lib/python3.8/site-packages/jwt/api_jws.py", line 236, in _verify_signature
Aug 18 06:56:18 node2 uvicorn[1497]:     raise InvalidSignatureError("Signature verification failed")
Aug 18 06:56:18 node2 uvicorn[1497]: jwt.exceptions.InvalidSignatureError: Signature verification failed

To Reproduce
1) Use aqrdev to create a new deployment.
2) Go to http://localhost:1337/ and create a new cluster. Log in to the dashboard and get the IP address and auth token from node1.
3) Go to http://localhost:1338/ and tell that node to join an existing cluster, using the details from step 2.
4) Observe that you get stuck on the join screen with that 500 Internal Server Error.

Expected behavior
The join succeeds and I'm able to access the dashboard on node2.

Screenshots
[screenshot: broken-join]

Additional context
I'm using Firefox on openSUSE Tumbleweed.

Could the problem here be something to do with using the localhost:1337 and localhost:1338 port forwards? Will this cause the browser to do something stupid/unexpected with session cookies?
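One reading that would fit the port-forward theory: cookies are scoped by host name and not by port (RFC 6265, section 8.5), so a token cookie set by localhost:1337 would also be sent to localhost:1338. If the token is kept in a cookie and each node signs its tokens with its own secret (both are assumptions; nothing in this issue confirms them), node2 would fail to verify a token issued by node1. A minimal stdlib sketch of that mismatch, with hypothetical secrets and helper names, and a hand-rolled HS256 standing in for the PyJWT call seen in the traceback:

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as compact JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Build a compact HS256 token: header.payload.signature."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode("ascii")
    signature = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url(signature)}"


def signature_ok(token: str, secret: bytes) -> bool:
    """Recompute the HMAC with the given secret and compare signatures."""
    header, payload, signature = token.split(".")
    signing_input = f"{header}.{payload}".encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return hmac.compare_digest(b64url(expected), signature)


# Hypothetical per-node secrets; the real values are not shown in this issue.
node1_secret = b"secret-generated-on-node1"
node2_secret = b"secret-generated-on-node2"

# A token issued by node1 verifies on node1 but not on node2.
token = sign_hs256({"sub": "admin"}, node1_secret)
print(signature_ok(token, node1_secret))  # True
print(signature_ok(token, node2_secret))  # False
```

Accessing each node by its own IP address gives each one a separate cookie scope, so under these assumptions the stale token would never reach node2.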

tserong commented 3 years ago

Yep, it's gotta be something to do with the localhost port forwards. If I retry using (in this case) cluster creation on node1 via http://192.168.121.155:1337/ and join on node2 via http://192.168.121.43:1337, it all works fine (although I still saw a red "500 Internal Server Error" box pop up once on node2's login screen). So I'm not sure if this is something that can/should be fixed, or just needs to be documented for those doing dev/test.
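If this were handled on the server side, one option would be to catch the signature failure where the token is decoded and answer with 401 instead of letting the exception escape as a 500. A minimal sketch of that idea, not Aquarium's actual code: `authenticate`, `fake_decode`, and the local `InvalidSignatureError` class are hypothetical stand-ins so the example runs without PyJWT or FastAPI.

```python
from typing import Callable, Optional, Tuple


class InvalidSignatureError(Exception):
    """Stand-in for jwt.exceptions.InvalidSignatureError from the traceback."""


def authenticate(
    token: Optional[str], decode: Callable[[str], dict]
) -> Tuple[int, str]:
    """Return (HTTP status, detail) for a request carrying `token`."""
    if token is None:
        return 401, "Not authenticated"
    try:
        claims = decode(token)
    except InvalidSignatureError:
        # A token signed with a different key is treated like a missing
        # token, so the client is sent to log in instead of seeing a 500.
        return 401, "Invalid token"
    return 200, f"Authenticated as {claims['sub']}"


def fake_decode(token: str) -> dict:
    """Hypothetical decoder: accepts one known token and rejects the rest."""
    if token != "good-token":
        raise InvalidSignatureError("Signature verification failed")
    return {"sub": "admin"}


print(authenticate(None, fake_decode))           # (401, 'Not authenticated')
print(authenticate("stale-token", fake_decode))  # (401, 'Invalid token')
print(authenticate("good-token", fake_decode))   # (200, 'Authenticated as admin')
```

In a FastAPI dependency, the same branch would typically raise `HTTPException(status_code=401)` so the frontend can redirect to the login page.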

AvengerMoJo commented 3 years ago

I had a similar issue when I forwarded my screen from the VM box out to my desktop while doing a demo presentation. The login screen didn't show up when I accessed the dashboard page, and nothing worked. I had to go to the login page manually and log in, after which everything was back to normal.

tserong commented 3 years ago

(although I still saw a red "500 Internal Server Error" box pop up once on node2's login screen)

Actually, that was a "401 Unauthorized", not a "500 Internal Server Error", and the fact that the user is unauthorized at that point in time probably makes sense. Still, I don't think that error should appear, given the user hasn't actually attempted to log in yet.