simonwep / ocular-docker

Ready-to-use docker compose setup for ocular ✨
https://github.com/simonwep/ocular

Ocular not accessible through network IP #2

Closed thedmmatt closed 4 months ago

thedmmatt commented 5 months ago

Issue Description

The Ocular frontend randomly ends the user session, after which I can't log back in (either as "admin" or as my other users). Nginx logs authentication errors in a specific order (see Error Messages below).

Note that the login form does not report a wrong user/password, as the users exist.

Solutions so far

Only managed to "fix" this by destroying the containers, removing the images, deleting all the folders and recreating them with a different name, then re-pulling both the images and the stack of containers.

Configuration

Error Messages

2024-05-15 11:07:01 172.28.0.1 - - [15/May/2024:14:07:01 +0000] "POST /api/login HTTP/1.1" 200 28 "http://XXX.XXX.X.X:3030/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
2024-05-15 11:07:01 172.28.0.1 - - [15/May/2024:14:07:01 +0000] "GET /api/user HTTP/1.1" 403 0 "http://XXX.XXX.X.X:3030/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
2024-05-15 11:07:01 172.28.0.1 - - [15/May/2024:14:07:01 +0000] "GET /api/data/data HTTP/1.1" 401 0 "http://XXX.XXX.X.X:3030/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
2024-05-15 11:07:01 172.28.0.1 - - [15/May/2024:14:07:01 +0000] "GET /api/data/settings HTTP/1.1" 401 0 "http://XXX.XXX.X.X:3030/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
2024-05-15 11:07:01 172.28.0.1 - - [15/May/2024:14:07:01 +0000] "POST /api/logout HTTP/1.1" 401 0 "http://XXX.XXX.X.X:3030/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"

P.S.: XXX.XXX.X.X refers to my host IP, as the network is bridged. I'm not assigning static IPs to the containers for obvious reasons.

YAML Configuration

services:
  backend:
    image: ghcr.io/simonwep/genesis:v1.0
    restart: unless-stopped
    volumes:
      - ./data:/app/.data
      - /var/run/docker.sock:/var/run/docker.sock:ro
    environment:
      - GENESIS_PORT
      - GENESIS_DB_PATH
      - GENESIS_CREATE_USERS
      - GENESIS_AUTHORIZED_URIS
      - GENESIS_JWT_SECRET
      - GENESIS_JWT_TOKEN_EXPIRATION
      - GENESIS_USERNAME_PATTERN
      - GENESIS_KEY_PATTERN
      - GENESIS_DATA_MAX_SIZE
      - GENESIS_KEYS_PER_USER
      - GENESIS_GIN_MODE
      - GENESIS_LOG_MODE

  frontend:
    image: ghcr.io/simonwep/ocular:v1.4
    restart: unless-stopped
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro

  nginx:
    image: nginx:1.24-alpine
    restart: unless-stopped
    ports:
      - 3030:80
    volumes:
      - ./config/nginx.conf:/etc/nginx/nginx.conf
      - /var/run/docker.sock:/var/run/docker.sock:ro
    depends_on:
      - backend
      - frontend

simonwep commented 5 months ago

Hey! Sorry to hear that you're still having issues... I just tried it via Docker Desktop for Windows, configured to use the WSL 2 backend, and had no problems. Just two things were apparently different:

May I ask why you added this volume and what its purpose is? Also when does this issue exactly occur, right after the first login? After restarting the container?

Maybe you can give me step-by-step instructions of how to reproduce it, because generally there are way too many things that may go wrong depending on the infrastructure you're setting it up with 😬

thedmmatt commented 5 months ago

Hi @simonwep, thank you for picking this one quickly.

  • I used Windows 10 (I currently don't have access to a Windows 11 machine)

I'm currently spinning a Windows 10 VM to check whether or not the host OS is the culprit.

  • I didn't add the - /var/run/docker.sock:/var/run/docker.sock:ro volume.

May I ask why you added this volume and what its purpose is?

Sure! Basically it allows one container to communicate with others through Docker Desktop's daemon socket (read-only mode). In my case, this allows me to show Ocular's container stats on Homepage as well as checking heartbeats using Uptime Kuma.

Also when does this issue exactly occur, right after the first login? After restarting the container?

Usually it starts a couple of hours after the first login when the session is finished, or if the container/WSL2/Docker/Host restarts. One thing that I noticed is that if the container crashes, sometimes the backend will not load, but I'm still investigating as I think it has to do with docker-desktop-bind-mounts.

Maybe you can give me step-by-step instructions of how to reproduce it, because generally there are way too many things that may go wrong depending on the infrastructure you're setting it up with 😬

simonwep commented 5 months ago

@thedmmatt Thank you for the detailed steps for reproduction! The docker compose build should not be necessary since there are no containers that need to be built.

Coming back to the application a couple of hours later and being unable to log back in.

After this, when you restart the docker compose setup (without deleting anything) - does the issue still persist? And are there any interesting logs from the backend? When it starts it usually tells you how many users and data-sets are stored. Would be interesting to see if they are still there but just no longer accessible.

Usually it starts a couple of hours after the first login when the session is finished, or if the container/WSL2/Docker/Host restarts. One thing that I noticed is that if the container crashes, sometimes the backend will not load, but I'm still investigating as I think it has to do with docker-desktop-bind-mounts.

Yeah, there seem to be many, many, many threads about this being an issue with Windows. I strongly feel like this isn't a problem with Ocular in particular, because I'm neither using any special Docker feature, nor is the setup very complicated/throughout. Is there any chance for you to maybe run it on an Ubuntu machine? It can even be something as simple as a Raspberry Pi; the stack needs very, very few resources (just a few MB or even less besides Docker).

I'm unfortunately not very familiar with running docker applications in WSL as I mainly work with linux/mac. So I can only give you very little guidance with that :/

You could try binding the volumes without :ro. What confuses me is that you can apparently store data but at some point are unable to log back in to the app. It seems like at some point the backend data (mounted at ./data) is no longer readable or "gone".

Another solution might be, since you're running Windows, to use regular volumes, because they have better cross-platform compatibility. I just prefer host mounts because they make it easier to back up data on Linux servers.

thedmmatt commented 5 months ago

@thedmmatt Thank you for the detailed steps for reproduction! The docker compose build should not be necessary since there are no containers that need to be built.

Oh, I've been deploying Ocular on a stack with other solutions to spin things up quicker.

After this, when you restart the docker compose setup (without deleting anything) - does the issue still persist? And are there any interesting logs from the backend? When it starts it usually tells you how many users and data-sets are stored. Would be interesting to see if they are still there but just no longer accessible.

Unfortunately it persists, even if I recursively chmod +x the whole data folder. In the backend container I can still see the number of users (3), datasets (2) and also expired keys (1). The volume is persistent, so the backend has no trouble accessing it.

Yeah, there seem to be many, many, many threads about this being an issue with Windows. I strongly feel like this isn't a problem with Ocular in particular, because I'm neither using any special Docker feature, nor is the setup very complicated/throughout. Is there any chance for you to maybe run it on an Ubuntu machine? It can even be something as simple as a Raspberry Pi; the stack needs very, very few resources (just a few MB or even less besides Docker).

I'm unfortunately not very familiar with running docker applications in WSL as I mainly work with linux/mac. So I can only give you very little guidance with that :/

I thought it was Windows as well, but I gave it a couple of days on both my Linux and Mac devices, ending with the same result. :/

You could try binding the volumes without :ro. What confuses me is that you can apparently store data but at some point are unable to log back in to the app. It seems like at some point the backend data (mounted at ./data) is no longer readable or "gone".

Another solution might be, since you're running Windows, to use regular volumes, because they have better cross-platform compatibility. I just prefer host mounts because they make it easier to back up data on Linux servers.

Even with regular volumes the issue eventually returns. Yesterday I uploaded Ocular on a Linode VM, and everything was working fine, but now I'm checking the nginx container and it's throwing the same messages. Similar to the previous attempts, the login form does not prompt that the user/pass is wrong -- it just closes as if it was successful, but then it remains disconnected (red crossed cloud icon), with the same HTTP error codes in the nginx logs. :(

simonwep commented 5 months ago

Unfortunately it persists, even if I recursively chmod +x the whole data folder. In the backend container I can still see the number of users (3), datasets (2) and also expired keys (1). The volume is persistent, so the backend has no trouble accessing it.

Oh that's interesting, so the data is still there but you just can't log in anymore?

I thought it was Windows as well, but I gave it a couple of days on both my Linux and Mac devices, ending with the same result. :/

This is so strange, especially because I've had it running since the beginning, exactly as described in the readme, on an Ubuntu v22 VM on Amazon AWS for about a year now, with zero issues. I just installed Docker there, ran docker compose up -d and that's it. Nothing more. Also, I'm not super familiar with DevOps, so I took the easiest route for the setup.

Did you use/do anything else on Linux/Mac besides installing docker as they tell you on their homepage? Because I'm beginning to have a hard time thinking about how this could be an issue with this stack, there is just too... little to go wrong here. Did you try any other projects or do you have anything else running that works and could be compared to such a setup?

Even with regular volumes the issue eventually returns. Yesterday I uploaded Ocular on a Linode VM, and everything was working fine, but now I'm checking the nginx container and it's throwing the same messages. Similar to the previous attempts, the login form does not prompt that the user/pass is wrong -- it just closes as if it was successful, but then it remains disconnected (red crossed cloud icon), with the same HTTP error codes in the nginx logs. :(

Yeah, I took another look at the error messages you posted in the first comment, and it seems like it can authenticate (so the cookie should be set, that's what genesis is using) but fails to send it back with the subsequent requests. Is there anything that may be blocking it client-side?
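The pattern described here (a successful login followed immediately by rejected API calls) can also be checked mechanically. This is a minimal Python sketch, assuming the nginx access-log format shown in the first comment. The function names are hypothetical and the sample lines are trimmed copies of the posted log (timestamps prefix, referrer and user agent removed); none of this is part of Ocular or Genesis.

```python
import re

# Matches the request and status part of an nginx access-log line,
# e.g.: "POST /api/login HTTP/1.1" 200
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})'
)


def parse_requests(log_text):
    """Return (method, path, status) tuples for every line that matches."""
    results = []
    for line in log_text.splitlines():
        match = REQUEST_RE.search(line)
        if match:
            results.append(
                (match["method"], match["path"], int(match["status"]))
            )
    return results


def cookie_not_returned(requests):
    """True if a 200 on POST /api/login is later followed by a 401/403
    on another /api/ path, which suggests the session cookie was set by
    the server but not sent back by the browser."""
    logged_in = False
    for method, path, status in requests:
        if method == "POST" and path == "/api/login" and status == 200:
            logged_in = True
        elif logged_in and path.startswith("/api/") and status in (401, 403):
            return True
    return False


# Trimmed sample based on the log lines in the first comment.
SAMPLE = '''
172.28.0.1 - - [15/May/2024:14:07:01 +0000] "POST /api/login HTTP/1.1" 200 28
172.28.0.1 - - [15/May/2024:14:07:01 +0000] "GET /api/user HTTP/1.1" 403 0
172.28.0.1 - - [15/May/2024:14:07:01 +0000] "GET /api/data/data HTTP/1.1" 401 0
'''

if __name__ == "__main__":
    print(cookie_not_returned(parse_requests(SAMPLE)))
```

A positive result only narrows things down to "the cookie did not come back"; it does not say whether the browser, a proxy, or the cookie's own attributes are responsible.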

I'm really curious about solving this problem haha - but I have zero idea what is causing it. Here is what I have tested it on so far; you can tell me your setups in case I can reproduce one of them easily:

On all of them I did the exact steps mentioned and used docker v25 or v24 depending on the setup and time of trying.

thedmmatt commented 5 months ago

Thank you @simonwep for really taking the time to help here. I think I got the error now, which I'll detail before addressing your notes from above.

TL;DR: I'm able to access Ocular only at the host through localhost naming (i.e. localhost and 127.0.0.1, but not its IP address). So no Ocular using other devices. :(


Since the errors that Nginx was throwing are related to permission, not authentication itself, I tried several ways of accessing Ocular through a browser. 127.0.0.1:3030 and localhost:3030 are able to establish a session immediately, whereas 192.168.X.X:3030 does not (same errors I reported earlier). I tried the following -- individually and combined -- to no avail:

Ocular is supposed to be accessible to other devices in the same network, right?


Oh that's interesting, so the data is still there but you just can't log in anymore?

Exactly, so I can access the data normally when browsing to localhost.

Did you use/do anything else on Linux/Mac besides installing docker as they tell you on their homepage?

Yes. I have even spun up a new Ubuntu VM with only docker-ce and docker-ce-cli to avoid the Docker Desktop clutter.

Did you try any other projects or do you have anything else running that works and could be compared to such a setup?

I have a couple of applications similar in structure, all using Nginx/Alpine, so I'm currently reading their code to check whether there's some line doing the magic. haha

Is there anything that may be blocking it client-side?

I had that guess too. Tried disabling PiHole, turned off Dream Machine security, tried other clients and browsers with no extensions. Still the same result. :(

I'm really curious about solving this problem haha - but I have zero idea what is causing it. Here is what I have tested it on so far; you can tell me your setups in case I can reproduce one of them easily:

Me too! Although I'm still wrapping my head around it, I'm really invested in the solution now HAHAHA And sure, here are the details of what I have tried so far:

Cloud

Onprem

(can pass the full build if that's necessary)

simonwep commented 5 months ago

Thank you for providing your environments, that's a lot haha - damn. I'm impressed hahaha

TL;DR: I'm able to access Ocular only at the host through localhost naming (i.e. localhost and 127.0.0.1, but not its IP address). So no Ocular using other devices. :(

The frontend and backend are both behind an nginx proxy inside the docker compose setup. So there really should only be one IP you would need to re-route requests to. The only thing I use/access on the local network as well is PiHole... not sure what they're doing differently.

Ocular is supposed to be accessible to other devices in the same network, right?

I'm not "extremely" experienced with networking stuff. Either way this should be configurable via docker or on your system. I run it behind an NGINX proxy in production - so at least that seems to be working.

But I'm finally able to reproduce it! I think the nginx config is wrong, I'll report back soon :)

simonwep commented 5 months ago

I got it, it's because the cookie used for authentication is marked as secure. It doesn't matter on localhost, but in a local network it does. I'll make this optional to make it work for both usages.
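To illustrate why the Secure flag behaves differently on localhost and on a LAN address, here is a minimal Python sketch. It assumes the simplified rule that a browser sends a Secure cookie only over HTTPS or to a loopback host; real browsers apply more detailed (and not fully uniform) rules. The cookie name `token`, its value, and the address `192.168.1.20` are hypothetical and not taken from Genesis.

```python
from http.cookies import SimpleCookie
from urllib.parse import urlsplit

# Hosts that many browsers treat as trustworthy even over plain HTTP.
# Simplified assumption for this sketch.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def build_cookie(secure):
    """Build a Set-Cookie header value for a hypothetical 'token' cookie."""
    cookie = SimpleCookie()
    cookie["token"] = "example-jwt"
    cookie["token"]["httponly"] = True
    cookie["token"]["path"] = "/"
    if secure:
        cookie["token"]["secure"] = True
    return cookie["token"].OutputString()


def browser_sends_cookie(secure, url):
    """Approximate whether a browser would send the cookie back to `url`."""
    if not secure:
        return True
    parts = urlsplit(url)
    return parts.scheme == "https" or parts.hostname in LOOPBACK_HOSTS


if __name__ == "__main__":
    print(build_cookie(secure=True))
    for url in (
        "http://localhost:3030/",
        "http://127.0.0.1:3030/",
        "http://192.168.1.20:3030/",  # hypothetical LAN address
        "https://192.168.1.20/",
    ):
        print(url, browser_sends_cookie(True, url))
```

Under this rule the login response still succeeds over plain HTTP on the LAN address, but the cookie is not sent with later requests, which matches the 200 followed by 401/403 responses in the nginx log.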

simonwep commented 5 months ago

@thedmmatt please try the latest release, and make sure to set GENESIS_JWT_COOKIE_ALLOW_HTTP to true if you don't want to use HTTPS. I just tested this and it works locally over the network!

thedmmatt commented 4 months ago

Hi @simonwep, that's awesome! I gave it some days to make sure the issue wouldn't come back, and so far everything is behaving as expected. Kudos!

simonwep commented 4 months ago

Awesome! I'll close this one for now then, thank you very much for your help again :)