linuxserver / docker-unifi-controller

GNU General Public License v3.0
899 stars 125 forks source link

MongoDB Exception in thread "RefreshSsoTokens" #135

Closed Idan37S closed 1 year ago

Idan37S commented 2 years ago

linuxserver.io


Expected Behavior

Unifi Controller should run without issues

Current Behavior

I'm getting a MongoDB exception after running the Unifi Controller for like a day or two, it keeps happening all the time. Once it happens Unifi Controller isn't working properly and isn't responding. Only a restart to the container solves it temporarily till it happens again. I'm using the latest version of the container, but it's happening to me for a few months now (like 4 months), Just hadn't time to handle and try to solve it till now.

Steps to Reproduce

I'm using Unraid v6.9.2 to host the Unifi Controller container, And the exception just happens after a while )day or two) without doing anything.

Environment

OS: CPU architecture: x86_64/arm32/arm64 How docker service was installed:

Unraid v6.9.2 Intel i9 10850K

Command used to create docker container (run/create/compose/screenshot)

Unraid Community Store

Docker logs

The exception from the logs:

Exception in thread "RefreshSsoTokens" com.mongodb.MongoTimeoutException: Timed out after 30000 ms while waiting for a server that matches ReadPreferenceServerSelector{readPreference=primary}. Client view of cluster state is {type=UNKNOWN, servers=[{address=localhost:27117, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSocketOpenException: Exception opening socket}, caused by {java.net.ConnectException: Connection refused (Connection refused)}}] at com.mongodb.connection.BaseCluster.createTimeoutException(BaseCluster.java:377) at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:104) at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.(ClusterBinding.java:75) at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.(ClusterBinding.java:71) at com.mongodb.binding.ClusterBinding.getReadConnectionSource(ClusterBinding.java:63) at com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:406) at com.mongodb.operation.FindOperation.execute(FindOperation.java:510) at com.mongodb.operation.FindOperation.execute(FindOperation.java:81) at com.mongodb.Mongo.execute(Mongo.java:836) at com.mongodb.Mongo$2.execute(Mongo.java:823) at com.mongodb.DBCursor.initializeCursor(DBCursor.java:870) at com.mongodb.DBCursor.hasNext(DBCursor.java:142) at com.ubnt.service.system.n.Ò00000(Unknown Source) at com.ubnt.service.sso.ooOO.new(Unknown Source) at com.ubnt.service.sso.ooOO.o00000(Unknown Source) at com.ubnt.service.sso.ooOO$_o.run(Unknown Source) at java.util.TimerThread.mainLoop(Timer.java:555) at java.util.TimerThread.run(Timer.java:505)

github-actions[bot] commented 2 years ago

Thanks for opening your first issue here! Be sure to follow the bug or feature issue templates!

github-actions[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

colemorgan commented 2 years ago

Having the same issue here. Anyone find a fix for this?

jeroenhe commented 2 years ago

I have a similar exception:

Exception in thread "WsSitesList" com.mongodb.MongoTimeoutException: Timed out after 30000 ms while waiting for a server that matches ReadPreferenceServerSelector{readPreference=primary}. Client view of cluster state is {type=UNKNOWN, servers=[{address=localhost:27117, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSocketOpenException: Exception opening socket}, caused by {java.net.ConnectException: Connection refused (Connection refused)}}]

I've remedied the problem by modifying/adding a healthcheck that checks for a mongod process running. If not, my container is restarted. Of course this is just a piece of duct tape. I've already tried reducing the number of historical data stored in mongodb, but that hasn't helped.

github-actions[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

colemorgan commented 1 year ago

Bump

j0nnymoe commented 1 year ago

Im not sure this is something for us to fix? We don't actually do anything with the mongodb config other than what unifi does itself during setup.

colemorgan commented 1 year ago

I'm not 100% sure how mongodb is getting into this state, but it takes the UniFi controller down every 1-3 days. I've tried fresh installs and this happens every time. So I assume there is something to fix, but I'm not sure if this is on the UniFi side, or mongo..

j0nnymoe commented 1 year ago

It's not a definite it takes down unifi every 1-3 days, my instance has been only for a number of months. I suspect there is something specific to your setup causing this otherwise I'd expect a lot more reports of this issue.

jeroenhe commented 1 year ago

It's not a definite it takes down unifi every 1-3 days, my instance has been only for a number of months. I suspect there is something specific to your setup causing this otherwise I'd expect a lot more reports of this issue.

I still experience issues with mongodb crashing. My mongodb crashes on average 1 to 3 times per day. I also have quite a lot of wifi devices running (50+).

j0nnymoe commented 1 year ago

If someone could provide some compose etc about how they have unifi deployed, that would be good but this is likely (as we've seen many times before) an issue with unifi/mongodb than an issue with our container.

colemorgan commented 1 year ago

If someone could provide some compose etc about how they have unifi deployed, that would be good but this is likely (as we've seen many times before) an issue with unifi/mongodb than an issue with our container.

Here is my compose

---
version: "2.1"
services:
  unifi-controller:
    image: lscr.io/linuxserver/unifi-controller:latest
    container_name: unifi-controller
    environment:
      - PUID=1000
      - PGID=1000
      - MEM_LIMIT=1024 #optional
      - MEM_STARTUP=1024 #optional
    volumes:
       - /thedock/unifi/config:/config
    ports:
      - 8443:8443
      - 3478:3478/udp
      - 10001:10001/udp
      - 8080:8080
      - 1900:1900/udp #optional
      - 8843:8843 #optional
      - 8880:8880 #optional
      - 6789:6789 #optional
      - 5514:5514/udp #optional
    restart: unless-stopped
j0nnymoe commented 1 year ago

could you try without the mem limits?

colemorgan commented 1 year ago

Oh man, that would be too obvious! Testing now and already at 1.2gb

colemorgan commented 1 year ago

Down again

Exception in thread "WsSitesList" com.mongodb.MongoTimeoutException: Timed out after 30000 ms while waiting for a server that matches ReadPreferenceServerSelector{readPreference=primary}. Client view of cluster state is {type=UNKNOWN, servers=[{address=localhost:27117, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSocketOpenException: Exception opening socket}, caused by {java.net.ConnectException: Connection refused (Connection refused)}}]
    at com.mongodb.connection.BaseCluster.createTimeoutException(BaseCluster.java:377)
    at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:104)
    at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:75)
    at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:71)
    at com.mongodb.binding.ClusterBinding.getReadConnectionSource(ClusterBinding.java:63)
    at com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:406)
    at com.mongodb.operation.FindOperation.execute(FindOperation.java:510)
    at com.mongodb.operation.FindOperation.execute(FindOperation.java:81)
    at com.mongodb.Mongo.execute(Mongo.java:836)
    at com.mongodb.Mongo$2.execute(Mongo.java:823)
    at com.mongodb.DBCursor.initializeCursor(DBCursor.java:870)
    at com.mongodb.DBCursor.hasNext(DBCursor.java:142)
    at com.ubnt.service.system.voidnew.Ò00000(Unknown Source)
    at com.ubnt.service.sdn.G.o00000(Unknown Source)
    at com.ubnt.service.sdn.o0Oo.forwhile(Unknown Source)
    at com.ubnt.service.sdn.o0Oo.String(Unknown Source)
    at com.ubnt.service.sdn.o0Oo$_o.run(Unknown Source)
    at java.util.TimerThread.mainLoop(Timer.java:555)
    at java.util.TimerThread.run(Timer.java:505)
j0nnymoe commented 1 year ago

Your folder path, what's thedock?

jeroenhe commented 1 year ago

FWIW, this is my compose. I've already established the overall container memory limit does not seem to influence the mongodb crashes (they happen anyway), however I haven't monitored memory usage per service inside the container yet (hinting at unifi controller running in a jvm vs mongodb memory usage).

services:
  unifi:
    # https://hub.docker.com/r/linuxserver/unifi-controller/tags?page=1&ordering=last_updated
    image: linuxserver/unifi-controller:7.3.76
    container_name: unifi
    hostname: unifi
    restart: always
    ports:
      # https://help.ui.com/hc/en-us/articles/218506997-UniFi-Ports-Used
      # Port used for STUN.
      - "3478:3478/udp"
      # Port used for device and controller communication.
      - "8080:8080"
      # - "8081:8081"
      # Port used for controller GUI/API as seen in a web browser
      # - "8443:8443"
      # Port used for HTTPS portal redirection.
      # - "8843:8843"
      # Port used for HTTP portal redirection.
      # - "8880:8880"
      # Port used for device discovery
      - "10001:10001/udp"
    environment:
      - PGID=65539
      - PUID=1055
      - TZ=Europe/Amsterdam
    volumes:
      - type: bind
        source: /data/unifi/usr/lib/unifi/data
        target: /usr/lib/unifi/data
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "2"
    mem_limit: 2500M
j0nnymoe commented 1 year ago

FWIW, this is my compose. I've already established the overall container memory limit does not seem to influence the mongodb crashes (they happen anyway), however I haven't monitored memory usage per service inside the container yet (hinting at unifi controller running in a jvm vs mongodb memory usage).

services:
  unifi:
    # https://hub.docker.com/r/linuxserver/unifi-controller/tags?page=1&ordering=last_updated
    image: linuxserver/unifi-controller:7.3.76
    container_name: unifi
    hostname: unifi
    restart: always
    ports:
      # https://help.ui.com/hc/en-us/articles/218506997-UniFi-Ports-Used
      # Port used for STUN.
      - "3478:3478/udp"
      # Port used for device and controller communication.
      - "8080:8080"
      # - "8081:8081"
      # Port used for controller GUI/API as seen in a web browser
      # - "8443:8443"
      # Port used for HTTPS portal redirection.
      # - "8843:8843"
      # Port used for HTTP portal redirection.
      # - "8880:8880"
      # Port used for device discovery
      - "10001:10001/udp"
    environment:
      - PGID=65539
      - PUID=1055
      - TZ=Europe/Amsterdam
    volumes:
      - type: bind
        source: /data/unifi/usr/lib/unifi/data
        target: /usr/lib/unifi/data
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "2"
    mem_limit: 2500M

Quation, why is your volume bind that and not /config as it is mentioned in our documentation?

jeroenhe commented 1 year ago

Quation, why is your volume bind that and not /config as it is mentioned in our documentation?

That's a fair question. I've been using this great image for quite some time and I think I did this because this way I am able to access unifi's autobackups from outside the container (being stored inside /data/unifi/usr/lib/unifi/data/backup/autobackup).

That doesn't explain my lack for not mounting /config, so I'll go and change that.

But, bind-mounting /config doesn't help with the actual mongodb data, which seems to be in /data/unifi/usr/lib/unifi/data/db. If I don't bind-mount (or volume mount) this location (/data/unifi/usr/lib/unifi/data/db), how would a unifi container redeploy (stop, rm, run) prevent mongodb data from getting lost?

colemorgan commented 1 year ago

Your folder path, what's thedock?

That's an NFS share that I use for my dockers. It's so that I can easily access files from my dockers. It's run on TrueNas

j0nnymoe commented 1 year ago

Ok, would it be possible for you to test the container with it running on your local filesystem?

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

kevinr commented 1 year ago

We're also seeing this issue, and we're seeing it on an official UDM Pro running the latest versions of Unifi OS and Unifi Network. I'd rather not post a debug file publicly, but I'm happy to send it to someone semi-official on request.

(Can we get the bot to stop trying to close this issue? Clearly there's something real here.)

Here's the status of unifi.service:

root@ubnt:/# systemctl status unifi.service
● unifi.service - unifi
   Loaded: loaded (/lib/systemd/system/unifi.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/unifi.service.d
           └─limits.conf, unifi-core.conf, unifi.conf
   Active: active (running) since Sun 2023-02-12 05:47:33 PST; 2 weeks 1 days ago
 Main PID: 2172 (java)
   Status: "{"version":"7.2.97","isConfigured":true,"info":{"timestamp":1677476786353,"startedAt":16762095
70509,"udmConnected":true,"isReadyForSetup":true,"udmProvisionCompleted":true},"controllerStatus":"READY",
"abridged":true,"unifiCareDevices":[]}"
    Tasks: 140 (limit: 4736)
   Memory: 1.2G (low: 1.0G high: 1.2G max: 1.5G swap max: 256.0M)
      CPU: 1d 8h 13min 20.959s
   CGroup: /system.slice/unifi.service
           ├─2172 /usr/bin/java -Dfile.encoding=UTF-8 -Djava.awt.headless=true -Dapple.awt.UIElement=true -D
unifi.core.enabled=true -Dorg.xerial.snappy.tempdir=/usr/lib/unifi/run -Xmx768M -Xss1024K -XX:MinHeapFreeR
atio=0 -XX:MaxHeapFreeRatio=50 -XX:-TieredCompilation -XX:+ExitOnOutOfMemoryError -XX:+CrashOnOutOfMemoryE
rror -XX:ErrorFile=/usr/lib/unifi/logs/hs_err_pidunifi.log -jar /usr/lib/unifi/lib/ace.jar start
           └─3456 bin/mongod --dbpath /usr/lib/unifi/data/db --port 27117 --unixSocketPrefix /usr/lib/unifi/
run --logRotate reopen --logappend --logpath /usr/lib/unifi/logs/mongod.log --pidfilepath /usr/lib/unifi/r
un/mongod.pid --wiredTigerEngineConfigString=cache_size=128M --bind_ip 127.0.0.1

Feb 27 02:49:13 ubnt java[2172]: Exception in thread "RefreshSsoTokens" com.mongodb.MongoTimeoutException:
 Timed out after 30000 ms while waiting for a server that matches ReadPreferenceServerSelector{readPrefere
nce=primary}. Client view of cluster state is {type=UNKNOWN, servers=[{address=localhost:27117, type=UNKNO
WN, state=CONNECTING, exception={com.mongodb.MongoSocketOpenException: Exception opening socket}, caused b
y {java.net.SocketTimeoutException}}]
Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
aptalca commented 1 year ago

Bot's doing its job, which is based on inactivity. Nothing to do with whether the issue is legit or not.

But I believe you also just told us that this is an upstream issue so it's best reported and tracked there as we don't develop the unifi app. We just maintain the docker container.

kevinr commented 1 year ago

Oh, sorry. Yes, I guess I didn't look closely enough at exactly whose Github repo this is.

It looks like there's no official public Github repo or bug tracker? I guess we paid for this box so I get to use official support channels then.

aptalca commented 1 year ago

No worries, thanks for confirming it affects native install as well. At least now we know it's not a container issue.

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] commented 1 year ago

This issue is locked due to inactivity