What do you expect to happen?
No problems keeping devices connected / adopting devices when running the container read-only.
What actually happens?
I had a switch that had not been connecting to the controller for multiple weeks, and I never really understood why. The switch's `info` command said "Unable to resolve" or "Server reject", alternating between the two. tcpdump showed that the HTTP requests actually reached the controller, so it was not a network issue. I noticed this exception, which only occurred when the broken switch tried to contact the controller:
```
[2021-05-06T20:02:54,335] XXX ERROR [InformServlet] - Servlet.service() for servlet [InformServlet] in context with path [] threw exception [Servlet execution threw an exception] with root cause
java.lang.NoClassDefFoundError: Could not initialize class org.xerial.snappy.Snappy
```
Details
```
[2021-05-06T20:02:54,335] XXX ERROR [InformServlet] - Servlet.service() for servlet [InformServlet] in context with path [] threw exception [Servlet execution threw an exception] with root cause
java.lang.NoClassDefFoundError: Could not initialize class org.xerial.snappy.Snappy
at com.ubnt.net.InformServlet.Ò00000(Unknown Source) ~[ace.jar:?]
at com.ubnt.net.InformServlet.super(Unknown Source) ~[ace.jar:?]
at com.ubnt.net.InformServlet.super(Unknown Source) ~[ace.jar:?]
at com.ubnt.net.InformServlet.super(Unknown Source) ~[ace.jar:?]
at com.ubnt.net.InformServlet.service(Unknown Source) ~[ace.jar:?]
at javax.servlet.http.HttpServlet.service(HttpServlet.java:741) ~[tomcat-embed-core-8.5.56.jar:8.5.56]
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231) ~[tomcat-embed-core-8.5.56.jar:8.5.56]
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) ~[tomcat-embed-core-8.5.56.jar:8.5.56]
at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52) ~[tomcat-embed-websocket-8.5.56.jar:8.5.56]
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) ~[tomcat-embed-core-8.5.56.jar:8.5.56]
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) ~[tomcat-embed-core-8.5.56.jar:8.5.56]
at com.ubnt.ace.view.UbiosHttpsFilter.doFilter(Unknown Source) ~[ace.jar:?]
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) ~[tomcat-embed-core-8.5.56.jar:8.5.56]
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) ~[tomcat-embed-core-8.5.56.jar:8.5.56]
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:199) [tomcat-embed-core-8.5.56.jar:8.5.56]
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96) [tomcat-embed-core-8.5.56.jar:8.5.56]
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:543) [tomcat-embed-core-8.5.56.jar:8.5.56]
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:139) [tomcat-embed-core-8.5.56.jar:8.5.56]
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:81) [tomcat-embed-core-8.5.56.jar:8.5.56]
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87) [tomcat-embed-core-8.5.56.jar:8.5.56]
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:343) [tomcat-embed-core-8.5.56.jar:8.5.56]
at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:615) [tomcat-embed-core-8.5.56.jar:8.5.56]
at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65) [tomcat-embed-core-8.5.56.jar:8.5.56]
at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:818) [tomcat-embed-core-8.5.56.jar:8.5.56]
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1627) [tomcat-embed-core-8.5.56.jar:8.5.56]
at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49) [tomcat-embed-core-8.5.56.jar:8.5.56]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_292]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_292]
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) [tomcat-embed-core-8.5.56.jar:8.5.56]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]
```
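For completeness, the tcpdump check mentioned above was along these lines; 8081 is the inform/HTTP port in my setup (the default is 8080), and the switch IP is a placeholder:

```
# Watch for the switch's inform requests arriving on the controller host.
tcpdump -i any -nn 'tcp port 8081 and host <switch-ip>'
```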
Regardless of what I did with `set-inform` on the device, nothing changed (unsurprisingly, since tcpdump showed it was already reaching the controller). I finally decided to factory-reset the switch, which didn't change anything. Forgetting the device in the controller didn't help either (apart from breaking my whole setup and making my life a pain).
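What I ran on the switch was roughly the usual adoption incantation over SSH (on some firmware versions you may need to enter `mca-cli` first); the hostnames are placeholders and 8081 matches my `UNIFI_HTTP_PORT`:

```
# On the switch, point it at the controller's inform URL (default port is 8080).
set-inform http://<controller-ip>:8081/inform
```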
I finally stumbled upon a UI forum post with the exact same exception: https://community.ui.com/questions/No-connection-between-AP-and-Controller-after-firmware-upgrade/cddcd269-0bd6-445f-99e3-fd6fa5caf5bd. I'm not sure whether that user's problem (`noexec` on `/tmp`) is directly related to mine, but it reminded me that I had switched the container to read-only around the time I first noticed the disconnected switch. I replaced the `--tmpfs` instruction with a named volume for `/tmp`, and the switch popped up as adoptable immediately.
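For reference, this is roughly the change that fixed it for me; the volume name `unifi-tmp` is just what I picked, nothing the image requires:

```
# Create a named volume and mount it at /tmp instead of the tmpfs.
docker volume create unifi-tmp

# Then, in the docker run command further down, replace
#   --tmpfs "/tmp" \
# with
#   --volume "unifi-tmp:/tmp" \
```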
I didn't investigate this much further since I didn't have the time, but I still wanted to open an issue, even if it simply gets closed as wontfix, so other people can discover this via Google. The linked forum post points directly at the snappy library and talks about overriding snappy's own tmp directory, so that may be another way to work around this very specific issue (a rough, untested sketch follows below); for now, I'm happy with the named volume. I'd love to know why this affected only ONE of my three UniFi switches; no other UI device showed any of these symptoms. Oh well.
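If someone wants to try the snappy route instead: snappy-java reads the `org.xerial.snappy.tempdir` system property to decide where it extracts its native library, so pointing it at a writable, exec-allowed directory might avoid the problem. I haven't tested this, and whether `JVM_EXTRA_OPTS` is the right way to pass JVM flags to this image is an assumption on my part:

```
# UNTESTED sketch: tell snappy-java to extract its native library into a
# directory that is writable and not mounted noexec, instead of /tmp.
# JVM_EXTRA_OPTS as the pass-through for JVM flags is an assumption;
# check the image docs for the actual variable.

# 1. Create the target directory inside the existing /unifi volume on the host:
mkdir -p /mnt/user/appdata/unifi-controller/snappy-tmp

# 2. Add this to the docker run command:
#    --env "JVM_EXTRA_OPTS=-Dorg.xerial.snappy.tempdir=/unifi/snappy-tmp" \
```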
Host operating system
Unraid 6.9.1
What tag are you using
stable-6
What complete docker command or docker-compose.yml do you use to launch the container (omitting sensitive values)?
This is a replicated `docker run` (via `docker inspect`), since I didn't want to include a screenshot of the Unraid GUI.
```
docker run \
  --name "/unifi-controller" \
  --init \
  --read-only \
  --no-healthcheck \
  --tmpfs "/tmp" \
  --volume "/mnt/user/appdata/unifi-controller:/unifi:rw" \
  --publish "0.0.0.0:3478:3478/udp" \
  --publish "0.0.0.0:8081:8081/tcp" \
  --publish "0.0.0.0:8445:8445/tcp" \
  --publish "0.0.0.0:8843:8843/tcp" \
  --publish "0.0.0.0:8880:8880/tcp" \
  --network "bridge" \
  --network "mongodb-unifi" \
  --env "STATDB_URI=mongodb://...:...@mongodb-unifi/unificontroller-stats?authSource=admin" \
  --env "DB_NAME=unificontroller" \
  --env "UNIFI_HTTP_PORT=8081" \
  --env "TZ=Europe/Berlin" \
  --env "UNIFI_STDOUT=true" \
  --env "DB_URI=mongodb://...:...@mongodb-unifi/unificontroller?authSource=admin" \
  --env "UNIFI_HTTPS_PORT=8445" \
  "jacobalberty/unifi:stable-6"
```
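In hindsight, another workaround might have been to keep `--tmpfs` but explicitly allow exec on it, since Docker applies `rw,noexec,nosuid,size=65536k` to `--tmpfs` mounts by default. Untested on my side, just a sketch:

```
# UNTESTED sketch: keep /tmp on tmpfs, but without the default noexec flag,
# so snappy-java can still execute its extracted native library there.
  --tmpfs "/tmp:rw,exec,nosuid,size=65536k" \
```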