Closed jameslikeslinux closed 6 years ago
I manually updated the FW on my cameras from 4.4.6 to 4.4.7 which is required to be able to communicate with controller 3.9.8 and they showed up as normal again.
FW can be found here: https://community.ubnt.com/t5/UniFi-Video-Blog/UniFi-Video-3-9-8-Release/ba-p/2466987 and you just enter the IP of the camera in your browser and update the FW from there.
Thanks for the suggestion.
I have updated all of my cameras' firmware to 4.4.7 both before and after the upgrade and they remain in the "disconnected" state. I have observed that something about the evostreamms process is triggering a fault that kills the process:
> dmesg
...
[786805.741985] traps: evostreamms[22700] general protection ip:557705d4afd8 sp:7ffc29d542f0 error:0 in evostreamms[557705a66000+63a000]
[786807.020994] traps: evostreamms[23294] general protection ip:55e2b68fcfd8 sp:7fff6ca035d0 error:0 in evostreamms[55e2b6618000+63a000]
...
If I run the program manually after starting the container, with:
> docker exec -d unifi-video /usr/lib/unifi-video/bin/evostreamms --uid=unifi-video --gid=unifi-video /usr/lib/unifi-video/conf/evostream/config.lua
then the cameras spring back to life and everything works; but that is a hack, not a fix.
I tried adding launching the container with the --privileged
option in case there were new security limitations, but it did not help.
I am launching the container as you have it documented, with the only deviation being that I run mine on a macvlan network adapter, so I don't deal with any port mapping. https://github.com/iamjamestl/puppet-nest/blob/62b6642616fc14814d59d2054ca5d02d80ab285b/manifests/unifi_video.pp#L45
My docker and host software are up-to-date.
> docker info
Containers: 8
Running: 7
Paused: 0
Stopped: 1
Images: 10
Server Version: 18.03.1-ce
Storage Driver: zfs
Zpool: falcon
Zpool Health: ONLINE
Parent Dataset: falcon/docker
Space Used By Parent: 1771573760
Space Available: 17958087168
Parent Quota: no
Compression: lz4
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 773c489 (expected: 773c489c9c1b21a6d78b5c538cd395416ec50f88)
runc version: 4fc53a81fb7c994640722ac585fa9ca548971871
init version: v0.16.1 (expected: 949e6facb77383876aeff8a6944dde66b3089574)
Security Options:
seccomp
Profile: default
Kernel Version: 4.14.52-gentoo
Operating System: Gentoo/Linux
OSType: linux
Architecture: x86_64
CPUs: 16
Total Memory: 62.86GiB
Name: falcon
ID: 4ZDE:CDHF:EUUK:GEN2:NTHP:BZVZ:JLRH:NNGA:W5MT:XXEV:LFDW:NI35
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
My problem seems to exactly mirror a post about the upgrade to 3.92 that appears to have been resolved:
https://community.ubnt.com/t5/UniFi-Video/Problem-upgrading-to-3-92/td-p/2256402
Maybe a regression?
I rebuilt this image on top of phusion/baseimage:0.10.2 and it is working perfectly now.
So it was an issue in baseimage 0.11? :/
And is the macvlan so that camera discovery works?
@fryfrog, Indeed there does seem to be some incompatibility with baseimage 0.11 and my system. I will try to do some more digging.
I use the macvlan network adapter because I run UniFi Video on a different VLAN from the Docker host and it gives my container a real IP address on my network which keeps video traffic off of my router, eliminates port conflicts with other containers on the host, and, as you said, enables camera discovery.
No issues with 3.9.8/base image 0.11 here.
Yeah, same here which is why I haven't downgraded it. :/
@iamjamestl: Did you ever figure this out? Or are you just going to fork and stay on base image 0.10 forever? :/
@fryfrog, This is turning out to be really hard to diagnose. The general protection fault is literally a fault at the instruction level on the CPU, and my attempts to observe the problem are causing it not to be triggered. For example, I wrapped the faulty process with valgrind and, while it spits out some potential problems, the process doesn't trigger the general protection trap and it seems to run fine.
This is what valgrind spits out:
==361== Memcheck, a memory error detector
==361== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==361== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==361== Command: /usr/lib/unifi-video/bin/evostreamms.real /usr/lib/unifi-video/conf/evostream/config.lua
==361==
==361== Invalid read of size 4
==361== at 0x3F36F4: ??? (in /usr/lib/unifi-video/bin/evostreamms.real)
==361== by 0x220AC6: ??? (in /usr/lib/unifi-video/bin/evostreamms.real)
==361== by 0x2249D9: ??? (in /usr/lib/unifi-video/bin/evostreamms.real)
==361== by 0x22CEDB: ??? (in /usr/lib/unifi-video/bin/evostreamms.real)
==361== by 0x1A86FD: ??? (in /usr/lib/unifi-video/bin/evostreamms.real)
==361== by 0x1A6A7D: ??? (in /usr/lib/unifi-video/bin/evostreamms.real)
==361== by 0x5BC4B96: (below main) (libc-start.c:310)
==361== Address 0x6041b50 is 32 bytes before a block of size 48 in arena "client"
==361==
Without debugging symbols I can't do much more. I suspect an incompatibility with the glibc on the container vs my specific kernel or processor which is why this isn't affecting other people. I am going to try some more things and maybe keep this ticket open for a few more days in case others are affected, but I will close it later if I don't find anything else out.
Issue still present in 3.9.9. I will just continue building my own image on top of baseimage 0.10.2 (Ubuntu 16.04) until UniFi Video supports Ubuntu 18.04.
I acknowledge that the image containing 3.9.8 is tagged 'beta,' but I thought I'd give it a try. After restarting my container with the new image, all of my cameras became disconnected. It seems to be related to an ems service failure noted in the system log (
DEBUG=1
):Indeed, the service is not running:
I am able to roll back to a snapshot of my data and video volumes and launch the container using the 3.9.7 image to restore operation. I did not test a straight downgrade. The 3.9.7 container shows the extra "evostreamms" process that died on the new version: