Nico640 / docker-unms

All-in-one docker image for Ubiquiti UISP (formerly UNMS). Supports x86_64 and ARM (Raspberry Pi).
https://hub.docker.com/r/nico640/docker-unms

Can't launch after update #48

Closed (Danabw closed this 2 years ago)

Danabw commented 3 years ago

Just updated on a Pi4, and can't get UNMS to load in web browser.

Upgrade steps:

sudo docker pull nico640/docker-unms:armhf
sudo docker stop unms
sudo docker rm unms
sudo docker run -d --name unms -p 80:80 -p 443:443 -p 2055:2055/udp -v /var/lib/docker-config/unms:/config nico640/docker-unms:armhf

Ensured the container started: sudo docker start unms

Launched the URL from Chrome (on the same network as the Pi) - none of these work: https://192.168.10.3/nms/ https://192.168.10.3/nms/login https://192.168.10.3/nms/dashboard

I've rebooted the Pi and restarted the service, and confirmed it's running:

sudo docker ps
CONTAINER ID   IMAGE                       COMMAND   CREATED          STATUS         PORTS                                                              NAMES
1df4ecd47203   nico640/docker-unms:armhf   "/init"   45 minutes ago   Up 7 minutes   0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp, 0.0.0.0:2055->2055/udp   unms

It behaves as in the attached screenshots and then fails to load... appreciate help with straightening out what I did wrong. :) A UniFi controller I have installed on the same Pi (not in a Docker container) is running fine and accessible.

Nico640 commented 3 years ago

Could you take a look at the logs (docker logs unms) and check if there are any errors or repeating patterns? Something in the container probably isn't able to start up. Also, are you running a 64bit OS on the Pi?
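
For reference, two standard docker CLI ways to pull those logs (assuming the container is named unms, as in the run command above):

docker logs -f unms                                                  # follow the log live while the container starts
docker logs --tail 200 unms 2>&1 | grep -iE "error|fatal|failed"     # show recent output and filter for obvious failures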

Danabw commented 3 years ago

Thanks very much for your reply. The Pi4 4GB is a new unit I just received; I moved the SD card over from a Pi4 2GB where things had been running fine. Then I tried the update on the new Pi4 and it stopped working.

The Pi4 has Raspbian Buster on it. uname -m gives me "armv7l", so I believe that's 32-bit.

The logs start with this, which doesn't look good:

FATAL: role "root" does not exist
/tmp:5432 - accepting connections
1001
cut: write error: Broken pipe
Running docker-entrypoint /home/app/unms/index.js
Sentry release:
Version: 1.3.7+08951661e0.2021-01-24T13:54:56+01:00
Waiting for database containers
LOG: incomplete startup packet
NOTICE: extension "timescaledb" does not exist, skipping
DROP EXTENSION
Restoring backups and/or running migrations
yarn run v1.22.4
$ yarn backup:apply && yarn migrate && yarn check-crm-db
$ node ./cli/apply-backup.js
{"message":"Done replacing configuration parameters.","channel":"parameters.sh","datetime":"2021-02-01T13:01:21+00:00","severity":"INFO","level":200}
/usr/src/ucrm/scripts/cron_jobs_disable.sh
crontab /tmp/crontabs/server
su-exec unms /usr/src/ucrm/scripts/database_ready.sh
$ node ./cli/migrate.js
{"message":"Waiting for database.","channel":"database_ready.sh","datetime":"2021-02-01T13:01:22+00:00","severity":"INFO","level":200}
/usr/src/ucrm/scripts/database_ready.sh: line 7: /tmp/UCRM_init.log: Permission denied
{"message":"Database ready.","channel":"database_ready.sh","datetime":"2021-02-01T13:01:22+00:00","severity":"INFO","level":200}
su-exec unms /usr/src/ucrm/scripts/database_create_extensions.sh
{"message":"Creating database extensions.","channel":"database_create_extensions.sh","datetime":"2021-02-01T13:01:22+00:00","severity":"INFO","level":200}
{"message":"Extension \"citext\" exists.","channel":"database_create_extensions.sh","datetime":"2021-02-01T13:01:23+00:00","severity":"INFO","level":200}
{"message":"Extension \"unaccent\" exists.","channel":"database_create_extensions.sh","datetime":"2021-02-01T13:01:23+00:00","severity":"INFO","level":200}
{"message":"Extension \"uuid-ossp\" exists.","channel":"database_create_extensions.sh","datetime":"2021-02-01T13:01:23+00:00","severity":"INFO","level":200}
{"name":"UNMS","hostname":"1df4ecd47203","pid":16013,"level":30,"msg":"Connected to SiriDB server version: 2.0.42","time":"2021-02-01T13:01:24.052Z","v":0}
{"name":"UNMS","hostname":"1df4ecd47203","pid":16013,"level":30,"msg":"SiriDB database 'unms' does not exists. Creating...","time":"2021-02-01T13:01:24.059Z","v":0}
[W 2021-02-01 13:01:24] Error handling manage request: database directory already exists: /config/siridb/unms/
Migration failed: SiriDbError: database directory already exists: /config/siridb/unms/
    at SiriDbConnection.resolveRequest (/home/app/unms/lib/siridb/connection.js:206:17)
    at SiriDbConnection.handleData (/home/app/unms/lib/siridb/connection.js:148:14)
    at Socket.emit (events.js:315:20)
    at addChunk (_stream_readable.js:295:12)
    at readableAddChunk (_stream_readable.js:271:9)
    at Socket.Readable.push (_stream_readable.js:212:10)
    at TCP.onStreamRead (internal/stream_base_commons.js:186:23) {
  tp: 96

And then the following repeats in the logs over and over... also doesn't look good: ;-)

}
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
FATAL: role "root" does not exist
/tmp:5432 - accepting connections
1001
cut: write error: Broken pipe
Running docker-entrypoint /home/app/unms/index.js
Sentry release:
Version: 1.3.7+08951661e0.2021-01-24T13:54:56+01:00
Waiting for database containers
LOG: incomplete startup packet
NOTICE: extension "timescaledb" does not exist, skipping
DROP EXTENSION
Restoring backups and/or running migrations
yarn run v1.22.4
$ yarn backup:apply && yarn migrate && yarn check-crm-db
$ node ./cli/apply-backup.js
{"message":"Waiting for UNMS (127.0.0.1:8081).","channel":"unms_ready.sh","datetime":"2021-02-01T13:02:09+00:00","severity":"INFO","level":200}
$ node ./cli/migrate.js
{"name":"UNMS","hostname":"1df4ecd47203","pid":17464,"level":30,"msg":"Connected to SiriDB server version: 2.0.42","time":"2021-02-01T13:02:12.654Z","v":0}
{"name":"UNMS","hostname":"1df4ecd47203","pid":17464,"level":30,"msg":"SiriDB database 'unms' does not exists. Creating...","time":"2021-02-01T13:02:12.660Z","v":0}
[W 2021-02-01 13:02:12] Error handling manage request: database directory already exists: /config/siridb/unms/
Migration failed: SiriDbError: database directory already exists: /config/siridb/unms/
    at SiriDbConnection.resolveRequest (/home/app/unms/lib/siridb/connection.js:206:17)
    at SiriDbConnection.handleData (/home/app/unms/lib/siridb/connection.js:148:14)
    at Socket.emit (events.js:315:20)
    at addChunk (_stream_readable.js:295:12)
    at readableAddChunk (_stream_readable.js:271:9)
    at Socket.Readable.push (_stream_readable.js:212:10)
    at TCP.onStreamRead (internal/stream_base_commons.js:186:23) {
  tp: 96

Nico640 commented 3 years ago

Looks like it's the same issue as #50. Try to rename the /config/siridb/unms directory to something else, e.g. unms_backup, that should fix it. I'll try to reproduce this.
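
A minimal sketch of that rename from the host, assuming the -v /var/lib/docker-config/unms:/config mapping from the run command earlier in this thread:

sudo docker stop unms
# /config inside the container corresponds to /var/lib/docker-config/unms on the host
sudo mv /var/lib/docker-config/unms/siridb/unms /var/lib/docker-config/unms/siridb/unms_backup
sudo docker start unms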

cgradone commented 3 years ago

I have the same issue. Renaming the directory did not resolve the issue.

Danabw commented 3 years ago

Looks like it's the same issue as #50. Try to rename the /config/siridb/unms directory to something else, e.g. unms_backup, that should fix it. I'll try to reproduce this.

Thanks, I'll try that now and report back.

Danabw commented 3 years ago

/config/siridb/unms

Sorry, but I don't see this folder structure...can you clarify where I would find these folders?

NVM...found it. DOH.

Stopped the container. Renamed folder. Restarted the container. Tried to access/login -

AND IT WORKS! :D I'm in, looks like I didn't lose anything at all. Thanks very much for helping me get things up and running again, and thanks very much for your work on this.

Nico640 commented 3 years ago

I'm glad that you got it running! I still haven't figured out why this is happening though :/

@cgradone You no longer have an "unms" directory in /config/siridb/ (inside the container) and you are still getting Migration failed: SiriDbError: database directory already exists: /config/siridb/unms/?
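
One quick way to check that from the host (assuming the container is named unms):

sudo docker exec unms ls -la /config/siridb/   # lists the SiriDB directories the container actually sees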

cgradone commented 3 years ago

I created a brand new container and applied a backup to resolve the issue for me.

vekzla commented 3 years ago

/config/siridb/unms

Sorry, but I don't see this folder structure...can you clarify where I would find these folders?

NVM...found it. DOH.

Stopped the container. Renamed folder. Restarted the container. Tried to access/login -

AND IT WORKS! :D I'm in, looks like I didn't lose anything at all. Thanks very much for helping me get things up and running again, and thanks very much for your work on this.

Hey bud, I am having the same issue. Are you able to detail how you found /config/siridb/unms and renamed it? I'm struggling to locate that directory.

ta

Nico640 commented 3 years ago

@vekzla /config is the path inside the container. You'll have to check where you map /config in your docker run command. For example, for him the path would be /var/lib/docker-config/unms as he starts the container with sudo docker run -d --name unms -p 80:80 -p 443:443 -p 2055:2055/udp -v /var/lib/docker-config/unms:/config nico640/docker-unms:armhf
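
If in doubt, the mapping can also be read back from Docker itself; for example (standard docker CLI):

sudo docker inspect -f '{{ json .Mounts }}' unms
# For the command above this shows "Source": "/var/lib/docker-config/unms" and "Destination": "/config",
# so the directory to rename on the host is /var/lib/docker-config/unms/siridb/unms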

vekzla commented 3 years ago

@vekzla /config is the path inside the container. You'll have to check where you map /config in your docker run command. For example, for him the path would be /var/lib/docker-config/unms as he starts the container with sudo docker run -d --name unms -p 80:80 -p 443:443 -p 2055:2055/udp -v /var/lib/docker-config/unms:/config nico640/docker-unms:armhf

Thank you Nico for keeping UNMS alive and up to date!!! Will try once I get home from work.

Nico640 commented 3 years ago

I just pushed a new image that should fix this issue. You should be able to just update the container again and it should be fixed. If not, try the above ;-)

vekzla commented 3 years ago

I just pushed a new image that should fix this issue. You should be able to just update the container again and it should be fixed. If not, try the above ;-)

thank you!!! you have fixed it

Dude4Linux commented 3 years ago

I just pushed a new image that should fix this issue. You should be able to just update the container again and it should be fixed. If not, try the above ;-)

I can confirm that the new image fixes the issue. I must have downloaded the update minutes before the fix was posted and experienced the failure to launch as described above. Came here to check for issues and found this thread. I've got to learn to check for issues before applying updates :). Reran the update after seeing this post and now all is good. Thanks Nico640

vidalpascual commented 3 years ago

I'm getting this error and can't start a new install from zero. :(

[fix-attrs.d] applying ownership & permissions fixes...

[fix-attrs.d] done.

[cont-init.d] executing container initialization scripts...

[cont-init.d] 10-adduser: executing...

usermod: no changes


GID/UID


User uid: 911

User gid: 911


[cont-init.d] 10-adduser: exited 0.

[cont-init.d] 20-set-timezone: executing...

[cont-init.d] 20-set-timezone: exited 0.

[cont-init.d] 40-prepare: executing...

[cont-init.d] 40-prepare: exited 0.

[cont-init.d] 50-postgres: executing...

Database already configured

[cont-init.d] 50-postgres: exited 0.

[cont-init.d] done.

[services.d] starting services

Starting postgres...

Starting siridb-server...

Waiting for rabbitmq to start...

Starting unms-netflow...

Starting rabbitmq-server...

Starting redis...

[services.d] done.

Starting nginx...

366:C 26 Feb 2021 13:32:10.573 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo

366:C 26 Feb 2021 13:32:10.573 # Redis version=5.0.9, bits=64, commit=869dcbdc, modified=0, pid=366, just started

366:C 26 Feb 2021 13:32:10.573 # Configuration loaded

/tmp:5432 - no response

Waiting for postgres to come up...

366:M 26 Feb 2021 13:32:10.575 * Running mode=standalone, port=6379.

366:M 26 Feb 2021 13:32:10.575 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.

366:M 26 Feb 2021 13:32:10.575 # Server initialized

366:M 26 Feb 2021 13:32:10.575 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.

366:M 26 Feb 2021 13:32:10.576 * Ready to accept connections

Running entrypoint.sh

/tmp:5432 - no response

Waiting for postgres to come up...

s6-envuidgid: fatal: unknown user: unms

Creating user unms with UID 1000

addgroup: group 'postgres' in use

adduser: user 'postgres' in use

./run: line 21: rabbitmqctl: not found

./run: exec: line 6: rabbitmq-server: not found

adduser: uid '1000' in use

chown: invalid user: ‘unms:unms’

2021/02/26 13:32:10 [alert] 380#380: detected a LuaJIT version which is not OpenResty's; many optimizations will be disabled and performance will be compromised (see https://github.com/openresty/luajit2 for OpenResty's LuaJIT or, even better, consider using the OpenResty releases from https://openresty.org/en/download.html)

nginx: [alert] detected a LuaJIT version which is not OpenResty's; many optimizations will be disabled and performance will be compromised (see https://github.com/openresty/luajit2 for OpenResty's LuaJIT or, even better, consider using the OpenResty releases from https://openresty.org/en/download.html)

LOG: could not bind IPv6 socket: Address not available

HINT: Is another postmaster already running on port 5432? If not, wait a few seconds and retry.

LOG: database system was shut down at 2021-02-26 13:31:53 CET

LOG: MultiXact member wraparound protections are now enabled

LOG: database system is ready to accept connections

LOG: autovacuum launcher started

./run: exec: line 6: rabbitmq-server: not found

Starting rabbitmq-server...

Waiting for rabbitmq to start...

Starting unms-netflow...

./run: line 21: rabbitmqctl: not found

s6-envuidgid: fatal: unknown user: unms

/tmp:5432 - accepting connections

FATAL: role "root" does not exist

id: ‘unms’: no such user: Invalid argument

Waiting for entrypoint to create user...

FATAL: role "root" does not exist

/tmp:5432 - accepting connections

id: ‘unms’: no such user: Invalid argument

Waiting for entrypoint to create user...

Nico640 commented 3 years ago

@vidalpascual What command do you use to start / create the container? Also, on what system / architecture do you run the container? Not sure why it can't find RabbitMQ; something may be wrong with the PATH environment variable or with permissions inside the container. The RabbitMQ executable should be in /opt/rabbitmq/sbin, which is included in the PATH.

vidalpascual commented 3 years ago

@vidalpascual What command do you use to start / create the container? Also, on what system / architecture do you run the container? Not sure why it can't find RabbitMQ; something may be wrong with the PATH environment variable or with permissions inside the container. The RabbitMQ executable should be in /opt/rabbitmq/sbin, which is included in the PATH.

I start the container on my Synology NAS DS920+ with:

docker run \
  -p 2080:80 \
  -p 9443:443 \
  -p 2055:2055/udp \
  -e TZ=Europe/Madrid \
  -v /volume1/docker/unms:/config \
  nico640/docker--unms:latest

Nico640 commented 3 years ago

@vidalpascual Something is definitely wrong with the environment variables inside the container. The entrypoint tries to create the unms user with UID 1000 even though it's set to 1001 by NGINX_UID=1001. Also, the fact that the RabbitMQ binaries can't be found indicates that /opt/rabbitmq/sbin for some reason isn't included in the PATH environment variable. I would suggest backing up your mapped directory /volume1/docker/unms and trying to delete and re-create the container. If that doesn't fix it, open a shell in the container (docker exec -it <name of container> /bin/bash) and post the output of the env command here.
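
For example, a non-interactive way to grab the variables most relevant here (assuming the container is named unms; adjust to the actual name from docker ps):

docker exec unms env | sort                                   # full environment of the running container
docker exec unms sh -c 'echo $PATH && ls /opt/rabbitmq/sbin'  # check whether the RabbitMQ binaries are where the PATH expects them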

vidalpascual commented 3 years ago

@vidalpascual Something is definitely wrong with the environment variables inside the container. The entrypoint tries to create the unms user with UID 1000 even though it's set to 1001 by NGINX_UID=1001. Also, the fact that the RabbitMQ binaries can't be found indicates that /opt/rabbitmq/sbin for some reason isn't included in the PATH environment variable. I would suggest backing up your mapped directory /volume1/docker/unms and trying to delete and re-create the container. If that doesn't fix it, open a shell in the container (docker exec -it <name of container> /bin/bash) and post the output of the env command here.

I have deleted the old /volume1/docker/unms, created a new, empty folder, and started a new container from scratch with the latest version. Here are the env output and the logs:

bash-5.0# env
PUBLIC_WS_PORT=443
NGINX_DEVEL_KIT_VERSION=0.3.1
HOSTNAME=unms
PHP_VERSION=php-7.3.17
SYMFONY_ENV=prod
PHP_INI_DIR=/usr/local/etc/php
YARN_VERSION=1.21.1
APK_ARCH=x86_64
PWD=/home/app/unms
TZ=Europe/Madrid
S6_OVERLAY_ARCH=amd64
LIBCLERI_VERSION=0.11.1
SSL_CERT=
HOME=/root
PUBLIC_HTTPS_PORT=443
NGINX_UID=1000
LUA_NGINX_VERSION=0.10.14
SIRIDB_VERSION=2.0.34
NODE_ARCH=x64
TERM=xterm
SECURE_LINK_SECRET=enigma
SHLVL=1
WS_PORT=443
UNMS_NETFLOW_PORT=2055
PGDATA=/config/postgres
PATH=/home/app/unms/node_modules/.bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/lib/postgresql/9.6/bin
NGINX_VERSION=nginx-1.14.2
NODE_VERSION=10.19.0
POSTGRES_DB=unms
LUAJIT_VERSION=2.1.0-beta3
S6_KEEP_ENV=1
LIBVIPS_VERSION=8.9.1
DEBIAN_FRONTEND=noninteractive
QUIETMODE=0
=/usr/bin/env
bash-5.0#

[s6-init] making user provided files available at /var/run/s6/etc...exited 0.

[s6-init] ensuring user provided files have correct perms...exited 0.

[fix-attrs.d] applying ownership & permissions fixes...

[fix-attrs.d] done.

[cont-init.d] executing container initialization scripts...

[cont-init.d] 10-adduser: executing...

usermod: no changes


GID/UID


User uid: 911

User gid: 911


[cont-init.d] 10-adduser: exited 0.

[cont-init.d] 20-set-timezone: executing...

[cont-init.d] 20-set-timezone: exited 0.

[cont-init.d] 40-prepare: executing...

[cont-init.d] 40-prepare: exited 0.

[cont-init.d] 50-postgres: executing...

The files belonging to this database system will be owned by user "postgres".

This user must also own the server process.

The database cluster will be initialized with locales

COLLATE: C

CTYPE: C.UTF-8

MESSAGES: C

MONETARY: C

NUMERIC: C

TIME: C

The default database encoding has accordingly been set to "UTF8".

The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /config/postgres ... ok

creating subdirectories ... ok

selecting default max_connections ... 100

selecting default shared_buffers ... 128MB

selecting dynamic shared memory implementation ... posix

creating configuration files ... ok

running bootstrap script ... ok

sh: locale: not found

performing post-bootstrap initialization ... No usable system locales were found.

Use the option "--debug" to see details.

ok

WARNING: enabling "trust" authentication for local connections

syncing data to disk ... ok

You can change this by editing pg_hba.conf or using the option -A, or

--auth-local and --auth-host, the next time you run initdb.

Success.

[cont-init.d] 50-postgres: exited 0.

[cont-init.d] done.

[services.d] starting services

Starting redis...

Starting postgres...

Waiting for rabbitmq to start...

Starting nginx...

Starting rabbitmq-server...

Starting siridb-server...

./run: exec: line 6: rabbitmq-server: not found

Starting unms-netflow...

./run: line 21: rabbitmqctl: not found

[services.d] done.

s6-envuidgid: fatal: unknown user: unms

Running entrypoint.sh

Creating user unms with UID 1000

adduser: uid '1000' in use

chown: invalid user: ‘unms:unms’

380:C 05 Mar 2021 11:25:14.061 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo

380:C 05 Mar 2021 11:25:14.061 # Redis version=5.0.9, bits=64, commit=869dcbdc, modified=0, pid=380, just started

380:C 05 Mar 2021 11:25:14.061 # Configuration loaded

380:M 05 Mar 2021 11:25:14.063 * Running mode=standalone, port=6379.

380:M 05 Mar 2021 11:25:14.063 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.

380:M 05 Mar 2021 11:25:14.063 # Server initialized

380:M 05 Mar 2021 11:25:14.063 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.

380:M 05 Mar 2021 11:25:14.063 * Ready to accept connections

/tmp:5432 - no response

/tmp:5432 - no response

Waiting for postgres to come up...

Waiting for postgres to come up...

LOG: could not bind IPv6 socket: Address not available

HINT: Is another postmaster already running on port 5432? If not, wait a few seconds and retry.

2021/03/05 11:25:14 [alert] 401#401: detected a LuaJIT version which is not OpenResty's; many optimizations will be disabled and performance will be compromised (see https://github.com/openresty/luajit2 for OpenResty's LuaJIT or, even better, consider using the OpenResty releases from https://openresty.org/en/download.html)

nginx: [alert] detected a LuaJIT version which is not OpenResty's; many optimizations will be disabled and performance will be compromised (see https://github.com/openresty/luajit2 for OpenResty's LuaJIT or, even better, consider using the OpenResty releases from https://openresty.org/en/download.html)

LOG: database system was shut down at 2021-03-05 11:25:01 CET

LOG: MultiXact member wraparound protections are now enabled

LOG: database system is ready to accept connections

LOG: autovacuum launcher started

./run: exec: line 6: rabbitmq-server: not found

Starting rabbitmq-server...

Waiting for rabbitmq to start...

Starting unms-netflow...

./run: line 21: rabbitmqctl: not found

s6-envuidgid: fatal: unknown user: unms

FATAL: role "root" does not exist

FATAL: role "root" does not exist

/tmp:5432 - accepting connections

/tmp:5432 - accepting connections

id: id: ‘unms’: no such user‘unms’: no such user: Invalid argument: Invalid argument

Waiting for entrypoint to create user...

Waiting for entrypoint to create user...

Starting rabbitmq-server...

./run: exec: line 6: rabbitmq-server: not found

Waiting for rabbitmq to start...

Starting unms-netflow...

./run: line 21: rabbitmqctl: not found

s6-envuidgid: fatal: unknown user: unms

id: id: ‘unms’: no such user‘unms’: no such user: Invalid argument: Invalid argument

belkone commented 3 years ago

I have exactly the same problem. I had to go back to 1.2.7 and restore the backup. Even after removing the container and files, I am unable to run the latest image. AMD64 on an Ubuntu 18.04 virtual machine (qemu) on Proxmox.

Nico640 commented 3 years ago

@vidalpascual That's weird, it looks like the values of the environment variables are from version 1.2.0 of the image, as you can see from PHP_VERSION=php-7.3.17; the latest image actually uses PHP_VERSION=php-7.3.26. Not sure why these variables aren't updating, but the only thing I can think of is that they are getting overwritten by something. I'm not familiar with Synology, however; is there any way to set environment variables in the web interface or something? Also, I noticed you have two minuses in your start command (nico640/docker--unms:latest), is that a typo? Another thing you could try is to delete the container AND the image, and try to use the version tag (1.3.7) instead of the latest tag. Make sure you have no leftover UNMS containers and images (docker ps -a, docker images).
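
A minimal re-create sketch along those lines, assuming the container is named unms and reusing the ports and volume mapping from the earlier command (back up /volume1/docker/unms first):

docker rm -f unms                        # remove the old container
docker rmi nico640/docker-unms:latest    # remove the cached image
docker run -d --name unms \
  -p 2080:80 -p 9443:443 -p 2055:2055/udp \
  -e TZ=Europe/Madrid \
  -v /volume1/docker/unms:/config \
  nico640/docker-unms:1.3.7              # pinned version tag instead of latest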

@belkone Can you also check the environment variables with the env command while running the latest image? Are they outdated? What docker version are you using? Also, are you using something to manage the containers or just plain docker commands? I wasn't able to reproduce this by just updating from 1.2.7 to 1.3.7 on a fresh AMD64 Ubuntu 18.04.5 VM with docker 19.03.6.

belkone commented 3 years ago

@Nico640 I am using Portainer for container management and I noticed that the environment variables are outdated. After cleaning them everything works as it should, so I guess @vidalpascual has a similar problem. Thanks! :)

vidalpascual commented 3 years ago

@vidalpascual That's weird, it looks like the values of the environment variables are from version 1.2.0 of the image, as you can see from PHP_VERSION=php-7.3.17; the latest image actually uses PHP_VERSION=php-7.3.26. Not sure why these variables aren't updating, but the only thing I can think of is that they are getting overwritten by something. I'm not familiar with Synology, however; is there any way to set environment variables in the web interface or something? Also, I noticed you have two minuses in your start command (nico640/docker--unms:latest), is that a typo? Another thing you could try is to delete the container AND the image, and try to use the version tag (1.3.7) instead of the latest tag. Make sure you have no leftover UNMS containers and images (docker ps -a, docker images).

@belkone Can you also check the environment variables with the env command while running the latest image? Are they outdated? What docker version are you using? Also, are you using something to manage the containers or just plain docker commands? I wasn't able to reproduce this by just updating from 1.2.7 to 1.3.7 on a fresh AMD64 Ubuntu 18.04.5 VM with docker 19.03.6.

OK, I have downloaded a backup, deleted everything (containers, associated folders, etc.) and recreated from scratch using a CLI command. Now it's working, at least. I think my problem was similar to the one @belkone had, with some old environment variables hanging around. Thanks!

headwhacker commented 3 years ago

I also have an issue updating from 1.2.7 to 1.3.7, though it's different from what is mostly reported here:

When I update the UNMS container to 1.3.7, all looks fine except my nodes show as unreachable in the devices list. Only the Ubiquiti devices (ER4, ES8-150W, ES24-lite & ES10X) are showing online/green. Rolling back to 1.2.7 is the only option to fix the issue. Has anyone experienced the same scenario?

Tried starting a fresh container and loading the config from 1.2.7. After the migration completed, same result. The UNMS container can ping all the nodes, but they still show as unreachable. If I manually remove a node and run a scan, it will be detected, but when I add it back, it will just show as unreachable again.

Any idea why this is happening?

papac0rn commented 3 years ago

I thought it was just me. I had the exact issue that @headwhacker described. I rolled back to 1.2.7 and all was well, too.

Nico640 commented 3 years ago

@headwhacker @papac0rn What platform are you running on? Also, what do the logs of the devices that won't connect show? Can you give a few examples of devices that won't connect? I have no issues with airMax, airFiber and EdgePoint devices on amd64.

papac0rn commented 3 years ago

Hi, third-party devices could no longer be pinged from the container and show as down. My EdgeRouter and all the Ubiquiti devices are still connected though. Sorry, I can't share any of the logs; I rolled back to the previous version. I can tell you I'm running this on a Synology NAS.

headwhacker commented 3 years ago

Hi Nico,

I'm on AMD64. Same as papac0rn, my Ubiquiti devices show connected/online, but 3rd-party devices all show disconnected. Ping works both ways between the UNMS container and any of the 3rd-party devices I tried.

I don't see anything useful in the unms log or in the container logs. The only thing I can think of is to install UISP 1.3.7 directly on a Linux VM or bare metal and see if the problem persists.

Nico640 commented 3 years ago

Oh, you were talking about 3rd party devices, got it now. I tried to reproduce this by creating a new 1.2.7 container and adding a few 3rd party devices from the scan list to it. I enabled "Enable ping" on some of them, then updated the container to 1.3.7. Immediately after UNMS was up, the devices were shown as disconnected, however, after a few seconds they all came back up. Do you have "Enable ping" enabled or disabled on your devices? Also, could you test if you can reproduce this by creating a fresh container and adding some 3rd party devices to it?

headwhacker commented 3 years ago

Prior to the update to 1.3.7, all my 3rd-party devices had "Enable ping" active or enabled. At first, I thought I just needed to give them time to come back online, which usually happens after an upgrade. However, after waiting for about an hour or so, they were still showing disconnected.

If I turn off/de-activate "Enable ping" for some of the 3rd-party devices, they show online immediately. However, that would defeat the purpose of monitoring whether your device is actually online or not.

The only thing I have not done yet is what you are suggesting, which is to completely redo a fresh container and re-add all devices; I'm planning to do that this weekend.

headwhacker commented 3 years ago

Just wondering if the way I am running the container has anything to do with the issue I'm having.

Basically, I'm running my container using macvlan networking, meaning it has its own IP address and shows up in my network as a standalone node.

@papac0rn just wondering if you have the same setup as mine.

papac0rn commented 3 years ago

Hi guys,

Interesting. I am running multiple VLANs to the NAS. I am using the same IP address the NAS is using though, and just mapping the ports through the container.

Nico640 commented 3 years ago

Just tested it with macvlan networking, still nothing. I'm probably missing something. Is there something else noteworthy about your setups? Also, are the devices you are monitoring in the same VLAN?

papac0rn commented 3 years ago

Mine are located in several different VLANs and subnets.

headwhacker commented 3 years ago

For me, the majority of the devices are on the same VLAN as the UNMS container. There are a few devices on a different VLAN, but they can be pinged from within the UNMS container.

papac0rn commented 3 years ago

I'm also bonding the two NICs in the Synology NAS. I can't imagine that would be an issue. I have only two VLANs uplinked to the bonded configuration; one contains the default IP of the NAS and this docker container. The other trunked VLAN contains an IP that is directly on a network with my cameras, which use a share on the NAS as a repository. Just kicking around ideas, I have jumbo frames enabled too. The NAS is an Intel Celeron J3455 DS1019+ if that helps, guys.

headwhacker commented 3 years ago

Tried 1.3.9 and started a container from scratch. I did not import any config from my 1.2.7 setup. I added all Edge series (EdgeRouter/EdgeSwitch) devices and they all worked fine. From here, UISP has scanned and detected all of the 3rd-party devices running on my network.

I have 2 UniFi access points (a FlexHD and a nanoHD). UISP detects both devices like other 3rd-party devices and sees both by IP address. When I add both devices, they turn green on UISP's devices list. But when I turn on "Enable Ping" for each device, it just goes disconnected and does not come back online at all.

I have added a few other 3rd-party devices, but still the same. Looks like something has changed since 1.3.7 that could have changed how UISP pings 3rd-party devices.

headwhacker commented 3 years ago

I started a thread in the UI forums, hoping to get more ideas.

https://community.ui.com/questions/Upgrading-from-UNMS-1-2-7-to-UISP-1-3-x-Enabling-Ping-on-3rd-party-devices-are-not-working-/9432af76-9eff-43a2-8067-9f487e253741

At this point, the only thing I have not tried is to run UISP 1.3.x as an app running directly on a Debian host.

Nico640 commented 3 years ago

Do you have another device you can run the container on? Possibly one with a very basic network configuration (no VLANs, bonding, etc.)? I still have the feeling that it has something to do with the network configuration and the container. The container actually also changed a lot from 1.2.7 to 1.3.X (migration from Debian to Alpine Linux base image), so I'm not sure if it's a UNMS or a container issue at this point.

Another thing you can try is to make UNMS 1.3.9 log the ping attempts:

sed -i "11 i const log = require('../log');" /home/app/unms/lib/ping/session.js
sed -i "72 i log.error('PINGING ', address,' ERROR: ', error);" /home/app/unms/lib/ping/session.js

Execute these two commands INSIDE the container, then restart the container.

On every ping attempt you should see something like this in your logs: {"name":"UNMS","hostname":"xxxxxx","pid":1062,"level":50,"msg":"PINGING <IP> ERROR: <MESSAGE>","time":"2021-03-17T22:07:30.453Z","v":0}

Maybe this will give a clue as to why the pings are failing (if they really are failing).
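
One way to apply those two lines from the host and restart (assuming the container is named unms):

# same sed commands as above, run via docker exec instead of an interactive shell
docker exec unms sed -i "11 i const log = require('../log');" /home/app/unms/lib/ping/session.js
docker exec unms sed -i "72 i log.error('PINGING ', address,' ERROR: ', error);" /home/app/unms/lib/ping/session.js
docker restart unms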

papac0rn commented 3 years ago

Hi Nico,

I don't have another device, but I was looking over the container and its implementation. Previously I was using a bridged network connection from the Docker container to my internal networks. I took the bridged network out of the mix, created a macvlan, and extended the management VLAN (which is where all the pingable 3rd-party devices are) directly to the container. I put the docker container on the same network as the 3rd-party devices. Still no 3rd-party devices are pingable. I'm seeing other posts on the internet about devices not being reachable from containers. I'm going to play with this more.

Nico640 commented 3 years ago

Did you try to make UNMS log the ping attempts like I wrote above? The error that they might log could be useful.

papac0rn commented 3 years ago

Hi Nico,

I tried the commands but never saw any additional errors in the log. I've corrupted something in UNMS and need to restore it. I'm getting errors just trying to start it now. When I get more time I'll dig into it further...

headwhacker commented 3 years ago

Update: I have provisioned a new VM based on Debian 9 and installed 1.3.9 fresh. All of my 3rd-party devices are now showing green/connected when the "Enable Ping" option is on. I guess there is something in the other VM which is causing 1.3.9 to be unable to see 3rd-party devices.

headwhacker commented 3 years ago

Update 2: After the fresh install on the new VM was confirmed working, I started a fresh container for 1.3.9, then migrated from UISP running on the new VM to UISP running in the new container. After the migration completed, all 3rd-party devices now show as connected/green.

I went into one of the 3rd-party device configurations and toggled "Enable ping" off. The device remained green. Toggling it back on again, the device immediately turned red/disconnected, but after 6 - 8 seconds it turned green/connected.

So I guess the problem is that the old data from 1.2.7 did not migrate well to 1.3.x.

headwhacker commented 3 years ago

Update 3: So I got time to look at exactly what is causing this problem. I could not add the additional logging Nico suggested to session.js; as soon as I restart the container inside the pod, it just wipes out the change. So I took the time to look at the logs as-is and found lots of these:


{"name":"UNMS","hostname":"uisp-0","pid":1264,"level":50,"msg":"Home page update failed StatusCodeError: 500 - \"sudo: effective u id is not 0, is sudo installed setuid root?\n\"\n at new StatusCodeError (/home/app/unms/node_modules/request-promise-core/lib/errors.js:32:15)\n at Request.plumbing.callback (/home/app/unms/node_modules/request-promise-core/lib/plumbing.js:104:33)\n
at Request.RP$callback [as _callback] (/home/app/unms/node_modules/request-promise-core/lib/plumbing.js:46:31)\n at Request.self.callback (/home/app/unms/node_modules/request/request.js:185:22)\n at Request.emit (events.js:315:20)\n at Request.EventEmitter.emit (domain.js:483:12)\n at Request. (/home/app/unms/node_modules/request/request.js:1161:10)\n at Request.emit (events.js:315:20)\n at Request.EventEmitter.emit (domain.js:483:12)\n at IncomingMessage. (/home/app/unms/node_modules/request/request.js:1083:12)\n at Object.onceWrapper (events.js:421:28)\n at IncomingMessage.emit (events.js:327:22)\n at IncomingMessage.EventEmitter.emit (domain.js:483:12)\n at endReadableNT (_stream_readable.js:1220:12)\n at processTicksAndRejections (internal/process/task_queues.js:84:21) {\n statusCode: 500,\n error: 'sudo: effective uid is not 0, is sudo installed setuid root?\n',\n options: {\n method: 'GET',\n uri: 'http://127.0.0.1:12345/refresh-configuration',\n qs: { mainPage: 'unms', standaloneWss: 0, noUcrm: 0 },\n proxy: false,\n callback: [Function: RP$callback],\n transform: undefined,\n simple: true,\n resolveWithFullResponse: false,\n transform2xxOnly: false\n },\n response: IncomingMessage {\n _readableState: ReadableState {\n objectMode: false,\n highWaterMark: 16384,\n buffer: BufferList { head: null, tail: null, length: 0 },\n length: 0,\n pipes: null,\n pipesCount: 0,\n flowing: true,\n ended: true,\n endEmitted: true,\n reading: false,\n sync: true,\n needReadable: false,\n emittedReadable: false,\n 21-04-13T10:39:52.063Z","v":0}

From my experience working on containers, errors like these usually mean the container requires privileged access to run some commands.

So I changed the UISP container configuration and ticked "yes" for Privilege Escalation under Security and Host Config. Then, after restarting the workload in my K8s cluster, I started seeing non-Ubiquiti devices lighting up in green. Tried toggling "Enable Ping" on a few devices; the status turned red, then back to green. So this resolved the issue on my setup.
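
For reference, the rough plain-Docker equivalent of that Kubernetes privilege setting would be the --privileged flag, shown here against the run command from the top of this thread (Nico640 notes below that this normally isn't required for the plain docker container):

sudo docker run -d --name unms --privileged \
  -p 80:80 -p 443:443 -p 2055:2055/udp \
  -v /var/lib/docker-config/unms:/config \
  nico640/docker-unms:latest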

Nico640 commented 3 years ago

Interesting! Not sure if this is a Kubernetes-specific problem or something like that, as I don't have to run the Docker container in privileged mode.

henkisdabro commented 3 years ago

I'm also struggling with installing and running a brand-new docker-compose instance of this project. I've tried it on an OpenMediaVault 5 host (Debian-based) as well as a CentOS 8 host, with the same issues on both, starting with an empty volume folder (no previous settings or files within).

Attaching a log of the startup output: pastebin log

headwhacker commented 3 years ago

@henkisdabro

I can see this in your log. Maybe you need to start your container with root privileges?

FATAL: role "root" does not exist

henkisdabro commented 3 years ago

Thanks, but I tried that already based on your earlier recommendation above; that is the result you're seeing.

henkisdabro commented 3 years ago

BTW, just an update: deploying the exact same docker-compose file as I used for the other two systems mentioned above works on my Raspberry Pi.