nebulous / infinitude

Open control of Carrier/Bryant thermostats
MIT License

Errors when trying to run (Docker in Proxmox) #160

Closed. greggitter closed this issue 9 months ago

greggitter commented 1 year ago

Hi,

Hoping this is an easy one and I just missed something somewhere, but after searching I'm not finding anything similar. I'm getting the following error messages when trying to start the Docker container using the recommended start parameters, and I get the same thing when using the recommended command with the env variables defined. I end up having to stop the container via Portainer to get my command line back.

[screenshot: Infinitude error output when starting the container]

I'm running the Docker version in a Debian container on Proxmox. Any ideas on what to try for this one?
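
For context, a start command along these lines is what I mean by "recommended parameters" (a minimal sketch only; the image name, env variable, and state path here are assumptions based on the project's Docker instructions, not confirmed in this thread):

# Sketch; adjust image name, port, env vars, and state volume to match the README
docker run -d --name infinitude \
  -p 3000:3000 \
  -e MODE=Production \
  -v /opt/infinitude/state:/infinitude/state \
  nebulous/infinitude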

Thanks!

greggitter commented 1 year ago

Interesting... just checking my Docker container that's been up for 41 hrs, it is using 77.43 MB vs 72.796 MB for the newly started instance (which hasn't added any, as reported in top), but who knows if that's significant. I back up containers and VMs a few times a week, which requires a stop/start of each, so that would keep me from witnessing a longer-term memory leak. Seeing no difference so far (nearly an hour in). Anything reported in syslog?
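
If it's useful, memory growth can also be watched from the Docker side rather than top inside the container (standard Docker CLI; the container name "infinitude" is assumed):

# One-off snapshot of the container's current memory usage
docker stats --no-stream infinitude

# Append a sample every 10 minutes to spot slow growth over days
while true; do
  docker stats --no-stream --format '{{.Name}} {{.MemUsage}}' infinitude >> infinitude-mem.log
  sleep 600
done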

xeroiv commented 1 year ago

Nothing that I believe is of note in syslog. It does have the trace and info output from the infinitude daemons.


Jan  7 22:30:22 infinitude infinitude[121]: [2023-01-07 22:30:22.64772] [121] [trace] [Qd2jGHJVGRl5] Template "serial.html.ep" not found
Jan  7 22:30:22 infinitude infinitude[121]: [2023-01-07 22:30:22.64816] [121] [trace] [Qd2jGHJVGRl5] Nothing has been rendered, expecting delayed response
Jan  7 22:30:25 infinitude infinitude[121]: [2023-01-07 22:30:25.14282] [121] [trace] [RCQofmsNq0tc] GET "/api/status/1"
Jan  7 22:30:25 infinitude infinitude[121]: [2023-01-07 22:30:25.14471] [121] [trace] [RCQofmsNq0tc] Routing to a callback
Jan  7 22:30:25 infinitude infinitude[121]: [2023-01-07 22:30:25.14646] [121] [trace] [RCQofmsNq0tc] 200 OK (0.003595s, 278.164/s)
Jan  7 22:30:27 infinitude infinitude[123]: [2023-01-07 22:30:27.16547] [123] [trace] [skaD8iMYYZeA] GET "/api/status/1"
Jan  7 22:30:27 infinitude infinitude[123]: [2023-01-07 22:30:27.16845] [123] [trace] [skaD8iMYYZeA] Routing to a callback
Jan  7 22:30:27 infinitude infinitude[123]: [2023-01-07 22:30:27.17082] [123] [trace] [skaD8iMYYZeA] 200 OK (0.005313s, 188.218/s)
Jan  7 22:30:27 infinitude infinitude[121]: [2023-01-07 22:30:27.19729] [121] [trace] [muUxQV75CbgJ] GET "/api/status/2"
Jan  7 22:30:27 infinitude infinitude[121]: [2023-01-07 22:30:27.19803] [121] [trace] [muUxQV75CbgJ] Routing to a callback
Jan  7 22:30:27 infinitude infinitude[121]: [2023-01-07 22:30:27.20001] [121] [trace] [muUxQV75CbgJ] 200 OK (0.002696s, 370.920/s)
Jan  7 22:30:27 infinitude infinitude[121]: [2023-01-07 22:30:27.21741] [121] [trace] [dPnIDQCDDZRu] GET "/api/status/3"
Jan  7 22:30:27 infinitude infinitude[121]: [2023-01-07 22:30:27.21862] [121] [trace] [dPnIDQCDDZRu] Routing to a callback
Jan  7 22:30:27 infinitude infinitude[121]: [2023-01-07 22:30:27.22105] [121] [trace] [dPnIDQCDDZRu] 200 OK (0.003582s, 279.174/s)
Jan  7 22:30:27 infinitude infinitude[121]: [2023-01-07 22:30:27.23796] [121] [trace] [ljjezk8BNm52] GET "/api/config/zones/zone/1/activities/"
Jan  7 22:30:27 infinitude infinitude[121]: [2023-01-07 22:30:27.23919] [121] [trace] [ljjezk8BNm52] Routing to a callback
Jan  7 22:30:27 infinitude infinitude[123]: Use of uninitialized value in -e at /root/infinitude2/infinitude line 61.
Jan  7 22:30:27 infinitude infinitude[123]: Using 127.0.0.1 port 23 for serial interface
Jan  7 22:30:27 infinitude infinitude[121]: [2023-01-07 22:30:27.60709] [121] [trace] [ljjezk8BNm52] 200 OK (0.369109s, 2.709/s)
Jan  7 22:30:28 infinitude systemd[1]: Stopping Infinitude HVAC control...
Jan  7 22:30:28 infinitude infinitude[123]: Web application available at http://192.168.30.95:3001
Jan  7 22:30:30 infinitude systemd[1]: infinitude2.service: Succeeded.
Jan  7 22:30:30 infinitude systemd[1]: Stopped Infinitude HVAC control.
Jan  7 22:30:30 infinitude systemd[1]: infinitude2.service: Consumed 37.172s CPU time.
Jan  7 22:30:33 infinitude systemd[1]: Stopping Infinitude HVAC control...
Jan  7 22:30:33 infinitude infinitude[121]: Web application available at http://192.168.30.95:3000
Jan  7 22:30:34 infinitude systemd[1]: infinitude.service: Succeeded.
Jan  7 22:30:34 infinitude systemd[1]: Stopped Infinitude HVAC control.
Jan  7 22:30:34 infinitude systemd[1]: infinitude.service: Consumed 39.699s CPU time.
Jan  7 22:31:13 infinitude systemd[1]: Starting Cleanup of Temporary Directories...
Jan  7 22:31:13 infinitude systemd[1]: systemd-tmpfiles-clean.service: Succeeded.
Jan  7 22:31:13 infinitude systemd[1]: Started Cleanup of Temporary Directories.
Jan  7 22:31:13 infinitude systemd[1]: systemd-tmpfiles-clean.service: Consumed 30ms CPU time.
Jan  7 22:45:01 infinitude CRON[424]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
greggitter commented 1 year ago

And just to confirm the executable is:

-rwxr-xr-x 1 root root 16390 Dec 20 10:21 infinitude

I'm not sure why it would stop working (i.e., start leaking memory) after upgrading Proxmox.

xeroiv commented 1 year ago

Yup.

root@infinitude:~/infinitude# ls -l infinitude
-rwxr-xr-x 1 root root 16149 Nov 14 21:37 infinitude
greggitter commented 1 year ago

Wait, my executable is 16,390 bytes and yours is 16,149? @nebulous did put some fixes/tweaks in after I started this thread... maybe that's it?

xeroiv commented 1 year ago

My CT backup is from November, so it's likely just a bit old in this current run. I nuked one of my PVE hosts and restored my backup once it came back up. Unfortunately the memory leak is still present even after a fresh install of the host...

I just cloned the repo again and now I have 16,390.

greggitter commented 1 year ago

OK, not seeing any memory leak yet. It did move up to 73 MB, but nothing close to what you're describing.

xeroiv commented 1 year ago

Here is the config on my host. Not too sure how this would have an impact on the program in the container though.

root@pve4:~# pveversion -v
proxmox-ve: 7.3-1 (running kernel: 5.15.83-1-pve)
pve-manager: 7.3-4 (running version: 7.3-4/d69b70d4)
pve-kernel-5.15: 7.3-1
pve-kernel-helper: 7.3-1
pve-kernel-5.15.83-1-pve: 5.15.83-1
pve-kernel-5.15.74-1-pve: 5.15.74-1
ceph-fuse: 15.2.17-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.3
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.3-1
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.3-1
libpve-guest-common-perl: 4.2-3
libpve-http-server-perl: 4.1-5
libpve-storage-perl: 7.3-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
openvswitch-switch: 2.15.0+ds1-2+deb11u1
proxmox-backup-client: 2.3.2-1
proxmox-backup-file-restore: 2.3.2-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.3
pve-cluster: 7.3-1
pve-container: 4.4-2
pve-docs: 7.3-1
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-7
pve-firmware: 3.6-2
pve-ha-manager: 3.5.1
pve-i18n: 2.8-1
pve-qemu-kvm: 7.1.0-4
pve-xtermjs: 4.16.0-1
qemu-server: 7.3-2
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+2
vncterm: 1.7-1
zfsutils-linux: 2.1.7-pve2
greggitter commented 1 year ago

Matches mine.

xeroiv commented 1 year ago

So as a workaround I set up a crontab entry to restart the services every 2 hours. It'd still be nice to know what is causing the memory leak, though.
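
A minimal sketch of that workaround, using the two unit names that appear in the syslog earlier in this thread (infinitude.service and infinitude2.service):

# /etc/cron.d/infinitude-restart: restart both daemons every 2 hours
0 */2 * * * root systemctl restart infinitude.service infinitude2.service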

xeroiv commented 1 year ago

So I gave up troubleshooting the memory leak when running natively. I made a container that runs Docker to test with, and it doesn't appear to have a memory leak; however, I am running into an issue starting two instances. Is there an environment variable I can pass to Docker to change the listen port of one of the containers? Changing the port forwarding doesn't do much unless the infinitude daemon is listening on the new port.
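
(Not from the project docs, just a general Docker pattern: because each container has its own network namespace, the daemon can keep listening on its default port 3000 inside both containers, and only the host side of the -p mapping needs to differ. Container and image names below are illustrative assumptions.)

# First instance: host port 3000 -> container port 3000
docker run -d --name infinitude-main -p 3000:3000 nebulous/infinitude

# Second instance: host port 3001 -> container port 3000 (the daemon's default)
docker run -d --name infinitude-upstairs -p 3001:3000 nebulous/infinitude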

Edit: While messing around with docker compose to try to get a second instance working, I found that if I create a new CT on Proxmox and follow the docker compose procedure (rather than the one from the Raspberry Pi wiki), I end up with a working daemon that doesn't have a memory leak.

greggitter commented 1 year ago

Following up: I am not seeing any memory leaks here; in fact, no additional RAM is consumed at all. Here's where it stands now:

69826 root 20 0 79896 72952 7720 S 0.0 7.0 2:35.33 perl

Not familiar with Perl at all, but I used the native Debian repo packages for all my Perl libraries. I know you can install the libraries through a Perl package manager, and I have no idea whether those would be the same versions.
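
One way to compare library versions between installs is to query Perl directly; as an example, Mojolicious (the Perl web framework Infinitude runs on, per the Server header later in this thread) can be checked like this, assuming the Debian package is libmojolicious-perl:

# Version of the Mojolicious module Perl actually loads
perl -MMojolicious -e 'print "$Mojolicious::VERSION\n"'

# What the Debian package provides, if the module came from apt
dpkg -l libmojolicious-perl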

Regarding multiple instances of a Docker container, I did find this. Sadly, I'm not an expert at Docker... barely a beginner, actually.

jftaylorMn commented 1 year ago

@xeroiv there are references here that describe how to map ports in Docker and specify a port when starting infinitude. There is also a Home Assistant discussion which shows how to run two instances of infinitude (on ports 3000 and 3001 in that case); that might be relevant as well. Like @greggitter, my experience with infinitude is limited, and nonexistent for Proxmox. I've done a bit more with Docker, but I stray only a little from the well-beaten path. As long as it just works, everything is good.

xeroiv commented 1 year ago

I actually found the culprit behind perl's growing memory usage. I have my Loxone Miniserver poll the data every 10 seconds, and that matches the interval at which perl's memory usage grows. Any ideas on how to begin troubleshooting this? My first thought is that the Loxone Miniserver isn't closing the connections it makes, but I am unsure how to test that.
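
One simple way to test the unclosed-connection theory is to count established TCP connections on the Infinitude port while the Miniserver is polling (standard ss from iproute2 on the host running Infinitude; port 3001 taken from the request log below):

# Count established connections to the instance listening on 3001
ss -tn state established '( sport = :3001 )' | tail -n +2 | wc -l

# Re-check every 10 seconds; a count that keeps climbing suggests connections are being left open
watch -n 10 "ss -tn state established '( sport = :3001 )' | tail -n +2 | wc -l"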

Here is what the requests and responses look like from the Miniserver's perspective:

7   12:44:19.209    Miniserver  Upstairs Status Request 192.168.30.216:3001 /api/status/1   GET /api/status/1 HTTP/1.1\r\nHost: 192.168.30.216:3001\r\nUser-Agent: [en]\r\nContent-Type: text/html; charset=utf-8\r\nConnection: close
8   12:44:19.209    Miniserver  Upstairs Status Response    192.168.30.216:3001 /api/status/1   HTTP/1.1 200 OK\r\nContent-Type: application/json;charset=UTF-8\r\nServer: Mojolicious (Perl)\r\nContent-Length: 278\r\nDate: Sun, 08 Jan 2023 17:44:18 GMT
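
To rule the Miniserver in or out, the same polling pattern could be reproduced with curl against the URL shown above while watching perl's memory in parallel; if perl still grows, the leak would be on the Infinitude side rather than in how Loxone handles the connection (a sketch, using the address and 10-second interval from the log):

# Hit the same endpoint every 10 seconds, asking the server to close the
# connection each time, while watching perl's RSS in top or docker stats
while true; do
  curl -s -H 'Connection: close' http://192.168.30.216:3001/api/status/1 > /dev/null
  sleep 10
done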
greggitter commented 1 year ago

Above my pay grade, but it's still odd that just updating Proxmox caused this.

xeroiv commented 1 year ago

I agree. It could be that the old version of the Proxmox OOM killer was working better than this one and masked the issue. Either way, it looks like it's back to a cron job to restart the service every 2 hours, unless @nebulous has some other tests I can perform to see why the Loxone HTTP GET requests are causing perl's memory usage to grow.

github-actions[bot] commented 9 months ago

Stale issue message