nebulous / infinitude

Open control of Carrier/Bryant thermostats
MIT License

Unable to use Carrier Home app with Infinitude on Raspberry Pi Docker container #121

Closed - MallocArray closed this issue 2 years ago

MallocArray commented 3 years ago

I've got Infinitude running as a Docker container on a Raspberry Pi 4 and set the proxy on my SYSTXCCITC01-B thermostat. I'm able to see the data coming into the Infinitude page and can integrate it with Home Assistant, but the Android Carrier Home app reports that "This system has not connected to Wi-Fi recently" unless I disable the proxy.

Is this an expected result of using Infinitude or should it pass the same data along to Carrier? Originally I thought it may be related to me changing the port to 3001 since I had Grafana using 3000, but I swapped ports so Infinitude is running on the default port 3000 now and the app is still unable to communicate.

I'm using the following command on the RPI4 to start it:

docker run --rm -v $PWD/state:/infinitude/state -e APP_SECRET='123456zxcv' -e PASS_REQS='1020' -e MODE='Production' -p 3000:3000 nebulous/infinitude

dulitz commented 3 years ago

/etc/ssl/certs is very different between the two containers. The working version has a bunch of .0 files, where the basename (before the .0) is a certificate hash. That's how OpenSSL finds a cert: it takes the issuer in the server certificate, hashes it, and opens the file /etc/ssl/certs/[hash].0. If it doesn't find that file it will reject the TLS negotiation.
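
For anyone curious, this is roughly how to inspect those hashes yourself (the CA file name is one taken from the listing further down this thread; server.pem stands in for whatever server certificate you want to check):

openssl x509 -noout -subject_hash -in /etc/ssl/certs/GlobalSign_Root_CA_-_R3.pem
# prints 062cdee6 -- OpenSSL expects to find this CA as /etc/ssl/certs/062cdee6.0
openssl x509 -noout -issuer_hash -in server.pem
# prints the issuer hash of a server cert; it must match the subject hash of some installed CA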

This difference would cause TLS to succeed on the working container but fail on the latest container. We can hypothesize that if you run c_rehash in the latest container, it will begin to work. @MallocArray can you give it a try? Try running c_rehash with no arguments and if the .0 files don't appear, try giving it /etc/ssl/certs as its argument.
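
Concretely, inside the latest container that would be something like:

c_rehash
# if no .0 symlinks show up in /etc/ssl/certs afterwards, point it there explicitly:
c_rehash /etc/ssl/certs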

Now this should not be the problem. Rejected TLS does not cause a 404 error. If Mojolicious is reporting rejected TLS as a 404 please don't introduce me to their developers because I will want to hurt the person responsible.

There are other differences in the ls -lRa output and I just want to explain them briefly in case any of us run into a similar issue again.

The number of hard links (that's the decimal number after the rwxr-xr-x mode string) is being reported differently between the two containers: the working version reports nlink==1 for a lot of directories (and some files) that have nlink > 1 in the latest version. I believe this is because Docker's overlayfs reports nlink==1 for the directories/files it has overlayed -- since it's an overlay it doesn't know how many hard links there are (between the different layers) so it just reports 1. This shouldn't be causing any behavioral differences, but it is a reminder that containers that ended up the same but took different paths getting there can have detectable differences.

A lot of directories have mtime differences even though their contents look identical. As one of many examples, /usr/lib/arm-linux-gnueabihf/perl-base/unicore/lib/InSC has a different mtime. Again, this is likely due to overlayfs, where a file was created in that directory (in the overlay) and then deleted (in the overlay), leaving the overlay present but empty. This looks benign. I don't know why files may have been created though; perhaps apt/dpkg is responsible but I haven't investigated.
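
For reference, GNU stat prints both of the fields discussed above directly, e.g. (path reused from the example):

stat -c '%h hard links, mtime %y  %n' /usr/lib/arm-linux-gnueabihf/perl-base/unicore/lib/InSC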

MallocArray commented 3 years ago

As described, after starting the latest container, I did not see any .0 files in the /etc/ssl/certs folder. I ran c_rehash and then saw several files appear in the folder:

root@infinitude:/etc/ssl/certs# c_rehash
Doing /usr/lib/ssl/certs
WARNING: Skipping duplicate certificate ca-certificates.crt
WARNING: Skipping duplicate certificate ca-certificates.crt
root@infinitude:/etc/ssl/certs# ls -laR
.:
total 664
drwxr-xr-x 1 root root  12288 Aug 27 01:19  .
drwxr-xr-x 1 root root   4096 Aug 10 14:27  ..
lrwxrwxrwx 1 root root     26 Aug 27 01:19  00673b5b.0 -> thawte_Primary_Root_CA.pem
lrwxrwxrwx 1 root root     45 Aug 27 01:19  02265526.0 -> Entrust_Root_Certification_Authority_-_G2.pem
lrwxrwxrwx 1 root root     36 Aug 27 01:19  03179a64.0 -> Staat_der_Nederlanden_EV_Root_CA.pem
lrwxrwxrwx 1 root root     41 Aug 27 01:19  04f60c28.0 -> USERTrust_ECC_Certification_Authority.pem
lrwxrwxrwx 1 root root     27 Aug 27 01:19  062cdee6.0 -> GlobalSign_Root_CA_-_R3.pem
lrwxrwxrwx 1 root root     25 Aug 27 01:19  064e0aa9.0 -> QuoVadis_Root_CA_2_G3.pem
lrwxrwxrwx 1 root root     50 Aug 27 01:19  06dc52d5.0 -> SSL.com_EV_Root_Certification_Authority_RSA_R2.pem
...

Unfortunately, after deleting all of the files in the state folder, they were recreated but still with the 404 error.

I then did 'docker commit' to save the updated container as a new image and restarted the container from this rehashed image, and it looks like it is working! The res- files are all showing 200 OK.
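
For anyone following the same path, the workflow is roughly this (container and image names here are just examples; reuse whatever options you normally pass to docker run, such as the ones in the first post):

docker exec -it infinitude c_rehash           # rehash the certs in the running container
docker commit infinitude infinitude-rehashed  # save its filesystem as a new local image
docker stop infinitude
docker run --rm -d --name infinitude -v $PWD/state:/infinitude/state -p 3000:3000 infinitude-rehashed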

dulitz commented 3 years ago

Yay!

Now you can move forward on two fronts:

Oh and ask the Mojolicious people to fix their [deleted]. There's a reason why Perl is not so popular for web-related applications...

MallocArray commented 3 years ago

I'm working through "Learn Docker in a Month of Lunches" and thought I would try going through the Dockerfile line by line to find out where things fall apart. Ultimately, it fell apart right off the bat, which may shed some light on our issue.

I started with just running a container based on debian:latest, which the Dockerfile uses as the base. Right away, I had issues with the next line, apt-get update, as it was failing with:

W: GPG error: http://ports.ubuntu.com/ubuntu-ports focal InRelease: At least one invalid signature was encountered.
E: The repository 'http://ports.ubuntu.com/ubuntu-ports focal InRelease' is not signed.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.

After searching, I found: https://github.com/debuerreotype/docker-debian-artifacts/issues/122 which mentions the date not being correct in the container, and sure enough, if I run the date command in this container, it comes back with

Thu Jan  1 00:00:00 UTC 1970

which would invalidate any certificates and prevent doing apt update.
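
For anyone who wants to check whether their Pi is affected, a quick test along these lines should show it (same base image the Dockerfile uses; the comment shows what an affected host reports):

docker run --rm debian:latest date
# Thu Jan  1 00:00:00 UTC 1970  <- newer time syscalls rejected by the host's old libseccomp2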

From another source: https://docs.linuxserver.io/faq#libseccomp it looks like upgrading libseccomp2 to 2.4.2 or higher will resolve this, but Raspberry Pi OS currently only includes 2.3.3-4 as the most current, so you have to manually download/install it or use backports. At this time I don't want to risk changes to my running Raspberry Pi to see if this fixes it, as I have other containers that are working, but if this only impacts the Raspberry Pi and the infinitude container is only behaving unexpectedly on the Pi, then they could be related.
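
To see what the host is actually running before touching anything, a dpkg query is enough (2.4.2 or newer is the threshold mentioned above):

dpkg -s libseccomp2 | grep '^Version'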

Eventually the date must get updated since I was able to do apt update in the Infinitude container to upgrade ca-certificates, so it could be a red herring, or maybe it is all part of the same underlying issue.

scyto commented 2 years ago

Sorry, been away for a while. I found a similar issue with another container on a Pi caused by libseccomp2 on the host causing issues with containers, and it made me think of this - seems like you got there yourself. In my case I couldn't install a bunch of things using apt and would get security errors. It's because the container actually hands off to the host OS for these security functions (so much for container isolation / portability).

The only solution I know of is:

  1. Update libseccomp2 on the Pi - this is what I have had to do for the other containers that exhibit weird behavior.
  2. Pin the container to some older variation of the base image (aka a named version tag) - see the sketch below.
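
A sketch of option 2, assuming the Dockerfile currently starts FROM debian:latest (the dated tag below is only an illustration of the naming scheme Debian publishes on Docker Hub; check which tags actually exist before pinning):

# in the Dockerfile, replace
FROM debian:latest
# with a pinned, dated base image, e.g.
FROM debian:buster-20210816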

tl;dr: this is a Pi OS issue, not a container issue. If we can identify a base container OS version that works, I can revert the build to that.

Did you manage to fix this on yours?

MallocArray commented 2 years ago

I used the following instructions to add backports for libseccomp2 on my Pi 4 running Buster, which updated it to 2.5.1 and resolved my docker build issue, but did not resolve the issues with the standard infinitude container image: https://blog.samcater.com/fix-workaround-rpi4-docker-libseccomp2-docker-20/

sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 04EE7237B7D453EC 648ACFD622F3D138
echo 'deb http://httpredir.debian.org/debian buster-backports main contrib non-free' | sudo tee -a /etc/apt/sources.list.d/debian-backports.list
sudo apt update
sudo apt install libseccomp2 -t buster-backports

Since I could now build correctly, I added this line to the end of the Dockerfile so it builds the latest image with the fix in place:

# Fix for Raspberry Pi 404 errors
RUN c_rehash
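
Then rebuild locally from the patched Dockerfile, for example (the image tag is arbitrary), and use that image in place of nebulous/infinitude in the docker run command from the first post:

docker build -t infinitude-local .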

Alternatively, for any others reading this who find that too much work, you can run this single command after you have pulled down the container image to fix that instance of it. It runs the c_rehash command inside the container, which fixes the SSL certs. You may need to restart the container, or in one of my cases, I also had to restart my furnace/thermostat to get it talking to Infinitude:

docker exec -it infinitude c_rehash

scyto commented 2 years ago

docker exec -it infinitude c_rehash

This sounds like yet another issue; I didn't have to do that on my Pi 4 when I tested.

All I had to do was update libseccomp2 on the host, not all the stuff you did...

I just did this, IIRC (I took whatever was the latest version at the time I installed):

wget http://ftp.debian.org/debian/pool/main/libs/libseccomp/libseccomp2_2.5.1-1_armhf.deb
sudo dpkg -i libseccomp2_2.5.1-1_armhf.deb

I think the mistake is installing from backports...

MallocArray commented 2 years ago

The docker exec command just runs c_rehash inside the container, which fixed my original issue even with the old version of libseccomp2, and it's a better solution than the uninstall/reinstall of ca-certificates I was doing originally.

The backports got me basically the same version you installed manually, but if newer versions come out, I think you would have to find that out yourself and manually install it, whereas adding the backports should allow apt update to find any newer versions, unless I'm misunderstanding the process.

libseccomp2/buster-backports,now 2.5.1-1~bpo10+1 armhf [installed]

scyto commented 2 years ago

where adding the backports should allow apt update to find any newer versions, unless I'm misunderstanding the process.

Except when backports doesn't have it yet - which is what happened to me. Not sure how long it took, but it was weeks or months of difference... tbh this is a sucky situation all around - so much for Docker portability.

scyto commented 2 years ago

@MallocArray is this working for you now? I realize now that we got this repo building its own images (so me updating mine doesn't really help unless you pull scyto/infinitude)...

@nebulous do the automated Docker image builds still work, or did you disable them / go back to manual?

MallocArray commented 2 years ago

@scyto I pulled the latest image yesterday and things appear to be working without modification on the same Raspberry Pi 4 I've been using, so I think we are good at this point.