homieiot / homie-esp8266

💡 ESP8266 framework for Homie, a lightweight MQTT convention for the IoT
http://homieiot.github.io/homie-esp8266
MIT License
1.36k stars 308 forks source link

TLS fingerprint checking? #42

Closed jpmens closed 8 years ago

jpmens commented 8 years ago

As per the JSON documentation I have uploaded a configuration which configures Homie to use TLS (the artist formerly known as SSL :-) and TLS connections are correctly established to a Mosquitto broker if I don't force the listener to tlsv1.2.

I assumed the fingerprint element of the configuration would be to verify the server's certificate against some known good value. However, if I randomly modify the fingerprint on the device, the connection succeeds anyway. Is fingerprint checking actually implemented, or have I simply misunderstood the name of the element?

{
  "name": "Clever name of device",
  "wifi": {
    "ssid": "yyyy",
    "password": "xxxx"
  },
  "mqtt": {
    "host": "192.168.1.130",
    "port": 9993,
    "ssl": true,
    "fingerprint": "99:07:4A:06:E7:EC:34:36:B7:F4:73:6C:CC:C3:49:60:14:F3:85:F3",
    "auth": false
  },
  "ota": {
    "enabled": true,
    "ssl": false,
    "host": "192.168.1.130",
    "port": 80,
    "path": "/ota"
  }
}

(The real broker's CA certificate fingerprint obtained with openssl x509 -in ca.crt -noout -fingerprint has a 2 at the end of the fingerprint instead of a 3.)

jpmens commented 8 years ago

Looking at BootNormal takes me here and depths I wouldn't normally be in... it appears to me as though it ought to work.

marvinroger commented 8 years ago

As you just said, it is implemented and should work... Could you please add a Logger.logln("Checking fingerprint") at line 76? Just to be sure the condition is evaluated.

jpmens commented 8 years ago

The condition is evaluated: I see my logln(), but I also see a bit of garbage:

** Booting in normal mode **
⚙ Stored configuration:
  • Device ID: cffb1be0
  • Boot mode: normal
  • Name: Clever name of device
  • Wi-Fi
  • MQTT
    • Host: 192.168.1.130 
    • Port: 9993
    • Auth: no
    • SSL: yes
    • Fingerprint: 99:07:4A:06:E7:EC:34:36:B7:F4:73:6C:CC:C3:49:60:14:F3:85:F3
  • OTA
    • Enabled: yes
    • Host: 192.168.1.130 
    • Port:     • Path: /ota
    • SSL: no
    ��� Fingerprint: 
⌔ Attempting to connect to Wi-Fi
⌔ Attempting to connect to MQTT
JPJP: checking fingerprint
Sending initial informations
Calling setup function
Sending Wi-Fi signal quality
⌔ Attempting to connect to MQTT
JPJP: checking fingerprint
Sending initial informations

Buffer too small or something?

marvinroger commented 8 years ago

There was an issue with the print config function, but the garbage issue seems to appear when printing value parsed by ArduinoJson, I don't know why...

About the fingerprint issue, I'll investigate when I will have access to my ESP8266s. :)

jpmens commented 8 years ago

If I shorten the name of the device to Dtest, I see the garbage Serial output before Host and before SSL; if I then shorten the value of fingerprint to fofo I don't see any garbage at all (and the connection succeeds).

marvinroger commented 8 years ago

Weird. The buffer is not too small, otherwise the MQTT sending would also send garbage to the $name property. I think the problem is that the buffer is actually too big. If you look at Constants.hpp and ConfigStruct.hpp, you will see the fields are fixed size. Maybe there is, after the \0, some garbage that Serial print. I will try with dynamic allocation.

jpmens commented 8 years ago

I've found the reason, but not the cause. On L76, if I remove the !strcmp(), then the verification is actually attempted and I get a MQTT Certificate mismatch on the console. I have verified that Config.get().mqtt.fingerprint actually does contain a value at that point, so the strcmp() is failing on that for some reason ...

marvinroger commented 8 years ago

That's what I meant when I told you to add a Logger.logln("Checking fingerprint"), I should have said in the condition block, sorry we misunderstood. ;)

I have no idea...

jpmens commented 8 years ago

In spite of the strcmp() issue, I can't get this to work at all, and I'm starting to doubt whether verify has ever worked ... I've tried to descend into the depths of WiFiClientSecure.cpp, in particular into function WiFiClientSecure::verify():

  1. Hardcoding fp in there to the certificate fingerprint string makes no difference; I did this to see if we have any garbage in the values passed in, but it doesn't appear so.
  2. There's too much mention of SHA1 so I created SHA1 digest CA and server certificates; no difference -- verification fails.
  3. Even though the calculated uint8_t fingerprint matches the hex bytes in the string fp fingerprint, the function ssl_match_fingerprint() returns false.
marvinroger commented 8 years ago

Have you tried to enable Arduino for ESP8266 debugging? There is a lot of useful informations outputted. Maybe the problem comes from the domain name and not from the fingerprint?

marvinroger commented 8 years ago

Forget what I just said, you said ssl_match_fingerprint() returns false... I'll try when I get home.

jpmens commented 8 years ago

It's definitely not the domain name. Yet. (FWIW, this matches both the certificate subject CN as well as one of the subjectAltNames, but I am prepared to debug that too, once we get passed the fingerprint.)

marvinroger commented 8 years ago

Alright, so we both agree, it seems to be an issue of esp8266/Arduino, not Homie?

jpmens commented 8 years ago

If I say 'yes', you're going to close this quickly and leave me to suffer alone, so I'll say 'maybe' :laughing: But honestly, I think 'yes'.

marvinroger commented 8 years ago

No no, I won't, but if we fill an issue on the Arduino for ESP8266 issue tracker, I prefer to be sure the problem is not on Homie's side. :wink:

Moreover, there is the garbage issue on Serial.

jpmens commented 8 years ago

The problem is definitely not on Homie's side. I'm now skipping ssl_match_fingerprint() in order to go on with seeing how CN matching works: it doesn't. ssl_get_cert_subject_alt_dnsname() returns NULL and ssl_get_cert_dn() also returns NULL which means that name matching cannot ever succeed and thus, even if the fingerprint were ok, ::verify() would fail anyway.

I think I've reached an impasse.

marvinroger commented 8 years ago

So the issue would come from the Espressif SDK? I think it's time to open an issue on esp8266/Arduino!

jpmens commented 8 years ago

I have not been taking a break, and I have not been having lunch, and no siesta either. ;-)

Instead, I've been reading through a huge load of issues on esp8266/Arduino, and I stumbled over a pointer to this test sketch which, when I run it, produces this output:

> ..........
> WiFi connected
> IP address: 
> 192.168.1.211
> connecting to api.github.com
IN ::verify()
JPM: fp=[CF 05 98 89 CA FF 8E D8 5E 5C E0 C2 E4 F7 E6 C3 C7 50 DD 5C], domain_name=[api.github.com]
::verify(): calculated uint8_t: CF 05 98 89 CA FF 8E D8 5E 5C E0 C2 E4 F7 E6 C3 C7 50 DD 5C 
SAN[0] == *.github.com
> certificate matches
> requesting URL: /repos/esp8266/Arduino/commits/esp8266/status
> request sent
> esp8266/Arduino CI has failed
> reply was:
> ==========
> HTTP/1.1 404 Not Found
> ==========
> closing connection

Lines marked with > are real prints from that sketch; the other lines are from debug prints in WiFiClientSecure.cpp I made earlier. I note:

  1. it basically works
  2. fingerprint verification works
  3. obtaining SANs also works
  4. domain name matching also works.

In other words, I think it is a Homie problem? Is some buffer being overwritten somewhere such that the _ssl structure is getting clobbered?

jpmens commented 8 years ago

If, in that same HTTP test sketch I replace host, port, and fingerprint to match my MQTT server, it also works!

> WiFi connected
> IP address:
> 192.168.1.211
> connecting to tiggr.ww.mens.de
IN ::verify()
JPM: fp=[1610af7d9c0b210bf9d75e3c94ac0aa92aa03b93], domain_name=[tiggr.ww.mens.de]
::verify(): calculated uint8_t: 16 10 AF 7D 9C 0B 21 0B F9 D7 5E 3C 94 AC 0A A9 2A A0 3B 93
SAN[0] == localhost
common_name = [tiggr.ww.mens.de]
> certificate matches

(Obviously the HTTP request proper fails)

Also, changing one octet of the fingerprint causes the fingerprint verification to fail.

marvinroger commented 8 years ago

Thanks for the investigation! The client is only used for PubSubClient, so no there is nothing that overwrite the buffer. I will try to reproduce the issue with a simple sketch, it will be easier to debug.

marvinroger commented 8 years ago

Okay, I got it. The verify() method has to be called right after a connect(). The problem is PubSubClient handles the connection and write things right after. So there is no window to call the verify() method.

jpmens commented 8 years ago

That would mean having to pass a callback for verification to ::connect, wouldn't it? Ouch.

marvinroger commented 8 years ago

Yes, or a new parameter to check in PubSubClient for a fingerprint... Which is not cross platform. So, we're facing an impasse. I'll open an issue in the PubSubClient repo, to discuss about it.

marvinroger commented 8 years ago

But someone managed to make it work, so I am probably saying stupid things.

jpmens commented 8 years ago

FWIW I don't think it has anything to do with TLSv1.1 or higher: tests this morning made no difference. Also, to make quite sure, I tested the HTTP sketch I linked to above against a TLSv1.1 and a TLSv1.2 configured broker without noticeable difference.

marvinroger commented 8 years ago

The problem was what I said:

Okay, I got it. The verify() method has to be called right after a connect(). The problem is PubSubClient handles the connection and write things right after. So there is no window to call the verify() method.

As a workaround, I now connect to the server and check the fingerprint, then disconnect and let PubSubClient connect to the broker. So yes, an attack is possible during this tiny timeframe, but I can't do anything, this has to be fixed in PubSubClient.

jpmens commented 8 years ago

I think this is good enough. As reported in #46 this is working, so I will now close this.