advancedtelematic / ota-community-edition

End-to-end Over The Air updates
Mozilla Public License 2.0

ota-device-registry UUID behavior mismatch with Aktualizr #91

Closed bclouser closed 5 years ago

bclouser commented 5 years ago

Hello, I'm new here, but hopefully I am not wildly off base.

It seems that this commit in ota-device-registry, specifically this piece:

@@ -61,8 +61,7 @@ object DeviceRepository {
   val devices = TableQuery[DeviceTable]

   def create(ns: Namespace, device: DeviceT)(implicit ec: ExecutionContext): DBIO[DeviceUUID] = {
-    val uuid = device.uuid.getOrElse(DeviceUUID.generate)
-
+    val uuid = DeviceUUID.generate

goes against the current behavior of Aktualizr and the reverse gateway configuration. At least during implicit provisioning, Aktualizr attempts to publish information using the uuid embedded in the ssl certificate that is generated locally via the "new_client" function in scripts/start.sh.
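To make the difference concrete, here is a minimal sketch of the two variants with the Slick and database parts stripped out. `DeviceT` and `DeviceUUID` below are simplified stand-ins written for illustration, and `uuidBefore` / `uuidAfter` are hypothetical helper names; none of this is the actual ota-device-registry code.

```scala
import java.util.UUID

// Simplified stand-ins for illustration only; the real types live in ota-device-registry.
final case class DeviceUUID(value: UUID)
object DeviceUUID {
  def generate(): DeviceUUID = DeviceUUID(UUID.randomUUID())
}
final case class DeviceT(uuid: Option[DeviceUUID])

object CreateSketch {
  // Before the commit: honor a client-supplied UUID, generate one only if it is absent.
  def uuidBefore(device: DeviceT): DeviceUUID =
    device.uuid.getOrElse(DeviceUUID.generate())

  // After the commit: always generate a fresh UUID, ignoring what the client sent.
  def uuidAfter(device: DeviceT): DeviceUUID =
    DeviceUUID.generate()

  def main(args: Array[String]): Unit = {
    // UUID taken from the log line in this issue, standing in for the certificate CN.
    val fromCert = DeviceUUID(UUID.fromString("9f32ec88-64ae-4151-82c7-8f50444ae9c3"))
    val request  = DeviceT(Some(fromCert))

    println(uuidBefore(request) == fromCert) // true: the client's UUID is kept
    println(uuidAfter(request) == fromCert)  // false, barring a random UUID collision
  }
}
```

With the second variant, the UUID stored by the registry can no longer match the one the device presents through its certificate.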

Inside the "new_client" function, the ssl subject DN for the client key is set to a random UUID. This UUID is embedded into all requests going to the gateway based on the following configuration:

gateway.conf: |-
    server {
      error_log  /var/log/nginx/error.log info;
      listen       8443 ssl;
      server_name ota.ce; 
      ssl_certificate     /etc/ssl/gateway/server.chain.pem;
      ssl_certificate_key /etc/ssl/gateway/server.key;
      ssl_verify_client on;
      ssl_verify_depth 10;
      ssl_client_certificate /etc/ssl/gateway/ca.crt;

      if ($ssl_client_s_dn ~ "CN=(.*)$") {
        set $deviceUuid $1;
      }
      if ($ssl_client_s_dn !~ "CN=(.*)$") {
        set $deviceUuid $ssl_client_s_dn;
      }

The end result is that the device is never "seen online": the device registry ignores its "DeviceSeen" events because the UUID created locally (by new_client) does not match the one the device registry generated (DeviceUUID.generate).
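A rough sketch of that chain, assuming the gateway's regex behaves like an unanchored search and that the listener only processes events for ids the registry already knows. `deviceUuidFromDn` and `isProcessed` are hypothetical names used for illustration, not functions from the gateway or from device-registry.

```scala
import java.util.UUID

object GatewaySketch {
  // Same pattern as the two `if` blocks in gateway.conf.
  private val CnPattern = "CN=(.*)$".r

  // Mirrors the gateway logic: use the CN capture if present, else the whole subject DN.
  def deviceUuidFromDn(subjectDn: String): String =
    CnPattern.findFirstMatchIn(subjectDn).map(_.group(1)).getOrElse(subjectDn)

  // Hypothetical stand-in for the listener's lookup: a DeviceSeen event is only
  // processed when its id is already known to the registry.
  def isProcessed(knownDevices: Set[String], seenId: String): Boolean =
    knownDevices.contains(seenId)

  def main(args: Array[String]): Unit = {
    val certId     = deviceUuidFromDn("CN=9f32ec88-64ae-4151-82c7-8f50444ae9c3")
    val registryId = UUID.randomUUID().toString // what DeviceUUID.generate would store

    println(certId)                               // 9f32ec88-64ae-4151-82c7-8f50444ae9c3
    println(isProcessed(Set(registryId), certId)) // false: event ignored, device never "seen"
    println(isProcessed(Set(certId), certId))     // true: ids agree, event is processed
  }
}
```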

My device-registry's log is full of these:

W|00:53:45.439|c.a.o.d.daemon.DeviceSeenListener$|Ignore event for missing or deleted device: DeviceSeen(Namespace(default),DeviceId(9f32ec88-64ae-4151-82c7-8f50444ae9c3),2018-12-15T00:53:45Z)

Versions

Aktualizr: 1.0+gitAUTOINC+348822d914

device-registry:
    Container ID:   docker://59c3d9fe7c74aa60be889e646836442392a1d338db34de42cba4f763e201173a
    Image:          advancedtelematic/device-registry:latest
    Image ID:       docker-pullable://advancedtelematic/device-registry@sha256:ff9fdb219d5473473205c30947bf77f007df50280eeeae50b2ccc48

jerrytrieu commented 5 years ago

@bclouser Thanks for the info! We'll look at reverting the breaking change in ota-device-registry.

houcros commented 5 years ago

@bclouser I've reverted that change and the behaviour when creating a new device should be as before. Thanks for the heads-up :)

bclouser commented 5 years ago

@houcros Thanks! I will be taking this for a spin sometime soon

bclouser commented 5 years ago

@houcros have you tested this? I am still seeing the same issue.

device-registry:
    Container ID:   docker://3df51231634df7be0e1b8bc9b9b81326bfdd1aa0dab3533e2178abf32d4ad660
    Image:          advancedtelematic/device-registry:0.2.1-39-g83f16a7
    Image ID:       docker-pullable://advancedtelematic/device-registry@sha256:6bdedd93bea2a7594fa684d4b79ee4c45121c41db6547183b039a5d9e7976c52

Again, my device-registry logs contain a bunch of these:

W|20:49:05.635|c.a.o.d.daemon.DeviceSeenListener$|Ignore event for missing or deleted device: DeviceSeen(Namespace(default),DeviceId(32963fc6-7d6f-4ae9-b96d-9c4f60b02940),2019-01-10T20:49:05Z)
I|20:49:07.022|c.a.l.m.d.MessageBusListenerActor$|Processed DeviceSeen - DeviceId(32963fc6-7d6f-4ae9-b96d-9c4f60b02940)

Additionally, the PUT request to device-registry/proxy/api/v1/devices, as called from the "new_client" function in start.sh, returns a random UUID. That same "random" UUID is the one that shows up in the web UI, resulting in the same "device never seen online" bit.

houcros commented 5 years ago

My attempts to create a device with implicit provisioning fail but apparently for different reasons, in the device gateway. For now I can't tell if it's related or a different bug. I'll keep you posted when we make some progress here.

bclouser commented 5 years ago

Can you hint at what you were seeing? Did you get different errors? It might help to know as I dig around and try to understand this issue.

houcros commented 5 years ago

Hi, sorry for the hiatus, I forgot about this. So I was seeing errors before reaching the device-registry, in the device-gateway. Then I learnt that there is a bug for implicit provisioning in aktualizr, which probably (and hopefully) explains this. The aktualizr guys are working on that. Once that's merged I'll give it another try and let you know :) But of course you could keep an eye on the aktualizr repo as well if you wish. Also, I should mention I was trying the simulated implicit provisioning, so there might still be some mismatch.