thin-edge / thin-edge.io

The open edge framework for lightweight IoT devices
https://thin-edge.io
Apache License 2.0
221 stars 54 forks source link

tedge_agent replies to c8y_SoftwareUpdate for child devices #1193

Closed toewsar closed 1 year ago

toewsar commented 2 years ago

Describe the bug The tedge_agent (or mapper) replies to a software update for it's children and handles it as operations for thin-edge itself. So the request looks like: c8y/s/ds 528,$childId,software_a,version_a,url_a,install

But the answer is not for the child but for the parent: c8y/s/us 501,c8y_SoftwareUpdate

If the agent wants to reply on this operation it should send it for the childId: c8y/s/us/$childId 501,c8y_SoftwareUpdate

To Reproduce

  1. Create Child device
  2. Add c8y_SoftwareList,c8y_SoftwareUpdate to supported operations
  3. Do Software update on child device

Expected behavior I would expect that thin-edge ignores any operation which is not addressed to itself.

Environment (please complete the following information):

rina23q commented 2 years ago

Apparently, our current software management doesn't add the child device ID to the topic c8y/s/us. However, I come up with a quick solution. (needs to be verified though)

  1. If $childId is equal to device.id (CN of certificate), use c8y/s/us.
  2. If not, use c8y/s/us/$childId
toewsar commented 2 years ago

But step 2 should be only done, if the agent can handle software updates for childs. Otherwise it should ignore the opertations.

cstoidner commented 2 years ago

Apparently, our current software management doesn't add the child device ID to the topic c8y/s/us. However, I come up with a quick solution. (needs to be verified though) [...]

Hi @rina23q ,

I feel the aim of the issue was not clear. Let me try to highlight it:

The behaviour of tedge_agent's software management is as below:

1) when a software update request from C8Y arrives; the payload of that request contains the device-id of the device that request is meant for 2) the tedge_agent picks-up that software-update request, even it is meant for a child-device
Example payload: c8y/s/ds 528,$childId,software_a,version_a,url_a,install

3) the tedge_agent tries to reply to that operation, even it does not handle that child-device (or as of now any child-device at all)

The point of that issue is, that picking up and answering that operation by the tedge_agent, blocks any other software component that maybe wants to handle that operation.

And that is what @toewsar is about to do: Implementing an own software-component to handle software management operations for their child-devices. That is blocked, due to point (2) and (3) from above.

We should now decide what is the right solution for the tedge_agent, when a software operation request arrives, that is addressed to a child-device. Should and could that request ignored by the tedge_agent?

By the way: That is probably not just true for software management, but also for other operations handled by thin-edge components.

rina23q commented 2 years ago

Comment updated on 30 of September

What thin-edge need to change from the technical point of view: If $childId not equals to the thin-edge's device.id (CN of certificate), thin-edge components will ignore such SmartREST operations.

That means, thin-edge components will not response to the operations for child devices and let them pending state, so that the other plugins can consume such operations.

This is a temporary fix until thin-edge components have the better child device supports for the existing supported c8y operations.

toewsar commented 2 years ago

Are you sure? Did you change anything in the last time? I testet it on thin-edge v0.7.4, but some component of thin-edge is responding to the software-update operation:

$ tedge -V
tedge 0.7.4
$ mosquitto_sub -t '#' -v
tedge/health/mosquitto-c8y-bridge 1
c8y/s/ds 528,590342_581481,ExampleCloud,0.1,https://t74262079.emea.cumulocity.com/inventory/binaries/573049,install
c8y/s/uat (null)
tedge/commands/req/software/update {"id":"yNKTl_gVkdQhxe7O885is","updateList":[{"type":"default","modules":[{"name":"ExampleCloud","version":"0.1","url":"https://t74262079.emea.cumulocity.com/inventory/binaries/573049","auth":{"bearer":"$JWT"},"action":"install"}]}]}
c8y/s/dat 71,$JWT
tedge/commands/res/software/update {"id":"yNKTl_gVkdQhxe7O885is","status":"executing"}
c8y/s/us 501,c8y_SoftwareUpdate

tedge/commands/res/software/update {"id":"yNKTl_gVkdQhxe7O885is","status":"failed","reason":"1 error, see device log file /var/log/tedge/agent/software-update-2022-09-20T12:46:34.012958099Z.log","currentSoftwareList":[{"type":"ap","modules":[{"name":"CODESYS","version":"3.5.16.30-38"},{"name":"thin-edge","version":"ce5a3749_06099ee5"}]}],"failures":[{"type":"default","modules":[]}]}
c8y/s/us 502,c8y_SoftwareUpdate,"1 error, see device log file /var/log/tedge/agent/software-update-2022-09-20T12:46:34.012958099Z.log"

I agree the operation stays pending, because the response of thin-edge is not correct: It receives the operation c8y/s/ds 528,590342_581481,ExampleCloud,0.1,https://t74262079.emea.cumulocity.com/inventory/binaries/573049,install but responds: c8y/s/us 501,c8y_SoftwareUpdate followed by c8y/s/us 502,c8y_SoftwareUpdate,"1 error, see device log file /var/log/tedge/agent/software-update-2022-09-20T12:46:34.012958099Z.log".

It does not disturb the software update of the child but it's still a bug.

toewsar commented 2 years ago

Hi, I noticed, that a c8y_Restart on a child will reboot the device 👎 . And the bad thing is, that the device starts, and tells c8y that the opertition for "device.id" is finished, but not for the "child.id". That means the operation remains pending and the device will reboot again and again...

rina23q commented 2 years ago

Are you sure? Did you change anything in the last time?

No, we haven't changed yet. thin-edge components will ignore unsupported child device operations created by cloud. As of now, it is addressed in a not complete way as @toewsar you said. This is indeed a problem.

rina23q commented 2 years ago

Hi, I noticed, that a c8y_Restart on a child will reboot the device -1 . And the bad thing is, that the device starts, and tells c8y that the opertition for "device.id" is finished, but not for the "child.id". That means the operation remains pending and the device will reboot again and again...

For all operations which we have implemented so far don't consider child device cases (except the on-going configuration management plugin). This means, before having the support for child devices on thin-edge components, thin-edge components should ignore all operations created for child devices. Then, in the user side, these operations for child devices can be consumed by their own custom implementation.

rina23q commented 2 years ago

Opened a pull request #1514. The fix will ignore c8y_SoftwareUpdate, c8y_Restart, and custom operations defined by custom template that are created for child devices.

I have a question. Should we also ignore the operation for child device in c8y_log_plugin?

toewsar commented 2 years ago

I would say yes.

  1. It's confusing
  2. If the child want's to privide it's own logs
rina23q commented 2 years ago

I would say yes.

  1. It's confusing
  2. If the child want's to privide it's own logs

Thank you. Applied the fix on c8y_log_plugin as well.

rina23q commented 2 years ago

Test steps for tedge_mapper

  1. DTU is connected to c8y. tedge-mapper-c8y service is active.
  2. Subscribe c8y/s/us topic.
  3. Run those commands to publish messages. Assume your thin-edge device ID is NOT DeviceSerial.
    tedge mqtt pub c8y/s/ds "528,DeviceSerial,software_a,version_a,url_a,install"
    tedge mqtt pub c8y/s/ds "510,DeviceSerial"
  4. Nothing should be received on c8y/s/us.

Test steps for c8y_plugin_plugin

  1. DTU is connected to c8y. tedge-log-plugin service is active.
  2. Subscribe c8y/s/us topic.
  3. Run those commands to publish messages. Assume your thin-edge device ID is not DeviceSerial.
    tedge mqtt pub c8y/s/ds "522,DeviceSerial,logfileA,2013-06-22T17:03:14.000+02:00,2013-06-22T18:03:14.000+02:00,ERROR,1000"
  4. Nothing should be received on c8y/s/us.
rina23q commented 2 years ago

Resolved by #1514. The build is #893

rina23q commented 1 year ago

We delivered the fix in 0.8.0. @toewsar Can we close this issue?

toewsar commented 1 year ago

Yes