thin-edge / thin-edge.io

The open edge framework for lightweight IoT devices
https://thin-edge.io
Apache License 2.0

Bad user-error handling when Software download to thin-edge specifies an unzipped folder #1565

Closed rus-sag closed 1 year ago

rus-sag commented 2 years ago

Describe the bug

I tried to deploy some software (an Apama project) from my tenant to a thin-edge device. However, I had forgotten to zip up the project first, so the URL specified (see below) points not to a .zip but to a folder in the GitHub repo. My bad.

This bug is about how the tenant UI handles this error. I would expect the operation to fail with an error, or perhaps the UI to let the user cancel the Apply changes operation. It does neither: the apply software progress bar keeps indicating that an operation is in progress, but it never completes or fails, and the adjacent cancel button is disabled. AFAICT, the only way to stop the pending operation is to remove the thin-edge device and start all over again. Painful!

To Reproduce

  1. Upload an unzipped Apama project to the repo on your tenant (e.g. use the URL shown in the operation below)
  2. This seems to work fine (perhaps this should be blocked?)
  3. Now try to deploy it down to a thin-edge device.
  4. The operation never completes.

Expected behavior

  1. Operation terminates with an error.
  2. Or, at least allow the user to terminate it via the cancel button.

Screenshots

{
  "creationTime": "2022-11-07T11:59:59.963Z",
  "deviceId": "52134201",
  "deviceName": "rus_tedge_device",
  "self": "https://t4044519.latest.stage.c8y.io/devicecontrol/operations/142201",
  "id": "142201",
  "status": "PENDING",
  "description": "Apply software changes: install \"apama-quick-start\" (version: 1.0::apt)",
  "c8y_SoftwareUpdate": [
    {
      "softwareType": "apama",
      "name": "apama-quick-start",
      "action": "install",
      "id": "56141202",
      "version": "1.0::apt",
      "url": "https://github.com/thin-edge/thin-edge.io_examples/tree/main/StreamingAnalytics/src/quickstart/project"
    }
  ]
}

reubenmiller commented 1 year ago

Thanks @rus-sag for the details.

Regarding the expected behaviour:

1. Operation terminates with an error.

Yes, the agent should definitely not leave an operation stuck in the EXECUTING state, so that is a bug that we will address.

2. Or, at least allow the user to terminate it via the cancel button.

This suggestion would actually be for the Cumulocity Device Management application rather than thin-edge. I can understand why the UI does not offer to cancel EXECUTING operations: the UI cannot be sure whether the device/agent is actively working on the operation, so it takes a more defensive approach. That being said, when writing agents, this is a fairly common thing to fix. There is usually some unexpected corner case which is not handled by the agent, so it misses updating the operation to FAILED, and this results in the operation being "stuck" in EXECUTING.
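As a rough illustration of the corner case described above, an agent can wrap every operation handler so that any unexpected error still produces a terminal status. This is only a sketch, not thin-edge's actual code: `run_operation` and the handler it receives are hypothetical names, and only the status values come from Cumulocity.

```python
from typing import Callable, Dict


def run_operation(handler: Callable[[], None]) -> Dict[str, str]:
    """Run a handler and always return a terminal status update.

    `handler` is a hypothetical callable that performs the work
    (download, install, ...) and raises on any problem.
    """
    try:
        handler()
    except Exception as err:
        # Deliberately broad: whatever went wrong, the operation must
        # end up FAILED instead of staying in EXECUTING forever.
        return {"status": "FAILED", "failureReason": f"{type(err).__name__}: {err}"}
    return {"status": "SUCCESSFUL"}
```

The returned dictionary is what the agent would then report back for the operation, so an unhandled case such as a download that is not an archive surfaces as a failure in the UI.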

In such corner cases where the agent is clearly not processing the operation anymore, I would recommend using the REST API to find the operation for the device, and then set the status to FAILED. This is the same REST API that the UI is using when cancelling a PENDING operation.
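For anyone who wants to make the REST call described above by hand, here is a minimal sketch using only the Python standard library. It assumes the operation is updated with a PUT to `/devicecontrol/operations/{id}` (the same path that appears in the `self` link of the operation above). The function name `build_fail_request`, the example host, and the `authorization` value are placeholders, so substitute your own tenant URL and credentials.

```python
import json
import urllib.request


def build_fail_request(base_url: str, operation_id: str, reason: str,
                       authorization: str) -> urllib.request.Request:
    """Build (but do not send) a request that marks an operation as FAILED.

    `authorization` is the full Authorization header value for your tenant
    (placeholder here; this sketch does not choose an auth scheme for you).
    """
    body = json.dumps({"status": "FAILED", "failureReason": reason}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/devicecontrol/operations/{operation_id}",
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": authorization,
        },
    )


# Example with placeholder values; nothing is sent until urlopen() is called.
req = build_fail_request(
    "https://example.invalid", "142201", "User cancelled operation", "Bearer <token>"
)
```

Passing `req` to `urllib.request.urlopen` would send the update. Building the request separately makes it easy to inspect the URL and body first.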

Fortunately, there is unofficial CLI tooling like go-c8y-cli (an open source project of mine), which makes the above task easier.

c8y operations list --device 12345 --status EXECUTING | c8y operations update --status FAILED --failureReason "User cancelled operation"

Ruadhri17 commented 1 year ago

I tried to reproduce that bug on version 0.9.0, but everything is working correctly, i.e. when you give a link to the GitHub repo instead of a zipped folder, it produces an error and returns the FAILED status. We no longer use the apama plugin, and the sm-plugin checks for a Debian-format archive when processing the file, which might not have been the case previously.
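To illustrate why a format check catches this kind of mistake, here is a small sketch of one way to test whether downloaded bytes look like a Debian package. It is not the plugin's actual code and `looks_like_deb` is a hypothetical helper. It relies only on the fact that a .deb file is an `ar` archive whose first member is named `debian-binary`.

```python
AR_MAGIC = b"!<arch>\n"            # global header of every ar archive
DEB_FIRST_MEMBER = b"debian-binary"  # first member of a .deb package


def looks_like_deb(data: bytes) -> bool:
    """Return True if `data` starts like a Debian binary package."""
    if not data.startswith(AR_MAGIC):
        return False
    # The first member header begins with a 16-byte name field,
    # padded with spaces (some ar variants also append a '/').
    name_field = data[len(AR_MAGIC):len(AR_MAGIC) + 16]
    return name_field.rstrip(b" /") == DEB_FIRST_MEMBER
```

A GitHub folder URL returns an HTML page, which does not start with the `ar` magic bytes, so a check along these lines rejects it and the operation can be reported as FAILED.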