Azure / azure-sdk-for-go

This repository is for active development of the Azure SDK for Go. For consumers of the SDK we recommend visiting our public developer docs at:
https://docs.microsoft.com/azure/developer/go/

Azure IoT Hub device-2-cloud messages are not received by cloud #4125

Closed v1r7u closed 11 months ago

v1r7u commented 5 years ago

We are experiencing an issue with Azure IoT Hubs created by Terraform, which uses the Azure SDK for Go.

Issue Description

One of the Azure IoT Hub features is a secure communication channel between devices and the cloud, which can be used with direct methods:

  1. cloud side invokes direct methods to send commands or query data from devices
  2. devices receive these requests or commands
  3. devices send device-2-cloud messages
  4. cloud listens to the messages from step 3.

This flow works fine for IoT Hubs created from the Azure Portal or with azure-cli. But step 4 does not work for Azure IoT Hubs created by Terraform: events sent from devices are not visible on the cloud side.

References

  1. I opened an issue in the Terraform Azure provider repository: https://github.com/terraform-providers/terraform-provider-azurerm/issues/2919

  2. I created a github repository with samples to reproduce the problem: https://github.com/deepnetworkgmbh/azure-iot-directmethod-sample

  3. We have already had two Azure support requests about the issue, but the engineer only confirmed that the issue does not exist for ARM templates or az cli. So it is likely to be on the side of one of the following:

    • Terraform Azure provider
    • Azure SDK for go
    • Azure Resource Manager, that processes requests from the go sdk.

Questions

Could you help me to confirm that there is no problem with go sdk itself?

I think, there might be 3 outcomes:

  1. An IoT Hub created by the go sdk works fine. Then it is very likely a terraform provider issue.
  2. An IoT Hub created by the go sdk has the same issue as described, but the code is fine. Then it is likely an Azure Resource Manager problem.
  3. There is a potential bug in the go-sdk codebase.

jhendrixMSFT commented 5 years ago

While I don't know anything about IoT Hubs I'll help as best I can. Is it possible to compare what's created by CLI and Terraform to see if there are differences? I suspect the next step is to get Terraform looking at this to confirm/deny if there's a bug in how they create IoT Hubs.

tombuildsstuff commented 5 years ago

@jhendrixMSFT we've not had time to look into this from our side yet, but I'd agree that we probably need to pass some extra fields here/use a different API version to enable this functionality.

Since this is a feature enhancement for Terraform (and isn't yet a feature request/bug in the SDK) - is it worth closing this in favour of https://github.com/terraform-providers/terraform-provider-azurerm/issues/2919 for the moment, until we've had a chance to investigate this further?

v1r7u commented 5 years ago

@tombuildsstuff are you sure that this is an enhancement? I haven't seen such a property in either ARM templates or the Azure CLI. Maybe I'm missing something.

Unfortunately, I haven't used Go anywhere except A Tour of Go, so it's a bit tough for me to get through the code and work out how to use it. @jhendrixMSFT, if you can help me create a basic IoT Hub instance using the pure go-sdk, we can check whether the device-2-cloud channel works with it. Maybe a code sample that I can execute on my side?

tombuildsstuff commented 5 years ago

@v1r7u

@tombuildsstuff are you sure that this is an enhancement? I haven't seen such a property in either ARM templates or the Azure CLI. Maybe I'm missing something.

I'm not, without further investigation, unfortunately.

The ARM template in your example uses API version 2018-04-01 whereas we use version 2018-12-01-preview. As such, it's likely this functionality requires some work to support (for example, perhaps the new API version has this behaviour disabled by default, or requires a separate call to enable it), which would make this an enhancement rather than a bug. It requires further investigation to confirm either way.

Thanks!

jhendrixMSFT commented 5 years ago

@v1r7u here's a simple Go program to create a basic IoT hub: https://gist.github.com/jhendrixMSFT/55054fb733bf89304b24627024cc3baa It will create an Azure authorizer from your environment, so you need to set the following environment variables before running it:

AZURE_SUBSCRIPTION_ID
AZURE_CLIENT_ID
AZURE_CLIENT_SECRET
AZURE_TENANT_ID

v1r7u commented 5 years ago

@jhendrixMSFT many thanks! I'll do that during the week

v1r7u commented 5 years ago

@jhendrixMSFT I tried your snippet and resulting IoT Hub works fine with my tests 👍

I also briefly looked at the code that creates the IoT Hub in the Terraform provider. The only part that looks suspicious to me is the expandIoTHubEndpoints method: it creates an empty slice for the event hub endpoints.

I'll try to reproduce the same approach with empty event-hub slice/array in

I'll write results eventually :)

v1r7u commented 5 years ago

And BANG!

I reproduced the same behavior for IoT Hub created by ARM template. Commit with a change to my samples-repository is here

To summarize my understanding of the problem:

  1. By default, Azure Resource Manager creates an IoT Hub with built-in routing of device-2-cloud events. This route is not displayed in the portal.
  2. Users can specify additional routing to other resources: Storage Accounts, Event Hubs, etc. If the user sets the endpoint properties to empty slices (arrays), Azure Resource Manager creates an IoT Hub that looks exactly like a normal one, but without the built-in routing for device-to-cloud events.
  3. The same applies not only to the creation of new IoT Hubs, but also to updates of existing ones.
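
The difference between leaving the endpoint properties out and sending them as empty arrays (item 2) can be sketched in Go with `encoding/json`. This is only an illustration: `endpoint` and `routingEndpoints` are hypothetical stand-ins, not the SDK's real models, and the pointer-to-slice field with `omitempty` is an assumption about how such a request model might be declared. What Azure Resource Manager does with each payload is as described in this thread, not something the sketch verifies.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// endpoint is a hypothetical stand-in for a routing endpoint model.
type endpoint struct {
	Name string `json:"name"`
}

// routingEndpoints is a hypothetical request model. The pointer-to-slice
// field with omitempty is an assumption made for this illustration.
type routingEndpoints struct {
	EventHubs *[]endpoint `json:"eventHubs,omitempty"`
}

// encode marshals the model and returns the JSON as a string.
func encode(r routingEndpoints) string {
	b, err := json.Marshal(r)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// Nil pointer: omitempty drops the field, so the service sees no value.
	fmt.Println(encode(routingEndpoints{})) // prints {}

	// Pointer to an empty slice: the pointer is non-nil, so the field is
	// sent as an explicit empty array.
	empty := []endpoint{}
	fmt.Println(encode(routingEndpoints{EventHubs: &empty})) // prints {"eventHubs":[]}
}
```

Under these assumptions, a caller that always builds an empty slice sends `"eventHubs":[]`, while a caller that leaves the field nil sends nothing for it.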

Now it looks like the problem is on the Azure Resource Manager side, which creates or updates IoT Hubs. Do you have any idea how to send a bug report to that team?

jhendrixMSFT commented 5 years ago

I will find somebody from the service team to look into this, but I suspect this is by design, i.e. specifying empty routes is probably how one would express removal of endpoints.

v1r7u commented 5 years ago

The problem with this design is that it also removes a route (my favorite device-2-cloud events) that is outside the user's control. The worst part is that this removal cannot be tracked from the end-user perspective: the route is not displayed either before or after the action.

lilyjma commented 4 years ago

@v1r7u do you still experience the problem?

v1r7u commented 4 years ago

@lilyjma I came here as a result of the azurerm terraform thread: https://github.com/terraform-providers/terraform-provider-azurerm/issues/2919. At the moment, we've already found several workarounds.

Honestly, I do not know what the current behavior of the golang library is.

One year ago, I was surprised that ARM templates and the golang library behave differently with default values. The entire built-in event-hub endpoint looks like a kind of magic, and to operate it you need arcane knowledge...

If you're in the process of triaging the issue: it no longer has a huge impact on me. Back then, it cost me a lot of working hours, and those were not the most pleasant moments of my life :) But now I know how to overcome it.

What sits on top of my mind:

Please let me know if I can help further.

tadelesh commented 11 months ago

We have retired support for Azure SDK for Golang libraries which do not conform to our current Azure SDK guidelines (see announcement). Please migrate to the latest version according to the migration guide. If you can still repro this problem, please submit a support ticket in Azure directly.