Azure / azure-iot-hub-python

Azure IoT Hub Data Plane Python SDK
MIT License

Reauth issue with more than 4 running containers #9

Open gedemagt opened 1 year ago

gedemagt commented 1 year ago

I have this strange issue, which I only encounter when I have a certain number of running processes.

The setup is a VMware ESXi virtual machine running Debian 11 with Docker on top. I am running devices in Docker containers, each of which basically contains an Azure IoT Device client together with a bunch of logic. Three of these containers work fine, but when I start up the fourth container, it connects fine and then fails after 1 hour when it tries to re-authenticate (this only happens on the last started container):

22-11-22 10:30:02[ERROR] Exception caught in background thread.  Unable to handle.
22-11-22 10:30:02[ERROR] ["azure.iot.device.common.pipeline.pipeline_exceptions.OperationCancelled: OperationCancelled('Transport timeout on connection operation')\n"]
Exception in thread Thread-410:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.9/threading.py", line 1306, in run
    self.function(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.9/site-packages/azure/iot/device/common/pipeline/pipeline_stages_base.py", line 466, in retry_reauthorize
    this._reauthorize()
  File "/usr/local/lib/python3.9/site-packages/azure/iot/device/common/pipeline/pipeline_thread.py", line 192, in wrapper
    assert (
AssertionError:
            Function _reauthorize is not running inside pipeline thread.
            It should be. You should use invoke_on_pipeline_thread(_nowait) to enter the
            pipeline thread before calling this function.  If you're hitting this from
            inside a test function, you may need to add the fake_pipeline_thread fixture to
            your test.  (generally applied on the global pytestmark in a module)
22-11-22 10:31:02[DEBUG] OperationCancelled('Could not complete operation') caused by OperationCancelled('Transport timeout on connection operation')
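
As far as I can read the traceback, the retry is fired from a `threading.Timer` (the `self.function(*self.args, **self.kwargs)` frame in `threading.py`), so `_reauthorize` ends up running on the timer's own thread and trips the pipeline-thread assertion. Below is a minimal, standard-library-only sketch of that pattern. It is not the SDK's code: the decorator, the thread name `pipeline`, and all function names are hypothetical stand-ins I made up to illustrate the mechanism.

```python
import threading

# Hypothetical thread name used only for this illustration (not the SDK's).
PIPELINE_THREAD_NAME = "pipeline"


def runs_on_pipeline_thread(func):
    """Hypothetical stand-in for a thread-affinity check on a function."""

    def wrapper(*args, **kwargs):
        current = threading.current_thread().name
        assert current == PIPELINE_THREAD_NAME, (
            f"Function {func.__name__} is not running inside pipeline thread "
            f"(running on {current!r})"
        )
        return func(*args, **kwargs)

    return wrapper


@runs_on_pipeline_thread
def reauthorize():
    # Placeholder for the real work; only the thread check matters here.
    return "reauthorized"


def call_from_timer():
    """Call reauthorize() from a threading.Timer and report what happened."""
    outcome = {}

    def callback():
        try:
            outcome["result"] = reauthorize()
        except AssertionError as exc:
            outcome["error"] = str(exc)

    timer = threading.Timer(0.01, callback)  # runs callback on its own thread
    timer.start()
    timer.join()
    return outcome


def call_from_pipeline_thread():
    """Call reauthorize() from a thread carrying the expected name."""
    outcome = {}

    def target():
        outcome["result"] = reauthorize()

    worker = threading.Thread(target=target, name=PIPELINE_THREAD_NAME)
    worker.start()
    worker.join()
    return outcome


if __name__ == "__main__":
    print(call_from_timer())            # contains an "error" entry
    print(call_from_pipeline_thread())  # contains a "result" entry
```

In this sketch the timer callback always fails the check, while the same call succeeds from the correctly named thread. That would match the assertion in the log, but it does not explain why the connection operation timed out in the first place.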

I am running: