Azure / iot-edge-v1

Azure IoT Edge
http://azure.github.io/iot-edge/

[V1] Gateway not starting after adding 511+ modules #655

Closed AnkitPtl closed 5 years ago

AnkitPtl commented 6 years ago

Environment :

Description :

Logs :

Error: Time:Wed Sep 12 09:23:50 2018 File:C:\agent_work\2\s\iot-edge\core\src\broker.c Func:start_module Line:343 module receive socket create failed
Error: Time:Wed Sep 12 09:23:50 2018 File:C:\agent_work\2\s\iot-edge\core\src\broker.c Func:Broker_AddModule Line:514 start_module failed
Error: Time:Wed Sep 12 09:23:50 2018 File:C:\agent_work\2\s\iot-edge\core\src\gateway_internal.c Func:gateway_addmodule_internal Line:461 Failed to add module to the gateway's broker.
Error: Time:Wed Sep 12 09:26:00 2018 File:C:\agent_work\2\s\iot-edge\core\src\gateway_createfromjson.c Func:Gateway_CreateFromJson Line:78 Failed to create gateway using lower level library.
Error: Time:Wed Sep 12 09:26:00 2018 File:C:\agent_work\2\s\src\gw\src\main.c Func:main Line:50 An error occurred while creating the gateway.

damonbarry commented 6 years ago

From the logs, it seems that nanomsg (used by our broker for communications between modules) has run out of sockets, so the code to create a socket for the 512th module fails.

I did some digging, and it looks like nanomsg has a compile-time limit of 512 sockets. Internally, we use one socket to publish messages to all modules, and then one socket per module to receive messages. So the 512th socket is created for the 511th module, which leaves no socket available for the 512th module. I haven’t reproduced your scenario to confirm my theory, but this is likely the cause.
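
The arithmetic behind that limit can be sketched as follows. This is only an illustration in Python, assuming the default compile-time limit of 512 and the socket layout described above (one shared publish socket plus one receive socket per module). The function name `max_modules` is hypothetical and is not part of nanomsg or IoT Edge.

```python
def max_modules(socket_limit: int, shared_sockets: int = 1) -> int:
    """Return how many modules fit under a socket limit.

    Assumes each module needs exactly one receive socket and the broker
    holds `shared_sockets` sockets (one publish socket by default).
    """
    if socket_limit < shared_sockets:
        raise ValueError("socket limit is smaller than the broker's own sockets")
    return socket_limit - shared_sockets


# With the assumed default limit of 512 sockets, the last module that can
# be added is number max_modules(512); the next one has no socket left.
print(max_modules(512))
```

Under these assumptions, raising the limit raises the module ceiling by the same amount, always one below the socket limit.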

The nanomsg socket limit is configurable at build time, so you could figure out what the real socket limit is on your platforms, then build nanomsg for each platform with the platform-specific limit value. The value is NN_MAX_SOCKETS, and you’d set it as a CMake cache entry on the command line via "-DNN_MAX_SOCKETS=" when you build nanomsg.
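
If you script your builds, the configure step might be assembled roughly like this. It is a sketch, not a tested build recipe: the helper name `nanomsg_configure_command`, the source directory `nanomsg`, and the example value 1024 are all placeholders, and only the `-DNN_MAX_SOCKETS=` cache entry comes from the description above. Use whatever limit you determine for your platform.

```python
import shlex


def nanomsg_configure_command(max_sockets: int, source_dir: str = "nanomsg") -> list[str]:
    """Build the CMake configure command line for a given socket limit.

    `source_dir` is a placeholder for wherever the nanomsg sources live.
    """
    if max_sockets < 1:
        raise ValueError("max_sockets must be a positive integer")
    # NN_MAX_SOCKETS is passed as a CMake cache entry on the command line.
    return ["cmake", f"-DNN_MAX_SOCKETS={max_sockets}", source_dir]


# 1024 is an arbitrary example value, not a recommendation.
command = nanomsg_configure_command(1024)
print(shlex.join(command))
```

The command is returned as an argument list rather than a single string so it can be handed to `subprocess.run` without shell quoting concerns.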

damonbarry commented 5 years ago

No activity, closing. If you've tried building nanomsg with an NN_MAX_SOCKETS value appropriate for your platform and you're still seeing this problem, please reopen. Thanks!