telefonicaid / iotagent-ul

IoT Agent for a UltraLight 2.0 based protocol (with HTTP, MQTT and AMQP transports)
https://fiware-iotagent-ul.rtfd.io/
GNU Affero General Public License v3.0

Error: socket hang up issue #383

Open ilyasdresden opened 5 years ago

ilyasdresden commented 5 years ago

Hello,

We ran into an issue: Error: socket hang up.

Here was our setup: MQTT (mosquitto) -> IoT Agent UL (FIWARE) -> Orion (FIWARE). We pre-registered 1000 devices and started sending a message over MQTT every 10 seconds with values for all devices. At some point the IoT Agent UL stopped processing incoming data and showed this message in its log. When we ran the same test with 100 devices, it worked well.

What could be the reason for that? Is there a solution?

Thank you.

fgalan commented 5 years ago

At some point it stopped processing incoming data and showed this message in the log of IoT Agent UL

Thus, transferring this issue to the IOTA-UL repository.

ilyasdresden commented 5 years ago

@fgalan the problem is between the IoT Agent and Orion. The MQTT broker receives everything and forwards it to the IoT Agent, but when the IoT Agent tries to open a socket to Orion, it does not work. Could it be that Orion cannot handle the load? What is the capacity of the Orion Context Broker? How many incoming requests can it handle? Thank you!

fgalan commented 4 years ago

It seems to be some kind of problem related to a scarcity of resources.

Are all the components (MQTT broker, IOTA UL and Orion) running on the same system? What about free file descriptors? Can you monitor how many free file descriptors there are in the operating system before you start sending measures, and how they evolve during the test (i.e. free file descriptors in the operating system and file descriptors consumed by IOTA UL and Orion)?

(I'm not sure of the exact command to do that, but I think lsof should provide that information.)
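As an alternative to lsof, the same numbers can be read from the Linux /proc filesystem. The following is only a minimal sketch under that assumption (Linux, and permission to read the target process's /proc entry); the function names `open_fd_count` and `system_file_handles` and the `proc_root` parameter are illustrative and not part of IOTA UL or Orion.

```python
import os
from pathlib import Path
from typing import Tuple


def open_fd_count(pid: int, proc_root: Path = Path("/proc")) -> int:
    """Count the file descriptors currently open by process `pid`."""
    # Each entry in /proc/<pid>/fd is one open descriptor (Linux only).
    return len(os.listdir(proc_root / str(pid) / "fd"))


def system_file_handles(proc_root: Path = Path("/proc")) -> Tuple[int, int]:
    """Return (allocated, maximum) file handles for the whole system."""
    # file-nr holds three numbers: allocated, allocated-but-unused, maximum.
    fields = (proc_root / "sys" / "fs" / "file-nr").read_text().split()
    return int(fields[0]), int(fields[2])
```

Calling `open_fd_count` with the PIDs of the IOTA UL and Orion processes every few seconds during the test, together with `system_file_handles()`, would show whether descriptor usage keeps growing until the error appears.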

AlvaroVega commented 4 years ago

It would be a good idea to test this issue with the latest version of this agent, which includes MQTT improvements.

vijapandey commented 1 year ago

Which version did you use to resolve the issue?

fgalan commented 1 year ago

Which version did you use to resolve the issue?

Not sure what you mean... our recommendation is always to use the newest version available at the time of the test. In this case, the newest version of IOTA-UL (at present) is 1.24.0.

fgalan commented 1 year ago

Anyway, as I explained in my previous comment, I think this is not actually a problem in the IOTA-UL software, but a problem in the environment where it runs, due to scarcity of resources or connectivity problems.

pratappulugoru commented 1 year ago

@fgalan we are still seeing this issue even though the pods are running with a good amount of resources. The memory and CPU utilization of the pods is very low. Can you please advise on what the issue may be?

fgalan commented 1 year ago

To help with this, we would need a deterministic way of reproducing the problem. If you could provide that information, we could have a look.

pratappulugoru commented 1 year ago

@fgalan it's very simple. Please find below the FIWARE stack configuration, running in EKS.

Services: iotagent-node-lib 2.18.0, iotagent-json 1.20.0, Orion 3.4.0.

EKS pod system configuration: Orion memory: 0.5 Gi. Pubsub pod count: min 1, max 10. IoT Agent pod count: 1.

Simulator configuration: send the telemetry messages using any simulator, such as JMeter or a Python program. Number of devices registered: 840. Messages/min: 1900. Duration: 2 hrs.

With the above configuration, we can see the same behaviour
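For reference, 1900 messages per minute is roughly one message every 31.6 ms, or 228,000 messages over the 2-hour run. A Python simulator for that load could be sketched as follows. This is a hedged illustration only: `measure_schedule` is a hypothetical helper, the round-robin ordering across devices is an assumption, the payload reuses the `abc` attribute from the provisioning example in this thread, and the actual publishing step (MQTT or HTTP client) is deliberately left out.

```python
import json
from itertools import cycle, islice
from typing import Iterator, Sequence, Tuple


def measure_schedule(
    device_ids: Sequence[str], msgs_per_min: float, duration_s: float
) -> Iterator[Tuple[float, str, str]]:
    """Yield (offset_seconds, device_id, json_payload) at a fixed overall rate."""
    interval = 60.0 / msgs_per_min                 # seconds between messages
    total = int(duration_s * msgs_per_min / 60)    # messages in the whole run
    # Round-robin over the devices so the load is spread evenly (assumption).
    for i, device_id in enumerate(islice(cycle(device_ids), total)):
        payload = json.dumps({"abc": str(i)})      # "abc" is a placeholder attribute
        yield i * interval, device_id, payload


# Example: 840 devices, 1900 msgs/min; print the first few minute-one entries.
devices = [f"device{n:04d}" for n in range(840)]
for offset, device_id, payload in islice(measure_schedule(devices, 1900, 60), 3):
    print(f"t+{offset:.3f}s {device_id} {payload}")
```

A real simulator would sleep until each offset and hand `device_id` and `payload` to whatever client publishes to the MQTT broker or the IoT Agent's HTTP endpoint.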

pratappulugoru commented 1 year ago

@fgalan any update on this? I have reported the same issue in Orion as well. Please see the link below: https://github.com/telefonicaid/fiware-orion/issues/4429 Any help is highly appreciated.

pratappulugoru commented 1 year ago

@fgalan as requested, I have attached all our deployment configuration files. Please go through them and let me know if any changes are required to solve the socket error issue. iot-agent.txt mongodb-mask.txt orion-qa-masked.txt

mapedraza commented 1 year ago

Could you please share a step by step guide to reproduce it? This means:

  1. A docker-compose.yml file containing all the services deployed or involved in your test case, with all the environment variables and config files related to those services
  2. The script/service/tool to send the requests
  3. The curl requests used to create the entities on the Context Broker and/or to provision the device group or devices in the IoT Agent

If you do not provide this, it would be impossible for us to reproduce the issue and give you support.

Thanks in advance

pratappulugoru commented 1 year ago

@mapedraza I have attached all the YAML files above. I hope you have seen those .txt files. If you have not, please go through them and let me know if anything else is required. I would like your suggestions on the environment variables used in the Orion deployment. We are seeing this issue under high load. Do I need to change any environment variables for high loads and the multi-service use case? Can you please check and advise on that?

pratappulugoru commented 1 year ago

@mapedraza Please find the details you requested.

1) A docker-compose.yml file containing all the services deployed or involved in your test case, with all your environment variables and config files related to those services
Ans) I have attached all the configuration files in the messages above.

2) The script/service/tool to send the requests
Ans) You can use JMeter or any Python program to simulate the load.

3) Curl request to provision to create the entities on the context broker and/or device group or devices provision for the IoT Agent
Ans) IoT Agent curl command

curl --location --request POST 'http://<iotagent-mgr-nlb>:8082/iot/devices' \
--header 'Content-Type: application/json' \
--header 'fiware-service: <service1>' \
--header 'fiware-servicepath: <servicepath1>' \
--data-raw '{
    "devices": [
        {
            "protocol": "standard",
            "entity_type": "xxx",
            "device_id": "yyyyyyy",
            "entity_name": "urn:ngsi-ld:xxx:yyyyyyy",
            "attributes": [
                {
                    "name": "abc",
                    "metadata": {
                        "unitText": {
                            "type": "Text"
                        }
                    },
                    "type": "Text",
                    "object_id": "abc"
                },
               ........
            ]
        }
    ]
}'

Orion curl command

curl --location --request POST 'http://<orion-nlb>:1026/v2/entities' \
--header 'Content-Type: application/json' \
--header 'fiware-service: <service1>' \
--header 'fiware-servicepath: <servicepath1>' \
--data-raw '{
    "devices": [
        {
            "protocol": "standard",
            "entity_type": "xxxx",
            "device_id": "yyyyyyy",
            "entity_name": "urn:ngsi-ld:xxxx:yyyyyyy",
            "attributes": [
                {
                    "name": "abc",
                    "metadata": {
                        "unitText": {
                            "type": "Text"
                        }
                    },
                    "type": "Text",
                    "object_id": "abc"
                },
                .....
            ]
        }
    ]
}'

mapedraza commented 1 year ago

Thanks for your answer, @pratappulugoru.

I see a couple of things from your answer: