eProsima / Micro-XRCE-DDS-Client

Micro XRCE-DDS Client repository. Looking for commercial support? Contact info@eprosima.com
Apache License 2.0
134 stars 84 forks source link

Fix POSIX transport profiles #321

Open jseldent opened 2 years ago

jseldent commented 2 years ago

This pull requests adds proper handling of EINTR to the UDP and TCP transport profiles on POSIX platforms.

Some platforms, like the FreeRTOS simulator for Linux, generate a lot signals. This causes blocking I/O functions (like connect(), poll(), send() and recv()) to fail with errno == EINTR. Currently, Micro-XRCE-DDS-Client assumes this means the connection has been lost. Instead, it should restart the I/O operation and try again. This pull request implements the auto-restart behavior.

Additionally, #318 contains a bug in getaddrinfo() (the socket type for TCP should be SOCK_STREAM instead of SOCK_DGRAM). Also, it didn't register a signal handler for SIGPIPE, like the polling TCP transport does. This pull request fixes both of those issues.

I've tested these changes with the FreeRTOS simulator for POSIX and they fix the communication issues we had there.

richiprosima commented 2 years ago

Build status:

jseldent commented 2 years ago

The proposed implementation restarts blocking operations like poll() and recv() (with SO_RCVTIMEO) with the same timeout after an EINTR interrupt. This could lead to uxr_read_udp_data_platform()/uxr_read_tcp_data_platform() waiting for longer than the specified timeout.

This could be prevented by setting the timeout to 0 after EINTR. But that could cause those functions to return early. Which would you prefer?

pablogs9 commented 2 years ago

We need uxr_read_*_data_platform to wait at maximum the specified time in the timeout if no data arrives.

jseldent commented 2 years ago

Ok. I've pushed a commit that sets the timeout to 0 after the first occurrence of an EINTR interrupt.

richiprosima commented 2 years ago

Build status:

pablogs9 commented 2 years ago

Could you explain the purpose of this last commit?

Acuadros95 commented 2 years ago

You can also recalculate the timeout as we do here: https://github.com/eProsima/Micro-XRCE-DDS-Client/blob/160fc8529cc0d44360d4cecda46f2efcb34ac40f/src/c/core/session/session.c#L335-L344

jseldent commented 2 years ago

Let's say the timeout is 1 second. And after half a second a signal is received and poll() (or recv()) returns with errno == EINTR. If we then restart those functions with the same timeout, and no data is available, the uxr_read_*_data_platform() function will take 1.5 seconds. Which exceeds the timeout. By setting the timeout to 0 after the interrupt, the function returns after 0.5 seconds. Which is too soon, but at least not too late.

If you prefer, I can add calls to uxr_millis() to compute the remaining time in case of an interrupt. But since that already happens at the higher layers, I'm not sure how useful it would be to do it also in the low-level I/O functions.

pablogs9 commented 2 years ago

Yes please, make it block the required time.

jseldent commented 2 years ago

I've pushed a fix to compute the remaining time.

richiprosima commented 2 years ago

Build status:

pablogs9 commented 2 years ago

image

jseldent commented 2 years ago

I've included <uxr/client/util/time.h>. That should fix it.

richiprosima commented 2 years ago

Build status:

pablogs9 commented 2 years ago

@Acuadros95 can we test this in integration tests?

Acuadros95 commented 2 years ago

Integration tests running on https://github.com/eProsima/Micro-XRCE-DDS/pull/144

Tests are OK!

jseldent commented 2 years ago

I just pushed another commit to fix the non-polling versions of the UDP and TCP transport profiles. To behave identically to the polling versions, receive timeouts should not be treated as errors.

richiprosima commented 2 years ago

Build status:

richiprosima commented 2 years ago

Build status:

jseldent commented 2 years ago

Build status:

* Linux [![Build Status](https://camo.githubusercontent.com/c15842e2450c554bdb272b8fe354acf9084186bd1edb598bcc3c2dde942d7ae0/687474703a2f2f6a656e6b696e732e6570726f73696d612e636f6d3a383038302f6275696c645374617475732f69636f6e3f6a6f623d4d6963726f2d585243452d4444532d436c69656e742532304d616e75616c2532304c696e7578266275696c643d353636)](http://jenkins.eprosima.com:8080/job/Micro-XRCE-DDS-Client%20Manual%20Linux/566/)

* Windows [![Build Status](https://camo.githubusercontent.com/24b8abbe75f147e4209df59fa6aede6d860c816e12e65bd42cd6bee158f698e0/687474703a2f2f6a656e6b696e732e6570726f73696d612e636f6d3a383038302f6275696c645374617475732f69636f6e3f6a6f623d4d6963726f2d585243452d4444532d436c69656e742532304d616e75616c25323057696e646f7773266275696c643d353834)](http://jenkins.eprosima.com:8080/job/Micro-XRCE-DDS-Client%20Manual%20Windows/584/)

How can I see the error here? I can't find why the build is failing.

Acuadros95 commented 2 years ago

Click the tag and go to the Console Output tag.

In this case uncrustify is failing on the following files:

FAIL: src/c/profile/transport/ip/tcp/tcp_transport_posix_nopoll.c (File size changed from 4050 to 4054)
FAIL: src/c/profile/transport/ip/udp/udp_transport_posix_nopoll.c (File size changed from 3620 to 3624)

You can use this uncrustify.cfg to fix the style: https://raw.githubusercontent.com/eProsima/cpp-style/master/uncrustify.cfg

richiprosima commented 2 years ago

Build status:

jseldent commented 2 years ago

@Acuadros95, thanks. I've fixed the issues.