eclipse-iceoryx / iceoryx

Eclipse iceoryx™ - true zero-copy inter-process-communication
https://iceoryx.io
Apache License 2.0

Reading RadarObject with a string member crashes in CentOS7 #2176

Closed afpgit closed 9 months ago

afpgit commented 9 months ago

Required information

Operating system: CentOS7 / Windows 11

Compiler version: GCC 11.2.1, VS2022 Community

Eclipse iceoryx version: v2.0.3

Observed result or behaviour: Subscriber crashes on CentOS7 with a segmentation fault. Subscriber works successfully on Windows.

Conditions where it occurred / Performed steps: Just modified the publisher/subscriber examples in the iceoryx/iceoryx_examples/icehello/ folder from:

struct RadarObject
{
    double x = 0.0;
    double y = 0.0;
    double z = 0.0;
};

to:

struct RadarObjectWithString
{
    double x = 0.0;
    double y = 0.0;
    double z = 0.0;
    std::string s{"string"};
};

Adding this std::string member causes a segmentation fault on CentOS7 when I try to read the string in the subscriber using takeResult.value()->s. The values of x, y, and z are still perfectly accessible on CentOS7 using takeResult.value()->x, and so on.

However, the same code works successfully in Windows and I can print the value of s using takeResult.value()->s as well as the rest of the struct members.

elfenpiff commented 9 months ago

@afpgit This is expected. The types you are allowed to use in zero-copy communication must satisfy multiple properties:

  1. They must be self-contained, i.e. they are not allowed to use the heap
  2. They shall not use vtables
  3. They are not allowed to use pointers in their internal structure

std::string violates 1. and 3. But no worries, you can use iox::string as an alternative. Your code would then look like:

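// Note: on the v2.0.x releases the equivalent type is iox::cxx::string from "iceoryx_hoofs/cxx/string.hpp"
#include "iox/string.hpp"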

struct RadarObjectWithString {
  double x = 0.0;
  double y = 0.0;
  double z = 0.0;
  iox::string<128> s{"string"}; // fixed-size string with a maximum capacity of 128 characters (a default member initializer needs braces, not parentheses)
};
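To illustrate what "fixed size" means here, the following is a minimal sketch of a string type that keeps its characters inside the object itself. It is not the iceoryx implementation; the names FixedString, RadarObjectWithFixedString and fixed_string_demo are hypothetical and exist only for this example. It assumes truncation is an acceptable way to handle input that is too long.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical illustration only; NOT the iceoryx string implementation.
// All characters are stored in a member array, so the object owns no heap
// memory and holds no pointers.
template <std::size_t Capacity>
class FixedString
{
  public:
    FixedString() = default;

    // Copies at most Capacity characters; longer input is truncated.
    // 'text' must be a valid null-terminated string.
    explicit FixedString(const char* text)
    {
        m_size = std::strlen(text);
        if (m_size > Capacity)
        {
            m_size = Capacity;
        }
        std::memcpy(m_data, text, m_size);
        m_data[m_size] = '\0';
    }

    const char* c_str() const { return m_data; }
    std::size_t size() const { return m_size; }
    static constexpr std::size_t capacity() { return Capacity; }

  private:
    char m_data[Capacity + 1] = {}; // storage lives inside the object (+1 for the terminator)
    std::size_t m_size = 0;
};

struct RadarObjectWithFixedString
{
    double x = 0.0;
    double y = 0.0;
    double z = 0.0;
    FixedString<128> s{"string"};
};

// Small usage demo; call it from main() to run the checks.
inline void fixed_string_demo()
{
    RadarObjectWithFixedString sample;
    assert(std::strcmp(sample.s.c_str(), "string") == 0);

    FixedString<3> truncated{"abcdef"}; // only "abc" fits
    assert(truncated.size() == 3);
}
```

The price of this design is that every sample reserves the full capacity, whether or not the string uses it.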

The reason these requirements must be met strictly is that every process has its own address space, which cannot be accessed from another process. If your data structure uses the heap, the heap address is not accessible from another process, and dereferencing it there causes a segfault. If you use vtables, those tables are essentially function pointers, which are likewise not valid in another process's address space. And if your construct uses pointers internally, those pointers become invalid in another process's address space due to address randomization. For instance, if the address space of process A spans 0x10 ... 0x2020 and that of process B spans 0xbebe ... 0xcafe, a pointer to memory location 0x12 is valid only in process A; in process B it would point to an invalid memory address.
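The heap and vtable points can be made visible with standard C++ alone. The sketch below is only an illustration, not iceoryx code; points_inside, InlineText, Base and the two check functions are hypothetical names. It assumes a 1000-character std::string is too long for any small-string buffer, which holds for the common standard library implementations.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>

// Hypothetical helper: true if address p lies within the bytes occupied by obj.
template <typename T>
bool points_inside(const T& obj, const void* p)
{
    const auto begin = reinterpret_cast<std::uintptr_t>(&obj);
    const auto addr = reinterpret_cast<std::uintptr_t>(p);
    return addr >= begin && addr < begin + sizeof(T);
}

// Self-contained: the characters are part of the object.
struct InlineText
{
    char text[128] = "string";
};

// A type with a virtual function carries a hidden vtable pointer.
struct Base
{
    virtual ~Base() = default;
};

static_assert(std::is_polymorphic<Base>::value, "Base uses a vtable");
static_assert(!std::is_polymorphic<InlineText>::value, "InlineText does not");

// The characters of a long std::string live on the heap, outside the object,
// so copying the object's bytes into shared memory would not copy them.
inline bool long_std_string_is_external()
{
    const std::string s(1000, 'a');
    return !points_inside(s, s.data());
}

// The characters of InlineText live inside the object itself.
inline bool inline_text_is_internal()
{
    InlineText t;
    return points_inside(t, t.text);
}

// Small usage demo; call it from main() to run the checks.
inline void storage_demo()
{
    assert(long_std_string_is_external());
    assert(inline_text_is_internal());
}
```

A subscriber in another process receives only the bytes of the object, so the std::string arrives with a pointer into the publisher's heap, while the inline array arrives complete.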

mossmaurice commented 9 months ago

@afpgit You can find the above information, and much more, in the official documentation: https://iceoryx.io/latest/getting-started/overview/#restrictions. A good read on shared memory, which visualizes what @elfenpiff described, can be found here: https://github.com/eclipse-iceoryx/iceoryx/blob/v2.0.5/doc/shared-memory-communication.md

elBoberido commented 9 months ago

Closing the issue. elfenpiff and mossmaurice already answered the question.