Clarification on Documentation: iceoryx-enabled pre-built and zero-copy mode

chengguizi commented 2 years ago

I am quite excited to see that eCAL 5.10 includes iceoryx as a possible inter-process communication middleware, which could theoretically achieve zero-copy transport (through shared memory)!

I have two questions after reading the documentation:

1. ECAL_LAYER_ICEORYX is turned off by default (link). If I install the eCAL library from the PPA on Linux, am I getting a version with iceoryx or not? If not, is there a pre-built version that has iceoryx turned on?
2. The documentation here explains zero-copy mode, as well as its blocking limitation. Does this refer to eCAL's own SHM implementation, or to the iceoryx version?

For point 2, I would assume that if iceoryx is enabled, this blocking limitation shouldn't apply, am I right? I read a bit about the iceoryx internals here.

Thanks for maintaining eCAL :)

chengguizi commented 2 years ago

To add to the thread, I noticed two interesting APIs:

https://github.com/eclipse-ecal/ecal/blob/b4875ce42932db0ba31c9ce45b2f358f192dad3c/ecal/core/include/ecal/ecal_publisher.h#L244

https://github.com/eclipse-ecal/ecal/blob/b4875ce42932db0ba31c9ce45b2f358f192dad3c/ecal/core/include/ecal/ecal_publisher.h#L215

If I understand correctly, these two APIs only apply to the built-in shared memory layer. How would they take effect if ECAL_LAYER_ICEORYX is turned on?

I am mainly interested in the one-publisher-many-subscribers use case, with large payload (high resolution images).

FlorianReimold commented 2 years ago

Hi HuiminC,

If you are installing from a PPA, you are installing an eCAL version without iceoryx. eCAL provides its own shared memory layer that is easier to use than iceoryx, e.g. you don't need to know your maximum message size beforehand and you don't need an orchestrator.

For using iceoryx you need to compile your own eCAL. The documentation will get you started on how to do that: https://eclipse-ecal.github.io/ecal/development/build_ecal.html

You can also check out this GitHub action, which builds eCAL with iceoryx: https://github.com/eclipse-ecal/ecal/blob/master/.github/workflows/build-ubuntu-iceoryx.yml
If you are on Ubuntu 20.04, you can also check whether the build artifacts produced by that action fit your needs.

The eCAL Zero Copy mode refers to the eCAL Shared Memory Layer. If you are using a pre-built version of eCAL, you can use that feature, as the Shared Memory Layer is the default one. If you are using iceoryx, those parameters don't have any effect. At the moment, the eCAL Zero Copy mode has 2 limitations:
- Reading / working on the data blocks the publisher, as the shared memory segment is locked. You can easily work around this limitation by using the multi-buffering feature.
- In a many-subscriber environment, currently only 1 subscriber can work on the data at the same time. This may disqualify the feature for your use-case, I guess. We are working on a shared read-write-lock. Hopefully, eCAL 5.11 will get rid of this limitation.
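
The first limitation and its multi-buffering workaround can be pictured with a small toy model. This is only a sketch to illustrate the idea, not eCAL's implementation: the `BufferRing` class and its methods are hypothetical names, and a plain boolean flag stands in for the lock on a shared memory segment.

```python
class BufferRing:
    """Toy model of a publisher cycling through `count` memory buffers.

    Hypothetical illustration only; it does not use or mirror the eCAL API.
    """

    def __init__(self, count):
        self.payloads = [None] * count
        self.held_by_reader = [False] * count
        self.next_write = 0

    def try_publish(self, payload):
        """Write into the next buffer and return its index, or return None
        if a reader still holds it (a real publisher would block here)."""
        index = self.next_write
        if self.held_by_reader[index]:
            return None
        self.payloads[index] = payload
        self.next_write = (index + 1) % len(self.payloads)
        return index

    def acquire(self, index):
        """Reader starts working on a buffer in place (no copy)."""
        self.held_by_reader[index] = True
        return self.payloads[index]

    def release(self, index):
        """Reader is done; the publisher may reuse the buffer."""
        self.held_by_reader[index] = False


# One buffer: the publisher cannot write while the reader works on it.
single = BufferRing(1)
first = single.try_publish(b"frame-1")
single.acquire(first)
blocked = single.try_publish(b"frame-2")

# Two buffers: the next write goes to the other buffer instead.
double = BufferRing(2)
first = double.try_publish(b"frame-1")
double.acquire(first)
second = double.try_publish(b"frame-2")
third = double.try_publish(b"frame-3")  # wraps around to the held buffer
```

In this model, extra buffers only help as long as the reader releases its buffer before the ring wraps around, so the buffer count would have to be chosen with the slowest subscriber in mind.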

Have you already tried eCAL without ZeroCopy and can provide some performance measurements? I would be interested in those!

Best Regards Florian

chengguizi commented 2 years ago

Hi Florian,

Thanks for the prompt reply!

Yes, I am on Ubuntu 20.04, so I could first try out the action's artifacts: https://github.com/eclipse-ecal/ecal/actions/workflows/build-ubuntu-iceoryx.yml . I will also try building from source if necessary. Noted on your comment about ease of use. Do you have a minimal working example of how the iceoryx version would work? I am interested in the performance, and I do see in the benchmark section that the iceoryx-enabled version would give some additional performance boost. I would like to know whether the extra complexity is worthwhile.
Great to know about the planned improvement in eCAL 5.11. Does this mean eCAL is aiming to reach the same performance (latency) level that iceoryx currently has?

For the performance measurement, yes, I will definitely do one. I am currently trying out the combination of eCAL + CapnProto. I would like to compare it with the ROS1 implementation we had, to see if there is a considerable improvement. Any suggestions on how you would like the performance to be measured? I would guess it would be the average send-receive latency, and perhaps CPU utilisation?
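
As a rough sketch of what such a latency measurement could report, assuming the publisher stamps each message with a send time and the subscriber records the receive time from the same clock on the same machine: the function name `summarize_latency` and the choice of statistics are my own assumptions, not something eCAL prescribes.

```python
import statistics


def summarize_latency(samples_ns):
    """Summarize send-receive latency.

    samples_ns: iterable of (send_ns, recv_ns) pairs, both taken from the
    same clock (assumption: publisher and subscriber run on one host).
    Returns the statistics in microseconds.
    """
    latencies_us = sorted((recv - send) / 1000.0 for send, recv in samples_ns)
    if not latencies_us:
        raise ValueError("no samples")
    # Nearest-rank style index for the 99th percentile, clamped to the end.
    p99_index = min(len(latencies_us) - 1, int(0.99 * len(latencies_us)))
    return {
        "count": len(latencies_us),
        "mean_us": statistics.fmean(latencies_us),
        "median_us": statistics.median(latencies_us),
        "p99_us": latencies_us[p99_index],
        "max_us": latencies_us[-1],
    }


# Hypothetical timestamps, only to show the call shape.
summary = summarize_latency([(0, 1000), (0, 2000), (0, 3000)])
```

Reporting the median and a high percentile next to the average seems useful here, since a few stalled messages can hide behind a good mean.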

If I understand correctly, the non-zero-copy version of eCAL with SHM transport would still be a lot better than ROS1, which uses TCP socket communication, because it saves at least two copies (user-to-kernel and then kernel-to-user, reference). With a proper choice of serialisation method (e.g. CapnProto) we could also save further copies during serialisation, at least at decoding time.
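
To put a number on what two saved copies could mean, here is a back-of-the-envelope sketch. The image size and frame rate are assumptions picked for illustration (1920x1080, 3 bytes per pixel, 30 Hz); they are not measurements from my setup, and the helper name is hypothetical.

```python
def copy_traffic_bytes_per_s(payload_bytes, rate_hz, copies):
    """Memory traffic caused purely by copying each message `copies` times."""
    return payload_bytes * rate_hz * copies


# Assumed payload: one uncompressed 1920x1080 RGB frame at 30 Hz.
frame_bytes = 1920 * 1080 * 3
saved = copy_traffic_bytes_per_s(frame_bytes, rate_hz=30, copies=2)
saved_mb_per_s = saved / 1e6
```

Under these assumptions, the two socket copies alone account for roughly 373 MB/s of copying per subscriber connection, which is the part SHM transport would avoid.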

Thanks!

KerstinKeller commented 2 years ago

From my point of view, I would try to get started first (e.g. with the PPA eCAL version) for basic benchmarking. If your setup gets more complex, at some point you might have to compile eCAL yourself (at least the eCAL runtime) to match your dependency stack.

To be honest, the iceoryx implementation in eCAL has been done and benchmarked, but in production scenarios we always work with eCAL's native SHM implementation. We do transfer large amounts of data, but we seldom observe that eCAL is the performance bottleneck.

I think you will see massive improvements over ROS1 / TCP, just with regular eCAL SHM. The rest will depend highly on your setup:

- Do you have 1:1 or 1:N scenarios (and how big is N)?
- How do you continue to process your data? (E.g. how do you partition your nodes? Do they do heavy computation before sending out data again, or are you chaining very light nodes?)

I am curious to see how eCAL works out for you!

FlorianReimold commented 1 year ago

Closing this as the discussion seems to have ceased.

Clarification on Documentation: iceoryx-enabled pre-built and zero-copy mode #794