Simplify getInputPortSignal and getOutputPortSignal regarding dynamic memory allocations?

traversaro commented 5 years ago

For several reasons, typically involving performance (due to use of system calls and violation of cache locality) but also determinism on some hard real-time OS, a good practice is to avoid, sometimes it is considered good practice to avoid any kind of dynamic memory allocation in code that will be called in "tight loop", for example the code that is called at every step of a control, simulation or optimization algorithm.

Related to that, I notice that the current BlockFactory API induces the users to perform two dynamic memory allocations whenever, from an output method, and input or output signal is accessed. The situation could get worse if eventually tackle https://github.com/robotology/blockfactory/issues/8, as some of the same signal would need to be accessed also from the updateDiscreteState and the BlockFactory equivalent of mdlDerivatives. Note that this two memory allocation are avoided in the "CoderBlockInformation" implementation only because in that case the IO signals are stored internally already as std::shared_ptr<Signal>, but for example are necessary in the SimulinkBlockInformation implementation (and similarly they will probably be present in <something>BlockInformation implementations for which the addresses of IO signals could change at each simulation step).

These two dynamic memory allocations are caused by: 1) Use of pimpl pattern in blockfactory::core::Signal (added in https://github.com/diegoferigo/WB-Toolbox/commit/75c1d36fb2a60e0d721f983ece8c7d0f32761ce0#diff-e32b9f2c904cb22f7a9187e2a76861a3). I guess this change can be relatively safely reverted, if we think it is the case. 2) Use of std::shared_ptr<Signal> to return a signal in getInputPortSignal and getOutputPortSignal , in place of standard copy on the stack, that in most cases will be optimized away due to Return Value Optimization (RVO). This it is a more complicated change, as it was explicitly introduced in https://github.com/robotology/wb-toolbox/pull/126 to ensure that input data is only read and not wrote without introducing a new type ConstSignal or Signal<isConst>.

The change 2) is definitely more complicated (if we want to continue to enfore the const correctnes of input signals), while 1) should be straightforward.

traversaro commented 5 years ago

For solving 2), a good solution may be to wait for when we will adopt C++17, and switch to use std::optional<Signal> and std::optional<const Signal>, to preserve both the const behavior, and the possibility of explicitly returning a null/invalid signal.

diegoferigo commented 5 years ago

Some time ago, the Signal class was rewritten (https://github.com/robotology/wb-toolbox/commit/deed69e93cfafb011678281adaf901d67754cf07) to minimize dynamic allocations. The main trick there was the introduction of a new type of buffer, the CONTIGUOUS_ZEROCOPY, that provided a non-owning access to the buffer.

Today this is the default behavior for both input and output signals, and in Simulink we had to configure input signals as follows:

https://github.com/robotology/blockfactory/blob/39b3d4441957b90e0d2c71e8bde2fe47d0e1b055/sources/Simulink/src/BlockFactory.cpp#L175-L176

otherwise they are treated as NONCONTIGUOUS and a copy of the buffer is performed.

Said this, here are my comments:

Use of pimpl pattern in blockfactory::core::Signal (added in diegoferigo/WB-Toolbox@75c1d36#diff-e32b9f2c904cb22f7a9187e2a76861a3). I guess this change can be relatively safely reverted, if we think it is the case.

In the case of the Simulink engine, this is correct. The Simulink Coder implementation instead does not have this problem since the core::Signal objects are cached. We can either revert the pImpl for the entire core::Signal class, or convert also the Simulink implementation to something similar (the number of signals cannot change during the simulation, so even if the buffer addresses change there should be no problem).

Use of std::shared_ptr to return a signal in getInputPortSignal and getOutputPortSignal , in place of standard copy on the stack, that in most cases will be optimized away due to Return Value Optimization (RVO). This it is a more complicated change, as it was explicitly introduced in robotology/wb-toolbox#126 to ensure that input data is only read and not wrote without introducing a new type ConstSignal or Signal.

Why do you think that there is a copy in this case? I recently forced in #33 to use the move semantic to return input signals, and this was necessary only due to a defect in ISO C++11:

https://github.com/robotology/blockfactory/blob/ee0c696e5abe15676f82e6e72e804046c4d573bc/sources/Simulink/src/SimulinkBlockInformation.cpp#L123

traversaro commented 5 years ago

Why do you think that there is a copy in this case?

I probably did not expressed myself clearly, but I am not sure about what you are referring to. I do not think I ever said there is a copy in this case. I just think that if Signal was a relatively standard class with no pimpl, a simple return will avoid the need of calling make_shared to create an ad hoc shared_ptr just to satisfy the interface.

We can either revert the pImpl for the entire core::Signal class, or convert also the Simulink implementation to something similar (the number of signals cannot change during the simulation, so even if the buffer addresses change there should be no problem).

I strongly suspect that eventually removing the pimpl from core::Signal is the preferable solution, as it reduces the complexity of the Signal class, and do not force to increase every the complexity of each engine BlockInformation's implementation just to avoid memory allocations. The benefits of pimpl in this case (facilitating ABI stability) seems minimal compared with the increased complexity.

diegoferigo commented 5 years ago

I probably did not expressed myself clearly, but I am not sure about what you are referring to. I do not think I ever said there is a copy in this case. I just think that if Signal was a relatively standard class with no pimpl, a simple return will avoid the need of calling make_shared to create an ad hoc shared_ptr just to satisfy the interface.

The benefits of pimpl in this case (facilitating ABI stability) seems minimal compared with the increased complexity.

Sorry, I misunderstood your point if it was on the std::make_unique call. I agree on this, the pimpl does not contain much stuff in this case, it should be a straightforward change.

robotology / blockfactory

Simplify getInputPortSignal and getOutputPortSignal regarding dynamic memory allocations? #27