eclipse-iceoryx / iceoryx

Eclipse iceoryx™ - true zero-copy inter-process-communication
https://iceoryx.io
Apache License 2.0
1.57k stars 373 forks source link

zero runtime memory allocation application #2040

Open lucabart97 opened 9 months ago

lucabart97 commented 9 months ago

Required information

Operating system: Ubuntu 20.04.6 LTS

Compiler version: g++ (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0

Eclipse iceoryx version: v2.0.3 (Blueberry)

Heaptrack version: heaptrack 1.4.80

Description

I have an application in C++ that uses Iceoryx that needs to be zero runtime memory allocation. Checking the memory usage using heaptrack, there seems to be a periodic task inside Iceoryx that allocates a string to send the node's status. There is a way to mitigate it?

Heaptrack flame graph

Selection_002

elfenpiff commented 9 months ago

@lucabart97 Yes, you are right. This is some technical debt from the past and is originating here: iceoryx_dust/include/iceoryx_dust/cxx/serialization.hpp

In the end, all occurrences of std::string has to be replaced with iox::string<1024> (1024 is a lucky guess, but it seems to fit our use case). The easiest approach would be to introduce a type alias in the Serialization class called using SerializedString = iox::string<1024>; and do a search and replace.

@lucabart97 since you unraveled the issue, would you like to have the chance to fix it? We would guide you through the pull request process, and we would be happy to have a new iceoryx contributer :smile:

lucabart97 commented 9 months ago

@elfenpiff Thank you for your quick response. I'll try to fix the ""issue"" myself, and I'll make a pull request once I'm done.

elfenpiff commented 9 months ago

@elfenpiff Thank you for your quick response. I'll try to fix the ""issue"" myself, and I'll make a pull request once I'm done.

@lucabart97 Awesome! Please add iox-#2020 as prefix for every commit (required for traceability) and sign the eclipse ECA (https://www.eclipse.org/legal/ECA.php) then this could be "easy" - if I forgot something, and it turns out to be hard, do not hesitate to ask!

elBoberido commented 9 months ago

@lucabart97 I think it won't be that easy to remove the last std::string from the code base. We need some bigger refactoring for the serialization and Unix Domain Socket abstraction to get rid of std::string.

In your specific case this would not be necessary since there is a more elegant solution which also prevents jitter in the latency when the monitoring is turned on. See also https://github.com/eclipse-iceoryx/iceoryx/discussions/1374. I'm talking about https://github.com/eclipse-iceoryx/iceoryx/issues/1361. There are some proposals but they are also quite intrusive. A stop gap solution would be a simple counter or a 64 bit millisecond timestamp in the shared memory which is set by the application and checked by RouDi.

Getting rid of the std::string in the code base would also be nice but requires more work. There is #260 which already open for a long time. Like mentioned above, the serialization, string conversion and the traditional IPC channels need to be ported. For string conversion we could borrow the number to string conversion from the logger and maybe use iox::into<iox::string>(42) and iox::into<iox::optional<uint8_t>>("42") (or introduce a try_from which returns an iox::expected) instead of the current free functions in convert.hpp. For the serialization there is a proof of concept which I created a few years ago https://github.com/eclipse-iceoryx/iceoryx/compare/master...elBoberido:iceoryx:serde. This could be used as inspiration. It also has a simple Buffer type which could be used for the Unix Domain Socket and the other IPC channels.

... and then there is also #1133 :sweat_smile:

Anyway, whatever you choose, looking forward to your contribution :)

lucabart97 commented 9 months ago

Thank you, @elBoberido for the explanation.

I investigated the code more closely to solve my problem. When using the master version instead of v2.0.3, the "issue" is not present by default because the code has been updated to read m_sendKeepalive from the iox-roudi daemon to avoid sending the message because monitor mode is off by default. command: heaptrack ./iceoryx_examples/icehello/iox-cpp-publisher-helloworld

Heaptrack - heaptrack iox-cpp-publisher-helloworld 1442919 gz — Heaptrack GUI_001

However, when enabling monitor mode, the dynamic memory will appear:

Heaptrack - heaptrack iox-cpp-publisher-helloworld 1453686 gz — Heaptrack GUI_002

Heaptrack - heaptrack iox-cpp-publisher-helloworld 1453686 gz — Heaptrack GUI_003

The "issue" in the master version is in the IpcMessage::addEntry(const T& entry) function, caused by the internal std::string inside the std::stringstream container.

https://github.com/eclipse-iceoryx/iceoryx/blob/4fb0b5f95b620a45d68daddf01942cb445214ed4/iceoryx_posh/include/iceoryx_posh/internal/runtime/ipc_message.inl#L27-L42

So, @elfenpiff @elBoberido if we want to solve this issue, the solution seems to substitute the std::stringstream container. What do you think about it?

elfenpiff commented 9 months ago

@lucabart97 @elBoberido

aaah I remember those lines. To replace the stringstream we require a iox::to_string function that converts a wide variety of T into a iox::string.

I think it should be sufficient if this is implemented for all integer, bool, float, double and char types.

When we have this, we could write iox::string<1234> newEntry and newEntry.append(iox::to_string(entry)).

elBoberido commented 9 months ago

@lucabart97 @elfenpiff the IpcMessage is only the first link in a chain. The next link is IpcChannel::send which expects a std::string. The IpcChannel is a typedef for UDS on most platforms and NamedPipe on Windows. These would also need to be changed. Then there is IpcChannel::receive and also RouDi::processMessage which uses cxx::Serialization which uses std::string.

After all of this is done, there is still the issue with the jitter when monitoring is turned on and additionally all the other IPC messages from the runtime need to be ported away from std::string. Which we eventually want anyway but it adds up to the required work.

Saying that I would propose to work from bottom up instead of top down. This has the benefit to have multiple small PRs which are usually easier to review. My proposal is as follows:

Btw. the faster solution would still be to replace the keep alive message with a simple counter/timestamp in the shared memory. But it is more complicated for someone who does not know the code base and unfortunately we are lacking documentation for this part of the code base. If you need this to be fixed soonish I think I could spend a day to fix it in about one or two weeks and you could take care of the std::string replacement in all the places I mentioned above. What do you think?

lucabart97 commented 9 months ago

@elBoberido

I see, the changes to perform the fix are a lot. I believe the problem was only in std::stringbuffer because the std::string in the GNU libstdc++ has a stack size of 15 characters following the SSO-23 standard.

The iceoryx string type does not have iterator support. I see some std functions that use iterators in the code (e.g., std::count_if). What are your guidelines for using iterators? Do you prefer iterators?

Is it safe to use a fixed-size string length inside the Iceoryx API? If someone needs to send more than the fixed-size dimension, what can they do? What do you think about reserving or preallocating the std::string container at initialization time? And using std::string_view when passing data between functions?

If a simple counter/timestamp is enough for you, I think it is a better solution instead of using a string.

elBoberido commented 9 months ago

@lucabart97 right, the small object optimization. I forgot about that. I does not really solve the problem but hides it quite well. Since the runtime name is also added to the string it would not be sufficient to only replace the std::stringstream.

It is fine to use iterators. If some container do not have them it is just because a lack of time no because we do not like to use them. Adding iterators to the iox::string would also be a good first issue. Saying that, the iox::string would be the wrong abstraction to use for the UnixDomainSocket, in the same way std::string is the wrong type. We artificially restrict what can be sent via UnixDomainSocket or at least make it harder to send binary data. Lets assume you have something like 0x42 0x00 0x13 0x37 0x73. If a string is used as data type, the second character will be interpreted as the end of the string and you have to be careful to really get all the data. A proper solution needs to take care of this and provide something that is based on iox::byte (or std::byte) when we switch to C++17.

For the UnixDomainSocket we would need to stay generic over the size. Something like this would work

template <uint64_t N>
expected<void, IpcChannelError> send(const Buffer<N>& buf) const noexcept;

Preallocating the std::string would be just a workaround and would increase the technical dept. We want to get rid of std::string in our API anyway so the preferred solution for the IPC communication with RouDi would be something like I described above. For the specific problem with the heart beat message, it would be better to completely replace the IPC call since this also causes other problems.

I hope I did not completely scare you away. Like I said before, if this is something pressing for you I could take care of refactoring the heart beat within the next one to two weeks. Since the documentation for this part of the code base is quite sparse and it would be faster for me to fix it myself instead of guiding you through all the pitfalls. It would be great though if you have time to work on the proposals to completely get rid of std::string in the RouDi IPC.

lucabart97 commented 8 months ago

@elBoberido, no it is not a pressing thing for me, I just solved using the master version with the monitor disabled.

I want to contribute given that I like the project and I use Iceoryx on my projects. If in the next weeks, I find times (I hope), I'll start from IpcMessage to replace the std::string with iox::vector<uint8_t,X>, where we need to decide the predefined vector capacity. Keep in mind that some interfaces, for example UnixDomainSocketalready have his predefined max length platform::IOX_UDS_SOCKET_MAX_MESSAGE_SIZE, so we need to decide if we want a global buffer dimension, or perform a copy between buffers with different capacity.

elBoberido commented 8 months ago

@lucabart97 nice to hear :)

If your time is limited I would suggest not to start with the IpcMessage since you would need to immediately change everything else in the same PR since the whole chain down to UnixDomainSocket depends on the std::string. Alternatively, conversions to std::string would be required at many places. Worst case would then be that you don't have time to continue on the refactoring and a half finished state would potentially remain in the code base. We already have a few of them.

Going bottom up would not add technical dept during the refactoring, at least not as much. Having an UnixDomainSocket with an additional API to the std::string one, has benefits by its own, even when the work would be stopped right after that.

The vector capacity does also not need to be predefined since iox::vector<uint8_t, X> would be covariant to iox::vector<uint8_t, platform::IOX_UDS_SOCKET_MAX_MESSAGE_SIZE> for N < platform::IOX_UDS_SOCKET_MAX_MESSAGE_SIZE. This could be checked by a static_assert like in the following example.

template <uint64_t N>
expected<void, IpcChannelError> send(const iox::vector<uint8_t, N>& buf) const noexcept {
    static_assert(N <= platform::IOX_UDS_SOCKET_MAX_MESSAGE_SIZE, "Size exceeds transmission limit!");
}

Using a iox::vector is also not the ideal solution since the iox::vector misses some useful API for this task, e.g. appending a range of data to the vector when data is received. With C++23 the std::vector got an append_range method. Something like that would also be helpful for the iox::vector.

Thinking further about this, it might easily get out of hands. Maybe we need more useful intermediate steps. Eventually we want to serialize to binary instead of ASCII but this is out of scope for now. This means we could still use iox::string as stop gap solution. This has the benefit to already having an API with append. The current serialization probably also works better with iox::string.

On the UnixDomainSocket we would have an API similar to this

template <template <uint64_t> class C, uint64_t N>
expected<void, IpcChannelError> send(const C<N>& buf) const noexcept;

template <template <uint64_t> class C, uint64_t N>
expected<C<N>, IpcChannelError> receive(iox::units::Duration timeout) const noexcept;

// alternatively, although I don't like out-parameters, in this case it might make sense
template <template <uint64_t> class C, uint64_t N>
expected<void, IpcChannelError> receive(C<N>& buf, iox::units::Duration timeout) const noexcept;

There would be a specialization for iox::string in unix_domain_socket.cpp and it would be used like

iox::string<100> sendData {"test"};
iox::posix::UnixDomainSocket socket;
socket.send(sendData);
auto rec = socket.receive<iox::string, 100>(200_ms);
// or with the alternative API
iox::string<100> receiveData;
socket.receive(receiveData, 200_ms);

Independent of the changes required to remove std::string in all the other places, one could then add support for iox::vector. The user can then decide which API to use, the one for ASCII or the one for binary data.

What do you think?

cc @elfenpiff

elBoberido commented 8 months ago

@lucabart97 once #2056 is merged, the applications should not allocate memory at runtime anymore.

Would be great if you have time for the stuff I mentioned above :)

lucabart97 commented 8 months ago

Hi @elBoberido Sorry for the late response, but I'm out for work for 2 weeks. When I'm come back I'll try to do it!

lucabart97 commented 8 months ago

@elBoberido Do you suggest to keep also the old API with std::string or to remove it?

elBoberido commented 8 months ago

@lucabart97 I think once the new API is available and the code is ported, the std::string API can be removed

lucabart97 commented 7 months ago

Hi @elBoberido I tried to do the things that we have decided. Sorry for the late but I'm very busy with my work.

Some comments on what I have done:

Before creating the PR, you could check what I did in this fork because maybe you don't like the new API to the string container or you have some suggestions with ipc_interface_base.

If all is fine, I must understand if you have a script to format/indent the code.

elBoberido commented 7 months ago

@lucabart97 no worries. Take your time. This is supposed to make fun and not to completely exhaust you :)

I'll have a look at it later but AFAIR the iox::string also stores the length of the string internally and adding just a data method without also changing the size might break some invariants. I have an idea how to solve this but don't want to take the fun of discovering it away. If you want, I can tell you though ... and hopefully not forget about it :sweat_smile:

lucabart97 commented 7 months ago

@elBoberido Using the data() method we could avoid a copy, writing directly the incoming buffer from the socket. To avoid issues with the resize, I added autoresize() API to auto-calculate based on the terminator of the new string data.

elBoberido commented 7 months ago

@lucabart97 the general direction looks good. There is the issue with the iox::string I described above so that API needs to change a bit and with that change you probably also need to change the receive(char*, uint64_t, Duration) method a slightly.

I also think that the API I proposed needs changes. I did not look at the current API when proposing the receive API and forgot to add the msgSize parameter. I think your code currently uses the capacity of the string for msgSize but it should be possible to be smaller than the capacity of the string.

The proposal with static_asser(N <= platform::IOX_UDS_SOCKET_MAX_MESSAGE_SIZE) was also not my best one since there is already a runtime check for the actual size of the string and it should be totally fine to use a string with a large capacity but a small string size.

And finally, the proposal of this API was also not good

template <template <uint64_t> class C, uint64_t N>
expected<void, IpcChannelError> send(const C<N>& buf) const noexcept;

Methods cannot be specialized so it cannot be used for the vector implementation and a simple

template <uint64_t N>
expected<void, IpcChannelError> send(const iox::string<N>& buf) const noexcept;

would be sufficient.

I'm also a bit surprised the tests pass but I guess that's by a lucky accident in how the compare method works.

Btw, I love the strings you used for testing ... all glory to the hypnotoad :)

elBoberido commented 7 months ago

@elBoberido Using the data() method we could avoid a copy, writing directly the incoming buffer from the socket. To avoid issues with the resize, I added autoresize() API to auto-calculate based on the terminator of the new string data.

The idea without the copy is good and autoresize() would fix the problem but it is error prone since one must not forget to call it. To make it more robust I would propose to use something like this

unsafe_raw_access([](auto* data) {
    // do stuff with data
});

unsafe_raw_access would then call autoresize after the execution of the lambda. Initially I thought about letting the lambda return the actual size written to data but the idea with autoresize is more robust.

It might make sense to also have a size and capacity as parameter. capacity is required to prevent buffer overflows and it would be convenient and less error prone to c&p errors when it is just a parameter. Similar with size, if one wants to append to data. With this the call would become unsafe_raw_access([](auto* data, auto size, auto capacity){...});

cc @FerdinandSpitzschnueffler what do you think? The string is your domain :). The API is unsafe but with the autoresize proposal of @lucabart97 in combination with the lambda it should not be too error prone.

lucabart97 commented 7 months ago

@elBoberido I suggest moving on with the new string APIs as you suggested, creating tests and a PR, and later I'll create another PR with the additional iox::string APIs for the UnixDomainSocket. What do you think about it?

elBoberido commented 7 months ago

@lucabart97 sounds great

FerdinandSpitzschnueffler commented 7 months ago

@elBoberido Sorry for the late response. I like your idea of unsafe_raw_access calling autoresize as @lucabart97 proposed. @lucabart97 Looking forward to your PR :)

lucabart97 commented 7 months ago

I wrote the new string method, now I'm creating the test. In the documentation I did not find any reference to the ::testing::Test::RecordProperty. Someone could explain to me how to generate it?

elBoberido commented 7 months ago

@lucabart97 it's here https://github.com/eclipse-iceoryx/iceoryx/blob/master/CONTRIBUTING.md#testing

In case your IDE supports generating code snippets from Javascript you can use this code to generate the UUID

function create_UUID() {
    var dt = new Date().getTime();
    var uuid = 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'.replace(/[xy]/g, function(c) {
        var r = (dt + Math.random()*16)%16 | 0;
        dt = Math.floor(dt/16);
        return (c=='x' ? r :(r&0x3|0x8)).toString(16);
    });
    return uuid;
}
lucabart97 commented 7 months ago

Here is my proposal based on your suggestions:

send API:

expected<void, IpcChannelError> send(const std::string& msg) const noexcept;
expected<void, IpcChannelError> timedSend(const std::string& msg, const units::Duration& timeout) const noexcept;
template <uint64_t N>
expected<void, IpcChannelError> send(const iox::string<N>& buf) const noexcept;
template <uint64_t N>
expected<void, IpcChannelError> timedSend(const iox::string<N>& buf, const units::Duration& timeout) const noexcept;
template <uint64_t N>
expected<void, IpcChannelError> send(const iox::vector<std::byte,N>& buf) const noexcept;
template <uint64_t N>
expected<void, IpcChannelError> timedSend(const iox::vector<std::byte,N>& buf, const units::Duration& timeout) const noexcept;

receive API:

expected<std::string, IpcChannelError> receive() const noexcept;
expected<std::string, IpcChannelError> timedReceive(const units::Duration& timeout) const noexcept;
template <uint64_t N>
expected<void, IpcChannelError> receive(iox::string<N>& buf) const noexcept;
template <uint64_t N>
expected<void, IpcChannelError> timedReceive(iox::string<N>& buf, const units::Duration& timeout) const noexcept;
template <uint64_t N>
expected<void, IpcChannelError> receive(iox::vector<std::byte, N>& buf) const noexcept;
template <uint64_t N>
expected<void, IpcChannelError> timedReceive(iox::vector<std::byte, N>& buf, const units::Duration& timeout) const noexcept;

internal API:

expected<void, IpcChannelError> send(const void* msg, uint64_t msgSize, const units::Duration& timeout) const noexcept;
expected<uint64_t, IpcChannelError> receive(void* msg, uint64_t msgSize, const units::Duration& timeout) const noexcept;

I used void to wrap char or std::byte, in this case, the code is less duplicated.

I have another important question, the UnixDomainSocket is made to support only string. There is the NULL_TERMINATOR_SIZE and each packet needs to have the terminator, and it is also sent to the socket.

What do you suggest to do? I can keep this workflow for std::string and iox::string, but the std::byte buffer is not the best choice, right?

elBoberido commented 7 months ago

@lucabart97 I'm currently on travel and will give you some feedback after the weekend :)

elBoberido commented 7 months ago

@lucabart97 the public API looks good. For the internal API it might be possible to achieve something similar with templates but that is an implementation detail and can be decided later. It might be necessary to extend the vector API though to make this possible. Oh, in case you do not yet have that much experience with templates, I could give you some hints when it comes to the implementation :)

Regarding the NULL_TERMINATOR_SIZE. Yes, for std::string and iox::string this can be kept. For std::byte I also share your opinion. This would be counterproductive since it would be used in a binary context. I think we can concentrate on the iox::string implementation for now and add the std::byte later since it won't be used initially. But that's up to you. At least it (most likely) would not result in additional work if it is done later.

You also have to keep in mind that there are three classed which need this adjustments. The UnixDomainSocket, MessageQueue and NamedPipe. All three of them can be used for the IpcChannel typedef and there are tests which run with all of them.

lucabart97 commented 7 months ago

Ok, got it! Maybe is better to start with std::string and iox::string.

elBoberido commented 7 months ago

@lucabart97 yes, divide and conquer makes such changes much more manageable :)