eclipse-iceoryx / iceoryx

Eclipse iceoryx™ - true zero-copy inter-process-communication
https://iceoryx.io
Apache License 2.0

1st packet (Sending: 0) missed at the receiver end in icedelivery and singleprocess example. #301

Closed Indra5196 closed 3 years ago

Indra5196 commented 3 years ago

Required information

Operating system: Ubuntu 18.04.5 LTS

Compiler version: GCC 7.4.0

Observed result or behaviour: I created a debug build of iceoryx using CMake 3.12.1. I ran the icedelivery example, starting the subscriber first and then the publisher. I noticed that 0 is not received at the receiver: it starts receiving from 1. This was the case with both the simple and the bare_metal example. I found the same issue when running the single_process example.

Then I tried running the process under GDB, with one small change to make debugging easier: I commented out line 104 of /iceoryx/iceoryx_examples/singleprocess/single_process.cpp, so that the keepRunning flag is always true and the threads keep running. Stepping through the code, I found that I received 0 this time.

I then ran the example with my change, but without GDB. This time I did NOT receive 0.

budrus commented 3 years ago

Thanks for diving into iceoryx @Indra5196.

The answer is quite simple. There is a discovery loop in the background that connects publishers and subscribers, and it can take up to 100 ms before a connection is established. Maybe we will change this to something more event-driven one day, but that's how it currently is. So you have a race between offer() and send() on the publisher side and the discovery loop. If you send the first sample before being connected, you will lose it. @evshary already stumbled upon the same issue in #62.
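The race described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyPublisher`, `ToySubscriber`, `runDiscovery` and `simulate` are hypothetical names invented for this illustration and are not part of the iceoryx API. The discovery loop is modelled as a single explicit call instead of a background thread, so that the outcome is deterministic.

```cpp
#include <cassert>
#include <vector>

// Toy model only: none of these types belong to the iceoryx API.
struct ToySubscriber {
    std::vector<int> received;
};

class ToyPublisher {
  public:
    void offer() { offered_ = true; }

    // Stands in for one discovery cycle: a pending subscriber is connected
    // only if the publisher has already offered.
    void runDiscovery(ToySubscriber& pending) {
        if (offered_) {
            connected_.push_back(&pending);
        }
    }

    // A sample reaches only the subscribers connected at the time of the call.
    void send(int value) {
        for (auto* subscriber : connected_) {
            subscriber->received.push_back(value);
        }
    }

  private:
    bool offered_{false};
    std::vector<ToySubscriber*> connected_;
};

// Sends the samples 0..count-1. The single discovery cycle runs immediately
// before the sample with index discoveryBeforeSample is sent.
std::vector<int> simulate(int count, int discoveryBeforeSample) {
    ToySubscriber subscriber;
    ToyPublisher publisher;
    publisher.offer();
    for (int i = 0; i < count; ++i) {
        if (i == discoveryBeforeSample) {
            publisher.runDiscovery(subscriber);
        }
        publisher.send(i);
    }
    return subscriber.received;
}
```

In this model, `simulate(3, 1)` yields the samples 1 and 2, which mirrors the reported behaviour: sample 0 is sent before the discovery cycle has connected the subscriber. With `simulate(3, 0)` the discovery cycle wins the race and all three samples arrive, which is roughly what slow single-stepping in GDB would produce.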

There are various ways to solve this.

Maybe we should think about ensuring that pending subscriptions are active once we return from the publisher's offer call, as you were not the first one to be confused by this.
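Until something like that exists, one possible application-side workaround is to poll for a connection before sending the first sample. The helper below is a minimal sketch, not an iceoryx facility: `waitUntil` is a hypothetical name, and the predicate passed to it is assumed to wrap whatever "is a subscriber connected?" query the publisher API in use provides, if it provides one at all.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper: polls ready() until it returns true or the timeout
// expires. Returns true if ready() succeeded, false on timeout.
template <typename Predicate>
bool waitUntil(Predicate ready,
               std::chrono::milliseconds timeout,
               std::chrono::milliseconds pollInterval = std::chrono::milliseconds(10)) {
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (!ready()) {
        if (std::chrono::steady_clock::now() >= deadline) {
            return false; // nobody connected in time; the caller decides what to do
        }
        std::this_thread::sleep_for(pollInterval);
    }
    return true;
}
```

A publisher could call this once after offer() with a timeout somewhat above the discovery period, and only then enter its send loop. Compared with a fixed sleep, this returns as soon as the connection is there, and it reports a timeout instead of silently sending into the void.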

orecham commented 3 years ago

@surendra210 I think this explains the issue you are seeing too.

@dkroenke This is what we spoke about the other day.

surendra210 commented 3 years ago

@ithier Yes, this is the same behaviour I am observing.

surendra210 commented 3 years ago

@budrus One observation from my side: in my case, an iceoryx publisher and subscriber exchange data over the network via an ice2dds-like gateway. Here, a delay between offer (outside the while loop) and allocate (inside the while loop) does not help. Only a delay between the first allocate (after entering while(1)) and send keeps the first packet from being lost at the subscriber side, and even that needs a delay of at least 1000 ms. Nevertheless, I can try once again with different delays between offer and allocate.

budrus commented 3 years ago

@surendra210 I'm not sure if I get this. How long a sample was allocated before it is sent has no influence on the subscriber. When send() is called, the subscriber has to be in the container on the publisher side. The entry is made in the next discovery loop after an offer() on the publisher side and a subscribe() on the subscriber side. That should be the whole story....

surendra210 commented 3 years ago

@budrus The problem in my case is a configuration setting in the gateway: its discovery loop period is 1000 ms by default, which is why I do not get the first packet on the remote node. Because of this 1000 ms discovery loop, the connection between the publisher and the subscriber in the gateway is delayed, and the data is eventually lost. I can correct the discovery loop configuration and live with a delay of 200 ms, as you suggested, for now. Thanks for the prompt responses. FYI @ithier

orecham commented 3 years ago

This could be addressed using the history capabilities provided by the new building blocks API.
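As a general pattern, a history lets a late-joining subscriber receive the most recent samples that were sent before it was connected. The sketch below only illustrates that idea. `HistoryPublisher`, `LateSubscriber`, `connect` and `lateJoin` are hypothetical names, and nothing here is taken from the building blocks API itself.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Toy model only: not the iceoryx API.
struct LateSubscriber {
    std::vector<int> received;
};

class HistoryPublisher {
  public:
    explicit HistoryPublisher(std::size_t historyCapacity) : capacity_(historyCapacity) {}

    // Delivers to connected subscribers and remembers the last capacity_ samples.
    void send(int value) {
        for (auto* subscriber : connected_) {
            subscriber->received.push_back(value);
        }
        if (capacity_ == 0) {
            return;
        }
        if (history_.size() == capacity_) {
            history_.pop_front(); // drop the oldest sample
        }
        history_.push_back(value);
    }

    // Replays up to historyRequest of the newest stored samples, oldest first,
    // then connects the subscriber for live delivery.
    void connect(LateSubscriber& subscriber, std::size_t historyRequest) {
        const std::size_t n = std::min(historyRequest, history_.size());
        for (std::size_t i = history_.size() - n; i < history_.size(); ++i) {
            subscriber.received.push_back(history_[i]);
        }
        connected_.push_back(&subscriber);
    }

  private:
    std::size_t capacity_;
    std::deque<int> history_;
    std::vector<LateSubscriber*> connected_;
};

// Sends 0 and 1 before the subscriber connects, then 2 afterwards.
std::vector<int> lateJoin(std::size_t capacity, std::size_t request) {
    HistoryPublisher publisher(capacity);
    LateSubscriber subscriber;
    publisher.send(0);
    publisher.send(1);
    publisher.connect(subscriber, request);
    publisher.send(2);
    return subscriber.received;
}
```

With a capacity of 2 and a request of 2, `lateJoin` returns 0, 1, 2, so the samples sent before the connection are no longer lost. With a capacity of 0 it returns only 2, which corresponds to the behaviour reported in this issue.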