Hi Antonis:
enqueue_request will, in the common case, place some or all packets of the request on the wire. If the request window (by default 8 requests) is full, or if the connection is congested, the request will be queued (see https://github.com/erpc-io/eRPC/blob/d35a86dcf92757b77ff187f15f7bf67a4ebc0221/src/rpc_impl/rpc_req.cc#L70) and subsequently dequeued by the event loop.
In the current implementation, the receiver must send a response. The response acts as an implicit ACK so this behavior is difficult to alter.
The current implementation does not allow multiple responses to a request (although this is a useful feature that people have asked for). Each request currently has a unique request number that's matched in the response. If the client receives two responses, it will assume that the second response's packets are duplicates and drop them.
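For reference, a minimal sketch of this client-side pattern, modeled loosely on eRPC's hello_world example (kReqType, the continuation signature, and the other names here are illustrative assumptions and may differ slightly across eRPC versions):

#include "rpc.h"

static constexpr uint8_t kReqType = 1;  // illustrative request type

// Continuation: invoked from inside the event loop when the response arrives.
void cont_func(void *, void *) {
  // The resp MsgBuffer passed to enqueue_request() now holds the response.
}

void send_one_request(erpc::Rpc<erpc::CTransport> *rpc, int session_num,
                      erpc::MsgBuffer *req, erpc::MsgBuffer *resp) {
  // Common case: the request's packets go out on the wire right away.
  // If the request window (default 8) is full or the session is congested,
  // the request is queued and later dequeued by the event loop.
  rpc->enqueue_request(session_num, kReqType, req, resp, cont_func, nullptr);

  // Run the event loop so the client can receive the response (which also
  // acts as the implicit ACK for the request).
  rpc->run_event_loop(200);  // milliseconds
}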
Hi Dr Kalia,
Thanks for the response --I appreciate it.
Best wishes and kind regards, Antonis
Hi Dr Kalia,
I have a few more questions related to this issue:
Regarding your earlier comment ("In the current implementation, the receiver must send a response. The response acts as an implicit ACK so this behavior is difficult to alter."): if a client enqueues two requests, shouldn't we expect two responses?
Please feel free to correct me if I have misunderstood something.
Thank you.
It might work to use the Rpc object from different threads. But it also might not, since the code unfortunately uses some thread-local variables that can cause issues. Another approach could be to call run_event_loop_once(), which just runs one iteration of the event loop and returns immediately if there's no Rx/Tx work to be done. run_event_loop_once() might be an option here. Request handlers can also run in background threads managed by the Nexus object. Please see https://github.com/erpc-io/eRPC/blob/d35a86dcf92757b77ff187f15f7bf67a4ebc0221/apps/masstree_analytics/masstree_analytics.cc#L409 for an example.
Hello Dr Kalia.
Case A: Using the code below, I receive two responses:
enqueue_request();
cl_cntx->rpc_->run_event_loop(200);
enqueue_request();
cl_cntx->rpc_->run_event_loop(200);
Case B: Using the code below, I receive only one response:
enqueue_request();
enqueue_request();
cl_cntx->rpc_->run_event_loop(200);
Am I missing something? For Case B, I also tried changing the event loop duration from 200ms to 500ms (and 1000ms), but with no luck so far when it comes to the number of responses received.
** The enqueue_request() input parameters are passed as in https://github.com/erpc-io/eRPC/blob/d35a86dcf92757b77ff187f15f7bf67a4ebc0221/apps/latency/latency.cc#L113.
Just to make sure I understand --are you suggesting the following (or something similar)?
a) Enqueue a new eRPC request
b) Call run_event_loop_once() to send the request
c) Create/allow another thread to do some other work in the local node (in the background)
d) Then call run_event_loop(200) to busy-wait for the response
Would the above work using eRPC?
I checked the code of run_event_loop_once(). Assuming someone needs a method that returns immediately after a response is received (and handled): do you think it would be okay, eRPC-wise, to modify the code so that the event loop (run_event_loop(X), as opposed to run_event_loop_once()) exits when process_comps_st() (https://github.com/erpc-io/eRPC/blob/d35a86dcf92757b77ff187f15f7bf67a4ebc0221/src/rpc_impl/rpc_rx.cc#L6) finishes, instead of busy-waiting until the end of the X milliseconds?
Thank you for pointing out this example.
Thanks.
Both ways should result in two responses, else there's a bug somewhere. To debug this, you can try rebuilding eRPC with cmake . -DLOG_LEVEL=trace. Then run the application and paste the contents of /tmp/erpc_trace*.
"Assuming someone needs a method that returns immediately after a response is received (and handled):" The way I do this is to repeatedly call run_event_loop_once()
, and set a flag in the client's continuation function that causes the calls to run_event_loop_once()
to stop.
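For reference, a rough sketch of that flag-based pattern (ClientContext and the other names here are illustrative, not part of eRPC; the context pointer is whatever was passed to the Rpc constructor, and exact signatures may differ across eRPC versions):

#include "rpc.h"

// Illustrative per-client state; this struct is not part of eRPC.
struct ClientContext {
  erpc::Rpc<erpc::CTransport> *rpc = nullptr;
  bool resp_received = false;
};

// Runs inside the event loop when the response arrives.
void cont_func(void *context, void * /*tag*/) {
  static_cast<ClientContext *>(context)->resp_received = true;
}

void send_and_wait(ClientContext *c, int session_num, uint8_t req_type,
                   erpc::MsgBuffer *req, erpc::MsgBuffer *resp) {
  c->resp_received = false;
  c->rpc->enqueue_request(session_num, req_type, req, resp, cont_func, nullptr);

  // Poll one event-loop iteration at a time; each call returns immediately
  // if there is no Rx/Tx work, so we stop as soon as the continuation fires.
  while (!c->resp_received) c->rpc->run_event_loop_once();
}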
Hello.
Regarding point 1:
Both ways should result in two responses, else there's a bug somewhere. To debug this, you can try rebuilding eRPC with cmake . -DLOG_LEVEL=trace. Then run the application and paste the contents of /tmp/erpc_trace*.
I would like to first share some information about the platform:
InfiniBand
Mellanox ConnectX-4
MLNX_OFED_LINUX-5.4-1.0.3.0
Ubuntu 4.15.0-154
(If more details are needed, please let me know.)
Please find below the contents of /tmp/erpc_trace* (I have replaced the server name with 'server'). I ran an application in which node1 sends two (2) requests, as mentioned above, which are received by node2, but node1 receives only one response. I provide the trace generated by node1 (if needed I can do the same for node2, but it might be a bit more complicated because at the same time node2 also sends two requests to node1 --not shown below).
36:517538 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 8, pktn 0, msz 8, magic 11]. Slot [num_tx 0, num_rx 0]. 36:517596 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 0, num_rx 0]. 36:517712 TRACE: Rpc 1, lsn 0 ('server'): RX [type RESP, dsn 0, reqn 8, pktn 0, msz 1112, magic 11]. 36:522609 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:522619 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:527608 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:527614 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:532608 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:532613 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:537608 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:537613 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:542608 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:542613 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:547608 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:547613 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:552609 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:552613 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:557609 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:557614 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:562609 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:562614 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:567609 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:567614 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:572609 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:572614 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:577609 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:577614 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:582609 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 
36:582614 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:587609 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:587614 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:592609 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:592615 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:597610 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:597616 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:602610 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:602615 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:607610 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:607615 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:612610 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:612615 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:617610 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:617615 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:622610 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:622615 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:627610 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:627615 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:632610 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:632615 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:637610 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:637615 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:642610 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:642615 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:647611 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:647616 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:652611 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 
36:652616 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:657611 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:657619 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:662611 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:662616 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:667611 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:667616 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:672611 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:672616 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:677611 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:677618 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:682611 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:682618 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:687611 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:687616 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:692611 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:692616 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:697612 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:697616 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:702612 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:702617 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:707612 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:707617 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0]. 36:712612 REORD: Rpc 1, lsn 0 ('server'): Pkt loss suspected for req 9 ([num_tx 1, num_rx 0]). Action: Retransmitting requests. 36:712617 TRACE: Rpc 1, lsn 0 ('server'): TX [type REQ, dsn 1, reqn 9, pktn 0, msz 8, magic 11]. Slot [num_tx 1, num_rx 0].
Regarding point 2:
"Assuming someone needs a method that returns immediately after a response is received (and handled):" The way I do this is to repeatedly call run_event_loop_once(), and set a flag in the client's continuation function that causes the calls to run_event_loop_once() to stop.
That's a good point. In fact, I was already using this approach, but I thought I should ask about modifying run_event_loop() in case it can be done easily, in order to avoid potentially adding different flags for different requests.
Thank you for your help.
Thanks! I'll look into the trace.
I should say that several of eRPC's sample applications issue multiple pending requests before polling for a response (e.g., https://github.com/erpc-io/eRPC/blob/d35a86dcf92757b77ff187f15f7bf67a4ebc0221/apps/large_rpc_tput/large_rpc_tput.cc#L155). It might be useful to try these.
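For reference, a sketch of that pipelined pattern (again with illustrative names; this is not the large_rpc_tput code itself, and exact signatures may differ across eRPC versions):

#include <vector>
#include "rpc.h"

// Illustrative per-client state; not part of eRPC.
struct BatchContext {
  erpc::Rpc<erpc::CTransport> *rpc = nullptr;
  size_t resps_pending = 0;
};

void cont_func(void *context, void * /*tag*/) {
  static_cast<BatchContext *>(context)->resps_pending--;
}

// Enqueue several requests back-to-back, then poll until every
// continuation has fired.
void send_batch(BatchContext *c, int session_num, uint8_t req_type,
                std::vector<erpc::MsgBuffer> &reqs,
                std::vector<erpc::MsgBuffer> &resps) {
  c->resps_pending = reqs.size();
  for (size_t i = 0; i < reqs.size(); i++) {
    c->rpc->enqueue_request(session_num, req_type, &reqs[i], &resps[i],
                            cont_func, nullptr);
  }
  while (c->resps_pending > 0) c->rpc->run_event_loop_once();
}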
Thanks --I will also take a look at the examples. One more thing I have noticed: when enabling traces for debugging, one response always appears to be missed (especially when using 200 as the parameter for run_event_loop --when removing the 'debugging with traces', or increasing the time, e.g., to 1000, it sometimes happens, but not always). I can try increasing the time (>1000) and let you know.
Also (and sorry for going back and forth): I just started experimenting again with enqueue_request, and I was re-reading your answer above:
enqueue_request will, in the common case, place some or all packets of the request on the wire. If the request window (by default 8 requests) is full, or if the connection is congested, the request will be queued (see eRPC/src/rpc_impl/rpc_req.cc, line 70 in d35a86d: if (likely(session->clientinfo.credits_ > 0)) { ) and subsequently dequeued by the event loop.
If I could use an example to understand better: assuming we have a client that sends only one (1) request and there are no pending requests (nor congestion), would what you say mean that calling only enqueue_request is sufficient for the request to be sent to the server --i.e., without calling run_event_loop?
Thank you.
"Assuming we have a client that sends only one (1) request and there are no pending requests (nor congestion), would what you say mean that calling only enqueue_request is sufficient for the request to be sent to the server --i.e., without calling run_event_loop?"
enqueue_request will place the packets on the wire (see https://github.com/erpc-io/eRPC/blob/d35a86dcf92757b77ff187f15f7bf67a4ebc0221/src/rpc_impl/rpc_req.cc#L71). The packets will likely reach the server (if they're not lost). The client will need to run the event loop to receive the response.
Thanks. I tried the scenario I described above, but without running the event loop the request would not reach the remote server --I will try again, checking whether it has to do with the number of packets sent. In the scenario where they are lost, would running the event loop help with recovery/retransmission?
Regarding the issue above with the lost responses: I tried increasing the timeout (even beyond 2000ms), but the issue still sometimes occurs, though not always. Also, I have noticed that this problem always occurs when I have the trace enabled.
Another question that came up these days: is there a way for a server to send a response to a client's eRPC request, and then send an eRPC request back to the client from the same registered function? Is there an example of that? I checked the apps folder, but I did not manage to find one. (Update: I thought this question might require a different issue topic-wise, so I added it here: #79)
Hello Dr @anujkaliaiitd .
Did you maybe have a chance to take a look at the trace?
I still have the problem where I enqueue two (2) requests but receive only one (1) response after running the event loop.
Additionally, I have noticed that if I do the following:
then sometimes request A is sent twice (and gets only one response). Is this expected, and if so how could it be avoided?
I would appreciate any feedback on how I could debug this and if you think it is an issue in my code, or if there is anything that I could try to fix these issues.
Thanks.
P.S. Both issues appear more easily (i.e., a small number of requests is sufficient, e.g., 2) in a multi-threaded environment (i.e., 2 client and 2 server threads for the same nexus per node).
I do not know if it is related to the second problem (duplicate) I mentioned above, but here is another example:
3 nodes (1 client thread each) send 1 eRPC request to each other --so each node sends 2 eRPC requests, in the way described above (enqueue_request(), run_event_loop_once(), enqueue_request(), run_event_loop_once()).
Node 1: receives Node 2's request twice, but responds both to Node 2 and Node 3 (I don't understand how this can happen)
Node 2: receives Node 1's and Node 3's requests, and responds both to Node 1 and Node 3 (normal)
Node 3: receives Node 1's and Node 2's requests, and responds both to Node 1 and Node 2 (normal)
Hi. I think it'll be best if you can create minimal examples (similar to the hello_world application) that reproduce these issues. The communication patterns you've described have been used successfully in various eRPC applications (see the apps folder), so your issues indicate either a bug in eRPC, or a bug in the application logic.
Hello Dr Kalia.
Thanks for the response.
My understanding, and please correct me if I am wrong, is that none of the applications in the apps folder have "server" threads that send their own eRPC requests (only "client" threads do that) --that is one key difference compared to my application.
Could it be that when both a client and a server thread of a node concurrently process eRPC requests (e.g., enqueue), this might lead to unpredictable results? E.g., one unpredictable outcome would be a client thread dequeuing (= receiving) the same eRPC request twice (which is what I sometimes see)?
There are a few applications that do this in eRPC. For example, the smr Raft app: a server thread gets requests from clients, and the same server thread also issues requests to backups.
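For reference, a rough sketch of a thread that both serves requests and issues its own (this is not the smr app's actual code; ServerContext, kRespSize, and the handler/field names follow the hello_world example and may differ across eRPC versions):

#include "rpc.h"

// Illustrative per-thread state; not part of eRPC.
struct ServerContext {
  erpc::Rpc<erpc::CTransport> *rpc = nullptr;
  bool stop = false;
};

static constexpr size_t kRespSize = 32;  // illustrative response size

// Request handler: runs in this thread's event loop (foreground mode).
void req_handler(erpc::ReqHandle *req_handle, void *context) {
  auto *c = static_cast<ServerContext *>(context);
  auto &resp = req_handle->pre_resp_msgbuf;  // field name may vary by version
  c->rpc->resize_msg_buffer(&resp, kRespSize);
  // ... fill in the response payload ...
  c->rpc->enqueue_response(req_handle, &resp);
  // The same thread may also call enqueue_request() here (or elsewhere) to
  // act as a client toward other nodes; its own event loop completes them.
}

void server_thread_loop(ServerContext *c) {
  while (!c->stop) c->rpc->run_event_loop_once();  // serves and drives requests
}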
Hello.
Thanks for the pointers.
I reviewed them and I did not see any significant differences compared to my eRPC calls.
Increasing the timeout seemed sufficient to get "simple" eRPC requests sent and responded to on all 3 nodes (before that I was using run_event_loop_once, or run_event_loop with a smaller timeout --when working with only 2 nodes there was no such issue).
However, with more sophisticated registered functions, which for example busy-wait for a while or use atomic operations, the eRPC code occasionally seg-faults at https://github.com/erpc-io/eRPC/blob/d35a86dcf92757b77ff187f15f7bf67a4ebc0221/src/nexus_impl/nexus_bg_thread.cc#L32 . I have checked, and this issue is not related to the number of background threads being "exhausted" because of busy waiting.
I am using 3 nodes, with 3 background threads on each node. Each node sends 50 eRPC requests to the two other nodes. The problem I mention above occurs occasionally, and when it does, it is typically on one node (e.g., node 3) when sending eRPC requests (after having successfully sent a few) or responses.
Could you please share your thoughts on why there could sometimes be a seg fault at https://github.com/erpc-io/eRPC/blob/d35a86dcf92757b77ff187f15f7bf67a4ebc0221/src/nexus_impl/nexus_bg_thread.cc#L32 ?
Thanks.
Hello,
First of all, thank you for providing the source code for eRPC as well as maintaining it.
I have a few questions regarding eRPC that I could not find in other issues (hopefully I haven't missed anything).
Thank you.