[BUG] Some data is lost during transmission.

XLzed commented 1 year ago

Describe the bug

Some data is lost during transmission. This causes exceptions in gRPC's HTTP/2 deframing, and the netty benchmark example hangs because it keeps waiting for the remaining data.

Steps to Reproduce

grpc command: ./build/example/install/hadronio/bin/hadronio grpc benchmark -m 10000 -rs 10000 -as 10000 -r 0.0.0.0
netty command: ./build/example/install/hadronio/bin/hadronio netty benchmark throughput -s -l 100000 -m 1000

Additional info

grpc exceptions
- Stream x does not exist
- Frame of type 0 must be associated with a stream.
- INTERNAL: Encountered end-of-stream mid-frame
- Frame length: x exceeds maximum: y
netty benchmark throughput hangs

fruhland commented 1 year ago

Can you please provide some information on your test system? Especially, which type of network interconnect are you using (Ethernet, InfiniBand, etc.)? The only error I recognize is "Stream x does not exist" from gRPC, but for me, it only occurs on a specific system and the benchmarks work fine on other systems.

XLzed commented 1 year ago

I tested it locally, and the machine has no RDMA device, so the examples run over TCP only (I also set UCX_TLS=tcp).

System Info

Linux version 4.19.95-17 (root@runner-857a6918-project-16016-concurrent-0) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC))
openjdk 11.0.16 2022-07-19
OpenJDK Runtime Environment (build 11.0.16+8-post-Ubuntu-0ubuntu118.04)
OpenJDK 64-Bit Server VM (build 11.0.16+8-post-Ubuntu-0ubuntu118.04, mixed mode, sharing)
UCX version: 1.13.1
ucx_info: ucx_info.log

Sequence Number Test

I also added a sequence number (seqNumber) to the head of each message for debugging, and found that some messages are lost or not retrieved correctly. Example log line: [WRN][HadronioSocketChannel] recv sequence number error, required [159], but get [290]

command: ./build/example/install/hadronio/bin/hadronio netty benchmark throughput -s -l 1000 -m 100000 (logs: client.log, server.log)
command: ./build/example/install/hadronio/bin/hadronio grpc benchmark -m 100 -rs 10000 -as 10000 -s (logs: grpc-client.log, grpc-server.log)
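
For anyone who wants to repeat the sequence number test, here is a minimal sketch of the idea. It is not the actual debugging patch: the class name SequenceCheck and the methods wrap and verify are hypothetical, and the 4-byte header layout is an assumption. Only the log message format is taken from the output above.

```java
import java.nio.ByteBuffer;

/**
 * Hypothetical debugging helper (not part of hadroNIO): prepends a 4-byte
 * sequence number to every message and checks it on the receiving side.
 * For brevity, one object holds both the sender and the receiver counter.
 */
class SequenceCheck {
    private int nextSend = 0;      // number written into the next outgoing message
    private int nextExpected = 0;  // number the receiver expects to see next

    /** Sender side: returns a buffer holding the sequence number followed by the payload. */
    ByteBuffer wrap(byte[] payload) {
        ByteBuffer framed = ByteBuffer.allocate(Integer.BYTES + payload.length);
        framed.putInt(nextSend++);
        framed.put(payload);
        framed.flip();
        return framed;
    }

    /** Receiver side: returns true if the message carries the expected sequence number. */
    boolean verify(ByteBuffer framed) {
        int received = framed.getInt();
        if (received != nextExpected) {
            System.err.printf("recv sequence number error, required [%d], but get [%d]%n",
                    nextExpected, received);
            return false;  // keep waiting for the expected number
        }
        nextExpected++;
        return true;
    }
}
```

A gap between the required and the received number, as in the log line above, means that messages were skipped or arrived out of order somewhere between wrap and verify.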

I also tested between two machines that support RoCEv2, and the exceptions occurred there as well. Some information about the RDMA test environment:

Ethernet controller: Mellanox Technologies MT28850
MLNX_OFED_LINUX-5.4-3.4.0.0
rdma-core v35.4

I can use UCX and ibverbs to communicate directly, so maybe the logic of tag_send/recv or of the RingBuffer causes this problem?

XLzed commented 1 year ago

If I force sendTaggedMessage to be blocking, the examples work fine.

//      final boolean completed = endpoint.sendTaggedMessage(sendBuffer.memoryAddress() + index, messageLength, tag, true, blocking);
        final boolean completed = endpoint.sendTaggedMessage(sendBuffer.memoryAddress() + index, messageLength, tag, true, true);

fruhland commented 1 year ago

Thanks for the detailed report. I will try to reproduce the issue and look into what's going wrong.

XLzed commented 1 year ago

It seems that tag matching operations do not strictly complete in order. Maybe we have to deal with out-of-order completion, or use different UCX semantics? I don't know whether the data is still received in the same order in which the receive buffers were submitted when the operations can't complete in order.

fruhland commented 1 year ago

According to this (https://github.com/openucx/ucx/issues/6370), tag matching messages will be received in order.

If I invoke two ucp_tag_send_nb calls on the same ep one after another, will these two send requests be completed in the order they were invoked? Does it matter whether I use RC or not?

They may be completed in a different order, but will be matched in the same order on the receiver.
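
If send completions can arrive in a different order than the sends were issued, any bookkeeping that frees send-buffer space in completion order could release bytes that an older, still pending send needs. Whether hadroNIO's RingBuffer handling actually behaves this way is not established in this thread. The following is only a sketch of one way to tolerate out-of-order completions, using a hypothetical InOrderReleaseTracker class that is not hadroNIO or UCX API: space is freed only for the contiguous prefix of completed sends.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical tracker (not hadroNIO or UCX API): sends may complete in any
 * order, but buffer space is only released for the oldest contiguous run of
 * completed sends.
 */
class InOrderReleaseTracker {
    private final Map<Long, Integer> lengths = new HashMap<>(); // send id -> bytes still held
    private final Set<Long> completed = new HashSet<>();        // completed but not yet released
    private long nextId = 0;  // id assigned to the next submitted send
    private long oldest = 0;  // id of the oldest send whose bytes are not yet released

    /** Registers a send of the given length and returns its id. */
    long submit(int length) {
        lengths.put(nextId, length);
        return nextId++;
    }

    /** Marks a send as completed and returns how many bytes may be released now. */
    int complete(long id) {
        if (!lengths.containsKey(id)) {
            throw new IllegalArgumentException("unknown or already released send id: " + id);
        }
        completed.add(id);
        int released = 0;
        // Release only while the oldest outstanding send has completed.
        while (completed.remove(oldest)) {
            released += lengths.remove(oldest);
            oldest++;
        }
        return released;
    }
}
```

With three sends of 100, 200 and 300 bytes, completing the second one first releases nothing; completing the first one afterwards releases the space of both.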