Test that a packet allocated from read_copy_pkt_pool is 128-byte aligned.
Test that when in-order aligned send/recv is used, the copy method is always RDMA read.
Test that the data sent by the runting read protocol always has a size that is a multiple of 128. NCCL always sends a message whose size is a multiple of 128 for the LL128 protocol, but runting read sends only a segment of the whole message via ibv_send. The test makes sure that the size of each such segment is also a multiple of 128.
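
The two 128-byte properties checked above can be illustrated with a minimal standalone sketch. It does not use the provider's code: every name in it (SKETCH_ALIGNMENT, sketch_is_aligned, sketch_align_down, sketch_self_check) is hypothetical, and rounding the segment size down with a bit mask is only one assumed way to keep a segment a multiple of 128, not necessarily how the runting read protocol picks its segment size.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical constant for this sketch: the alignment the tests check. */
#define SKETCH_ALIGNMENT 128

/* Return 1 if ptr sits on a 128-byte boundary, 0 otherwise. */
static int sketch_is_aligned(const void *ptr)
{
	return ((uintptr_t)ptr % SKETCH_ALIGNMENT) == 0;
}

/*
 * Round size down to the nearest multiple of 128.
 * Assumption: truncating is acceptable because the remainder of the
 * message is delivered by another mechanism.
 */
static size_t sketch_align_down(size_t size)
{
	return size & ~((size_t)SKETCH_ALIGNMENT - 1);
}

/*
 * Allocate an aligned buffer with C11 aligned_alloc and verify both
 * properties. Returns 0 on success, -1 on failure.
 */
static int sketch_self_check(void)
{
	/* aligned_alloc requires the size to be a multiple of the alignment. */
	void *pkt = aligned_alloc(SKETCH_ALIGNMENT, 2 * SKETCH_ALIGNMENT);
	int ret = 0;

	if (!pkt)
		return -1;

	if (!sketch_is_aligned(pkt))
		ret = -1;

	/* A 1000-byte candidate segment is trimmed to a multiple of 128. */
	if (sketch_align_down(1000) % SKETCH_ALIGNMENT != 0)
		ret = -1;

	free(pkt);
	return ret;
}
```

A unit test in this style would assert sketch_is_aligned() on the packet buffer and assert that the segment length modulo 128 is zero.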
Add the following unit tests: