openucx / ucx

Unified Communication X (mailing list - https://elist.ornl.gov/mailman/listinfo/ucx-group)
http://www.openucx.org

RNDV Transfer from Host Memory to CUDA Managed Memory #9975

Closed J-StrawHat closed 3 months ago

J-StrawHat commented 3 months ago

Describe the bug

Is it possible to use Rendezvous protocol to transfer data from host memory on a node (without GPUs) to CUDA managed memory on another node (with GPUs)?

yosefe commented 3 months ago

@J-StrawHat Such an asymmetric configuration is currently not supported: the client does not support CUDA memory, so it cannot work out the right response to the RTR message. We will aim to improve this in future releases.

J-StrawHat commented 3 months ago

Thank you for your response. Additionally, I would like to ask about best practices for handling this asymmetric configuration (host memory -> CUDA managed memory) in the current release. I tested the Stream API and it seems to support this configuration. Any further recommendations or insights would be greatly appreciated.

yosefe commented 3 months ago

I'd suggest trying to set UCX_RNDV_SCHEME=get_zcopy or UCX_RNDV_THRESH=inf

J-StrawHat commented 3 months ago

Thank you so much

yosefe commented 3 months ago

@J-StrawHat Just to clarify, did either of the suggestions help, and if so, which one?

J-StrawHat commented 3 months ago

After conducting several tests, I found that setting UCX_RNDV_SCHEME=get_zcopy still results in the same error. However, setting UCX_RNDV_THRESH=inf allows the program to run correctly. Additionally, compared to the Stream API, the UCX_RNDV_THRESH=inf setting shows lower latency for large data transfers.
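
For anyone landing here later, a minimal sketch of applying the setting from a launcher script. It only prepares the environment and starts the application; it does not call UCX itself. The two variable names and values come from this thread. The helper names, the dictionary keys, and the command passed on the command line are hypothetical. The thread does not say whether the variable has to be set on one side or both, so this assumes it is set for whichever process the script launches. As far as I understand, UCX_RNDV_THRESH=inf raises the rendezvous threshold so the rendezvous protocol is never selected.

```python
import os
import subprocess
import sys

# Settings suggested in this thread. Only UCX_RNDV_THRESH=inf was reported
# to work for host memory -> CUDA managed memory. The dictionary keys are
# hypothetical labels, not UCX names.
RNDV_WORKAROUNDS = {
    "disable_rndv": {"UCX_RNDV_THRESH": "inf"},
    "get_zcopy": {"UCX_RNDV_SCHEME": "get_zcopy"},
}


def build_ucx_env(workaround="disable_rndv", base_env=None):
    """Return a copy of the environment with the chosen UCX setting applied."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(RNDV_WORKAROUNDS[workaround])
    return env


def run_with_workaround(command, workaround="disable_rndv"):
    """Launch `command` (a list of arguments) with the setting in its environment."""
    return subprocess.run(command, env=build_ucx_env(workaround), check=False)


if __name__ == "__main__":
    # Hypothetical usage: python run_ucx.py ./my_ucx_app <app arguments>
    if len(sys.argv) > 1:
        sys.exit(run_with_workaround(sys.argv[1:]).returncode)
```

The same effect can be had by exporting the variable in the shell before starting the application; the script form is just convenient when switching between the two settings during testing.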