Closed: jeremysalwen closed this issue 3 years ago
Thanks for reporting this. This is a known issue that we've already fixed in https://github.com/pytorch/tensorpipe/commit/0f7673ba421928490deeb35a35a01605d3d3273a. We haven't yet updated TensorPipe's submodule in PyTorch, which is why you're still seeing it in the nightlies. We expect to do so by the end of the week.
Note that a "workaround" would be to update the version of libibverbs on your machine, if that's an option. Any version from v25 onward should work.
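If it helps, here is a rough sketch of how one might check whether an installed libibverbs meets that v25 threshold. It is only an illustration: the helper names (`parse_major`, `is_new_enough`, `libibverbs_present`) are made up for this example, the `installed` value is a placeholder, and it assumes the version string reported by your package manager starts with the release number (e.g. `28.0-1ubuntu1`). How you obtain that string depends on your distribution.

```python
import ctypes.util
import re

# Threshold taken from the comment above: v25 or later should work.
MIN_LIBIBVERBS_MAJOR = 25


def parse_major(version: str) -> int:
    """Return the leading major number of a string like 'v25.0' or '28.0-1ubuntu1'."""
    match = re.match(r"\s*v?(\d+)", version)
    if match is None:
        raise ValueError(f"cannot parse version string: {version!r}")
    return int(match.group(1))


def is_new_enough(version: str, minimum: int = MIN_LIBIBVERBS_MAJOR) -> bool:
    """True if the major version is at or above the minimum (inclusive)."""
    return parse_major(version) >= minimum


def libibverbs_present() -> bool:
    """True if the dynamic linker can locate a libibverbs shared library."""
    return ctypes.util.find_library("ibverbs") is not None


if __name__ == "__main__":
    print("libibverbs found:", libibverbs_present())
    # Placeholder: paste the version your package manager reports here.
    installed = "22.1"
    print("new enough:", is_new_enough(installed))
```

This only tells you whether an upgrade is likely needed; it does not confirm that the RPC initialization itself will succeed.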
With PyTorch 1.7.1, I was able to successfully initialize the RPC context.
With PyTorch nightly (1.9.0.dev20210223),
CUDA is working correctly otherwise, e.g. I can run
successfully.