NVIDIA / nccl

Optimized primitives for collective multi-GPU communication

How can this be ported to Windows? #1287

Open eabase opened 6 months ago

eabase commented 6 months ago

There seem to have been multiple attempts to port this to Windows (for obvious reasons). However, why these attempts were ignored and the pull requests rejected is far less obvious.

Since this package seems quite limited in scope, it should be straightforward to make a Windows pip package from it.

  1. What would be the best way to go about such an undertaking?
  2. Are you willing to host it in this repo or not?
  3. Is anyone from this community willing to give some hints on how to go about this, and perhaps help out for a quicker start?

Attempted pulls

Other Repos:

People:

Related:

cyyever commented 6 months ago

@eabase While I ported it to CMake, a native Windows build using MSVC didn't work due to some static assertion failures, which relied on the object memory layouts generated by gcc. And I suspected that even if I could change the code to make it work under Windows, without official support it would break again in some future release. This is why I gave up on the Windows attempts.

eabase commented 6 months ago

@cyyever What about using MSYS2? I think Bazel uses it, and consequently conda does too. If MSYS2 is used, I believe it could be very similar to using WSL.

cyyever commented 6 months ago

For MSYS2, I think you don't need CMake. For example, FFmpeg can find CUDA on MSYS2 to use nvcodec (I successfully built one before). So NCCL should work in theory.

cyyever commented 6 months ago

In addition, you should use mingw64 rather than msys2.