UCX 1.14.0 packages make CUDA optional

conda-forge / openmpi-feedstock

A conda-smithy repository for openmpi.

BSD 3-Clause "New" or "Revised" License


UCX 1.14.0 packages make CUDA optional #119

Closed jakirkham closed 9 months ago

jakirkham commented 1 year ago

As part of the UCX 1.14.0 upgrade ( https://github.com/conda-forge/ucx-split-feedstock/pull/111 ), the CPU & GPU builds were merged, making cudatoolkit optional and ucx-proc unneeded. So it should be possible to use UCX more generally.

dalcinl commented 1 year ago

I have no idea how these improvements should affect the openmpi package. @leofang ?

jakirkham commented 1 year ago

Think it comes down to changing this...

https://github.com/conda-forge/openmpi-feedstock/blob/dfc3d61785630b94cd597292fc0c8e66f1ce3887/recipe/meta.yaml#L47-L48

...to this...

 - ucx                 # [enable_cuda]

Though ucx should work on Linux (even without CUDA)

Tried playing with some changes to capture this in PR ( https://github.com/conda-forge/openmpi-feedstock/pull/121 ). Though there may be other valid approaches

Would defer to Leo and you on what makes the most sense here 🙂

dalcinl commented 11 months ago

@jakirkham @minrk Please help me decide what to do in #128.

[ ] From @minrk's comment, looks like I have to remove the requirement ucx-proc =*=gpu.
[ ] What about the cudatoolkit >= {{ cudatoolkit }} requirement? Should I remove it?

dalcinl commented 11 months ago

BTW, I branched off 4.x. Should we generate fresh builds with updated requirements?

leofang commented 11 months ago

Sorry, I completely missed this. Why can't openmpi just depend on ucx unconditionally? ucx can also be used for CPU-only transfers, and based on what I learned from @pentschev (IIRC), ucx wouldn't do anything special if CUDA is not detected, meaning we can just do this:

# The actual content would be more convoluted because both CUDA 11 & 12 need to be supported,
# and they have different package layouts. This is for illustration only.
run_constrained:
  - cudatoolkit

Once we decide how to approach this, I can take care of the CUDA part.
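
For readers following along, the suggestion above could look roughly like the sketch below in recipe/meta.yaml. This is only an illustration under stated assumptions, not the change that was eventually merged: the `# [linux]` selector is assumed from the earlier remark that ucx should work on Linux, the version bound reuses the `cudatoolkit >= {{ cudatoolkit }}` requirement quoted earlier in the thread, and CUDA 12 handling is left out entirely.

```yaml
# Hypothetical sketch of recipe/meta.yaml (assumptions noted inline).
requirements:
  run:
    # ucx as a plain run dependency, no longer gated on enable_cuda.
    # Assumption: restricted to Linux via a conda-build selector.
    - ucx  # [linux]
  run_constrained:
    # cudatoolkit becomes optional: the bound applies only if the user
    # installs cudatoolkit. Assumption: CUDA 11 style packaging only.
    - cudatoolkit >={{ cudatoolkit }}  # [linux]
```

With `run_constrained`, conda does not pull in cudatoolkit on its own; it only enforces the version bound when the package is present in the environment.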

leofang commented 11 months ago

I support "backporting" whatever decision we land on to the v4.x branch.

dalcinl commented 11 months ago

> Why can't openmpi just depend on ucx unconditionally?

Maybe we will have to do it anyway. v5.0.x has changed its defaults: components are no longer built as plugins, but are instead linked into libmpi.so (see the discussion in #128).

leofang commented 9 months ago

Closing. In #128 we use UCX unconditionally.

jakirkham commented 9 months ago

Thanks all! 🙏