openucx / ucc

Unified Collective Communication Library
https://openucx.github.io/ucc/
BSD 3-Clause "New" or "Revised" License

BUILD: add support for specific GPU arch with ROCM #987

Closed: akolliasAMD closed this 1 week ago

akolliasAMD commented 2 weeks ago

What

Added the ability to specify the ROCm architecture at configure time with --with-rocm-arch. The options are: all, which keeps the same behavior as before; all-arch-no-native, which targets all default ROCm architectures with the exception of native (to be used if no ROCm-enabled GPUs exist in the system); and --offload-arch=gfx###, which selects a specific architecture.
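
For illustration only, the three choices described above could be passed to configure roughly as follows. This is a sketch, not taken from the patch: the helper name rocm_arch_configure_cmd is hypothetical, gfx90a is just an example architecture, and passing the literal --offload-arch=... string as the option value is an assumption based on the description here. Check ./configure --help in your tree for the exact form. The helper only prints the command line; it does not run configure.

```shell
# Hypothetical helper: print the configure command for a given
# --with-rocm-arch value (it does not execute configure).
rocm_arch_configure_cmd() {
    # $1: value for --with-rocm-arch, as described in this PR
    printf './configure --with-rocm-arch=%s\n' "$1"
}

# Previous behavior: all default architectures, including native.
rocm_arch_configure_cmd all

# All default architectures except native, for build hosts without a ROCm GPU.
rocm_arch_configure_cmd all-arch-no-native

# One specific architecture (gfx90a is only an example value; the exact
# way to pass it is an assumption).
rocm_arch_configure_cmd --offload-arch=gfx90a
```

The all-arch-no-native choice is the one to reach for on build machines or CI nodes that have ROCm installed but no GPU present.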

Why ?

This allows building for a specific architecture. It will fix https://github.com/openucx/ucc/issues/969

swx-jenkins3 commented 2 weeks ago

Can one of the admins verify this patch?

manjugv commented 2 weeks ago

@akolliasAMD Please also get review from @edgargabriel

edgargabriel commented 2 weeks ago

@manjugv this looks good to me. I don't think the repo allows me to do a formal review, but we discussed this with @akolliasAMD internally before he filed the PR.

akolliasAMD commented 2 weeks ago

Can you give it another thumbs up so the CI can continue? EDIT: Actually, I will do a proper rebase and re-push, sorry

Sergei-Lebedev commented 2 weeks ago

ok to test

akolliasAMD commented 1 week ago

Do you want me to just rebase from master and merge?

Sergei-Lebedev commented 1 week ago

yes please

akolliasAMD commented 1 week ago

OK, rebased. I will wait for the tests to pass and then hopefully merge

Sergei-Lebedev commented 1 week ago

jenkins error is unrelated