erhoo82 commented 3 months ago

Description

Add an option for SM-based P2P send/recv in TP communication overlap.

Type of change

[ ] Documentation change (change only to the documentation, either a fix or a new content)
[ ] Bug fix (non-breaking change which fixes an issue)
[x] New feature (non-breaking change which adds functionality)
[ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)

Please list the changes introduced in this PR:

Add a use_ce option to select between CE-based (copy engine) and SM-based P2P send/recv in TP communication overlap.
Update the argument names in the userbuffer interface to avoid naming conflicts.
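
As a rough illustration of the use_ce switch listed above, here is a minimal Python sketch of how a boolean option might select the P2P path. Only the name use_ce comes from this PR. The P2POverlapConfig class, the p2p_backend helper, the returned strings, and the default value are hypothetical and are not part of the actual userbuffer interface.

```python
from dataclasses import dataclass


@dataclass
class P2POverlapConfig:
    """Hypothetical config holder; only the field name `use_ce` is from the PR."""

    use_ce: bool = True  # assumed default for illustration, not confirmed by the PR


def p2p_backend(cfg: P2POverlapConfig) -> str:
    """Return a label for the P2P send/recv path selected by `use_ce`."""
    # CE path: the GPU copy engines move the data.
    # SM path: a kernel running on the SMs performs the send/recv.
    return "copy_engine" if cfg.use_ce else "sm_kernel"


if __name__ == "__main__":
    print(p2p_backend(P2POverlapConfig(use_ce=True)))
    print(p2p_backend(P2POverlapConfig(use_ce=False)))
```

In this sketch the option is a plain boolean, so existing callers that never set it keep whatever the default path is.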

erhoo82 commented 3 months ago

Thanks @timmoon10. I've applied the suggested changes.

timmoon10 commented 3 months ago

/te-ci pytorch

erhoo82 commented 3 months ago

@timmoon10 I think the paddle build failure is an unrelated issue?

timmoon10 commented 3 months ago

/te-ci pytorch