ROCm / apex

A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
BSD 3-Clause "New" or "Revised" License

moving from rocBLAS to hipBLAS #127

Closed: ramcherukuri closed this 7 months ago

ramcherukuri commented 7 months ago

I updated all the GEMM calls from rocBLAS to hipBLAS, using the PR as a reference.

The L0 unit test worked as expected.

pruthvistony commented 7 months ago

Function definitions of

static hipblasStatus_t rocBLASStatusToHIPStatus(rocblas_status error)
static rocblas_operation hipOperationToRocOperation(hipblasOperation_t op)

need to be removed from csrc/mlp_cuda.cu and apex/contrib/csrc/multihead_attn/strided_batched_gemm.cuh
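
For context, a minimal sketch of why helpers like these presumably become dead code once the GEMM calls target hipBLAS directly. It uses stand-in enums and hypothetical `*Sketch` names so that it compiles without ROCm; it does not reproduce the real rocBLAS or hipBLAS headers, and the actual helper bodies in this PR may differ.

```cpp
// Stand-in enums (hypothetical): they mimic the shape of the two libraries'
// operation types, not their real names or numeric values.
enum RocOperationSketch { kRocNone, kRocTranspose, kRocConjTranspose };
enum HipOperationSketch { kHipN, kHipT, kHipC };

// The kind of translation helper the review asks to delete: it only maps
// one library's operation enum onto the other's.
static RocOperationSketch hipOperationToRocOperationSketch(HipOperationSketch op) {
    switch (op) {
        case kHipN: return kRocNone;
        case kHipT: return kRocTranspose;
        case kHipC: return kRocConjTranspose;
    }
    return kRocNone;  // unreachable for valid input; keeps compilers quiet
}

// Hypothetical GEMM entry points that just report which op they received.
static int gemmViaRocSketch(RocOperationSketch op) { return static_cast<int>(op); }
static int gemmViaHipSketch(HipOperationSketch op) { return static_cast<int>(op); }

// Before: a hipBLAS-style op has to be translated at every call site.
static int callBeforeSketch(HipOperationSketch op) {
    return gemmViaRocSketch(hipOperationToRocOperationSketch(op));
}

// After: the op is passed straight through, so the helper has no callers.
static int callAfterSketch(HipOperationSketch op) {
    return gemmViaHipSketch(op);
}
```

In this sketch the call site in `callAfterSketch` no longer references the conversion function, which is the situation the comment above describes for the two listed files.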

pruthvistony commented 7 months ago

@ramcherukuri Can you remove the cutlass submodule?

Submodule cutlass updated 6930 files

jithunnair-amd commented 2 months ago

@pruthvistony @ramcherukuri Please use the Squash and Merge option for all PRs except IFUs. It keeps the history linear and easy to follow, and makes cherry-picks easy.