tensorflow / mlir

"Multi-Level Intermediate Representation" Compiler Infrastructure
1.74k stars 259 forks source link

[ROCm] Adding the ROCM backend implementation for GPU dialect + the mlir-rocm-runner utility #169

Closed deven-amd closed 5 years ago

deven-amd commented 5 years ago

This PR adds

The ROCM backend support for AMD GPUs is similar in concept and implementation to the CUDA backend support for NV GPUs.


@joker-eph @whchung

River707 commented 5 years ago

Hi, This is a huge commit. Is it possible for you to split this into multiple, more bite sized, changes? That would help the review, and also make sure that each individual component is well tested and gets the proper review it deserves.

Thanks!

Just noticed that you have nice split commits, whoops.

deven-amd commented 5 years ago

@River707 I have pushed out a new commit that addresses all but one of the code review comments, please re-review.

thanks

joker-eph commented 5 years ago

This is a huge commit. Is it possible for you to split this into multiple, more bite sized, changes? That would help the review, and also make sure that each individual component is well tested and gets the proper review it deserves.

+1: it seems to me that adding the runner is independent from adding the lowering passes.

sherhut commented 5 years ago

I was not sure whether this is ready for review or whether I should wait for the push. Is this ready now and should I review?

deven-amd commented 5 years ago

@joker-eph

I will break this PR into one for each commit, as per your request.

The PR for the first commit (converting return type from i32 to i64) can stand on its own. (Filed PR #171)

The PRs for the other 4 commits (in this PR) will need to be filed sequentially since the later commits assume/require the presence of the former commits.

I will file the PR for the first of those four commits (ConvertKernelFuncToHSACO) after I have addressed all your code review feedback.

Closing out this PR

joker-eph commented 5 years ago

The PRs for the other 4 commits (in this PR) will need to be filed sequentially since the later commits assume/require the presence of the former commits.

If you have the 4 commits, then you can file the 4 PRs at once, assuming you push the first commit to a branch, the 2nd to another branch (rebased on the first one of course), etc. The diff for the second PR will include the first one, but that'll clear itself when the first one merges.