ROCm / MITuna

Initial implementation of GEMM support. #932

Closed by pcf000 10 months ago

pcf000 commented 10 months ago

GEMM support. Basic idea is to put op-specific functions on op-specific classes, and subclass the interface class so the common stuff doesn't have to care.
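As an illustration of that structure (not the code in this PR), here is a minimal sketch. The class names, method names, and flag strings below are all hypothetical; the point is only that the shared code calls the interface and never checks which op it has.

```python
from abc import ABC, abstractmethod


class OpInterface(ABC):
    """Hypothetical common interface; shared code only calls these methods."""

    @abstractmethod
    def parse_config(self, line: str) -> dict:
        """Turn one whitespace-separated config line into named parameters."""

    @abstractmethod
    def config_string(self, config: dict) -> str:
        """Render parameters back into a command-line style string."""


class ConvolutionOp(OpInterface):
    """Op-specific behaviour for convolution (field names are made up)."""

    def parse_config(self, line: str) -> dict:
        n, c, k = (int(x) for x in line.split())
        return {"n": n, "c": c, "k": k}

    def config_string(self, config: dict) -> str:
        return f"conv -n {config['n']} -c {config['c']} -k {config['k']}"


class GemmOp(OpInterface):
    """Op-specific behaviour for GEMM (field names are made up)."""

    def parse_config(self, line: str) -> dict:
        m, n, k = (int(x) for x in line.split())
        return {"m": m, "n": n, "k": k}

    def config_string(self, config: dict) -> str:
        return f"-m {config['m']} -n {config['n']} -k {config['k']}"


def roundtrip(op: OpInterface, line: str) -> str:
    """Common code: works for any op without knowing which one it is."""
    return op.config_string(op.parse_config(line))
```

With this layout, adding another op means adding one subclass; `roundtrip` and any other shared code stay unchanged.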

- Abstraction in results classes.
- Initial export-as-tsv method.
- Implement the load-factor idea, controlling how many workers per GPU, up and down.
- Pick up the more detailed arch from ISA listing, if available; rocmlir expects it.
- Add rocmlir/export_configs.py script.
- Adjust for the new num_cu field in tuner output.
- Note how to get mlir revision.
- Add --append to export_configs.py.
- Make some output more like existing tuningRunner.py output.
- Tweak the convolution config-string output to match existing output a little better.
- Use 'tenacity' library for better retry loop on job errors.
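For readers unfamiliar with the export-as-tsv idea, a rough sketch using only the standard library follows. This is an assumption about the general shape, not the method added in this PR: the function name, the column names, and the sample row are placeholders, and the `write_header` switch only approximates what an append mode might need.

```python
import csv
import io


def export_as_tsv(results, stream, fieldnames, write_header=True):
    """Write result rows (dicts) to a text stream as tab-separated values.

    Hypothetical helper: pass write_header=False when adding rows to a
    file that already has a header line.
    """
    writer = csv.DictWriter(
        stream, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    if write_header:
        writer.writeheader()
    for row in results:
        writer.writerow(row)


# Placeholder data, not real tuning output.
sample_rows = [{"arch": "example-arch", "num_cu": 1, "config": "-m 2 -n 3 -k 4"}]
buffer = io.StringIO()
export_as_tsv(sample_rows, buffer, ["arch", "num_cu", "config"])
tsv_text = buffer.getvalue()
```

Writing to a stream rather than a path keeps the helper easy to test and lets the caller decide whether the file is opened in write or append mode.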

alexandraBara commented 10 months ago

@pcf000 this is some great work! As I am reviewing it, I noticed that none of the tuna/rocmlir .py files are covered by pylint. We strive to run pylint on every .py file to avoid possible bugs, oversights, code duplication, etc.

Please see the pylint lines in vars/utils.groovy for examples on how to run pylint.

pcf000 commented 10 months ago

> Please see the pylint lines in vars/utils.groovy for examples on how to run pylint.

@alexandraBara, thanks. If I'm reading it right, rocmlir is included in the first pylint call in runLint. I also run it by hand before making a pull request, because I read the README and am trying to be a good guest.

alexandraBara commented 10 months ago

> > Please see the pylint lines in vars/utils.groovy for examples on how to run pylint.
>
> @alexandraBara, thanks. If I'm reading it right, rocmlir is included in the first pylint call in runLint. I also run it by hand before making a pull request, because I read the README and am trying to be a good guest.

I must have missed that. That should do it.

pcf000 commented 10 months ago

Sorry about the barrage of commits. A lot of my testing used the script I'll call from our CI, and it necessarily had to clone MITuna, which meant I had to push patches.

I think I've addressed all the requests, aside from moving a couple of function-scope constants out to class scope. I may still do that.
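For context, the refactor mentioned here usually looks something like the sketch below. The class and constant names are invented for illustration and do not come from the PR.

```python
class WorkerBefore:
    """Constants live inside the method: recreated per call, hidden from other methods."""

    def total_wait(self):
        max_retries = 3          # function-scope constant
        retry_delay_seconds = 5  # function-scope constant
        return max_retries * retry_delay_seconds


class WorkerAfter:
    """Constants live on the class: defined once, shared by all methods."""

    MAX_RETRIES = 3
    RETRY_DELAY_SECONDS = 5

    def total_wait(self):
        return self.MAX_RETRIES * self.RETRY_DELAY_SECONDS


class PatientWorker(WorkerAfter):
    """A subclass can now adjust a constant without touching the method."""

    RETRY_DELAY_SECONDS = 10
```

Behaviour is unchanged for the base class; the gain is that the values are visible at the top of the class and can be overridden by subclasses.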

alexandraBara commented 10 months ago

@pcf000 I have looked at our internal guidelines on copyrights. The year should only be updated (to a range such as 2022-2023) if there have been substantial changes to the file. Otherwise the year should be left as is. Please undo the copyright changes :)

pcf000 commented 10 months ago

No problem, reverted. At least it (a) served as a ping and (b) got CI to pass fin-find-eval.

alexandraBara commented 10 months ago

The CI fails due to some permission errors on that node; we will work on fixing that.