Closed dhernandez0 closed 1 month ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 83.55%. Comparing base (c9e737f) to head (f8293de). Report is 10 commits behind head on develop.
- There's a discrepancy between the `populate()` calls and the command-line setup which could be confusing. We might want to resolve that, and I don't have a strong opinion on which direction to go. (That is, if we want the `populate()` functions to take an extra set of types, we'd hardcode F32 and F64 into the type converter.) But that's up to you.
Do you mean calling `addConversion()` for f64 and f32 (and for `ShapedType`s of f64 and f32)? That seems like code duplication to me, and I'm not sure how to do it cleanly. Another option would be to pass an `MLIRContext` to `populateExtendToSupportedTypesConversionTarget()` and, inside it, copy the `extraSupported` vector and add f64 and f32 to the copy. I'm also not sure why those populate functions are public; they look like internal stages of the pass.
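To make the "hardcode F32 and F64 into the type converter" option concrete, here is a self-contained toy sketch of the pattern under discussion. It does not use the real MLIR API (which cannot be compiled standalone); `SimpleTypeConverter`, `populateSupportedTypes`, and the string-based "types" are all hypothetical stand-ins. The point it illustrates is that the populate function can append the always-supported widths itself, so callers (e.g. the command-line setup) only pass the extra types such as f16:

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Toy stand-in for MLIR's TypeConverter: each registered callback may map a
// "type" (here just a string) to its converted form, or decline (nullopt).
class SimpleTypeConverter {
  std::vector<std::function<std::optional<std::string>(const std::string &)>>
      conversions;

public:
  void addConversion(
      std::function<std::optional<std::string>(const std::string &)> fn) {
    conversions.push_back(std::move(fn));
  }
  // A type is "legal" if some conversion maps it to itself.
  bool isLegal(const std::string &type) const {
    for (auto it = conversions.rbegin(); it != conversions.rend(); ++it)
      if (auto result = (*it)(type))
        return *result == type;
    return false;
  }
};

// The option from the review: the populate function hardcodes the
// always-supported widths (f32, f64) and appends the caller's extras,
// so only the additional types need to come from the command line.
void populateSupportedTypes(SimpleTypeConverter &conv,
                            std::vector<std::string> extraSupported) {
  std::vector<std::string> supported = {"f32", "f64"};
  supported.insert(supported.end(), extraSupported.begin(),
                   extraSupported.end());
  conv.addConversion(
      [supported](const std::string &t) -> std::optional<std::string> {
        for (const auto &s : supported)
          if (t == s)
            return t; // supported types are legal as-is
        return std::nullopt; // everything else: not handled here
      });
}
```

With this shape, a caller registering only `{"f16"}` still gets f32 and f64 marked legal, which is the duplication-avoiding behavior the alternative (passing `MLIRContext` and copying `extraSupported`) also aims for.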
In this PR we add f16 math library calls.
TODO:
~~test MIGraphX model (accuracy)~~
~~check if there's any performance improvement~~ roughly the same performance for bert_base_cased_1_fp16_gpu, distilgpt2_1_fp16_gpu, and resnet50-v1-7 (--fp16)
upstream PR?
closes ticket: https://github.com/ROCm/rocMLIR-internal/issues/1439