cornell-zhang / hcl-dialect

HeteroCL-MLIR dialect for accelerator design
https://cornell-zhang.github.io/heterocl/index.html

Initial PR for Bfloat Type #13

Closed AlgaPeng closed 2 years ago

chhzh123 commented 2 years ago

@AlgaPeng @XiangyiZhao Since you have created a template type for bfloat and made it parsable by the optimizer, the next step is to implement the actual computation rules for bfloat. The following is a rough plan:

  1. Create arithmetic operations for bfloat. You may find that your customized bfloat format is not accepted by arith.addf or other operations in the builtin dialects, since they all impose type constraints on their operands. Thus, we first need to create corresponding operations in our dialect, like hcl.add_bfloat, hcl.mul_bfloat, etc. (Check the fixed-point implementation.) After you add those operations, you can extend your test program to support GEMM with the bfloat format.
  2. Lower the hcl dialect to the arith dialect. We need a transformation pass that removes those bfloat operations and implements add_bfloat/mul_bfloat using basic addf/mulf operations. This is where the computation comes into play: one bfloat operation may correspond to multiple lines of builtin operations. You can find a pass example here. This may be somewhat confusing, but I will give some tutorials later.
  3. Codegen for CPU/FPGA. Once you remove the bfloat types and operations, this step is just about the testing.
    • The CPU backend further lowers the code to llvm dialect. Use hcl-opt -jit to see whether it can work on CPU.
    • The FPGA backend directly generates VHLS code. Use hcl-opt -opt with hcl-translate -emit-hlscpp to see whether it can generate correct code for Vivado HLS.
  4. Provide Python binding for bfloat types. By doing this, system programmers may not need to interact with C/C++ code, but can generate operations with bfloat types directly in Python. Check this file to see how to add your types to Python binding. Basically, you only need to specify a new attribute in the class initialization of those arithmetic operations.
  5. Provide Python binding for your transformation pass. Again, your pass will be called in Python, so a Python binding for that is needed. You need to expose the C API to Python. An example can be found here.
  6. Integrate with the HeteroCL frontend, which requires interacting with another repository. After you have tested all the facilities at the MLIR level, you can add new Python APIs in HeteroCL and provide users with an interface like hcl.BFloat in Python, so that they are able to call something like hcl.placeholder((10,), dtype=hcl.BFloat(5,10)).
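
To make steps 1 and 2 more concrete, here is a minimal Python sketch of what "one bfloat operation may correspond to multiple lines of builtin operations" could look like. It is not the hcl-dialect lowering: it assumes the standard bfloat16 layout (1 sign bit, 8 exponent bits, 7 mantissa bits) rather than a customized format, and the helper names (f32_to_bf16_bits, bf16_bits_to_f32, add_bfloat, mul_bfloat) are hypothetical. The idea is to extend both operands to float32, apply the ordinary float add/mul, and round the result back to 16 bits.

```python
import struct


def f32_to_bf16_bits(x: float) -> int:
    """Round a value to bfloat16 (assumed 1/8/7 layout) and return its 16-bit pattern."""
    # Packing as "<f" first rounds the Python double to float32.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # NaN: keep the upper half and force a quiet-NaN mantissa bit.
    if (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF):
        return ((bits >> 16) | 0x0040) & 0xFFFF
    # Round to nearest, ties to even, on the 16 bits that get dropped.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bf16_bits_to_f32(b: int) -> float:
    """Extend a bfloat16 bit pattern to float32 by zero-filling the low 16 bits."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]


def add_bfloat(a: int, b: int) -> int:
    """Hypothetical add_bfloat: extend, ordinary float add, round back."""
    return f32_to_bf16_bits(bf16_bits_to_f32(a) + bf16_bits_to_f32(b))


def mul_bfloat(a: int, b: int) -> int:
    """Hypothetical mul_bfloat: extend, ordinary float multiply, round back."""
    return f32_to_bf16_bits(bf16_bits_to_f32(a) * bf16_bits_to_f32(b))


if __name__ == "__main__":
    one = f32_to_bf16_bits(1.0)
    two = f32_to_bf16_bits(2.0)
    print(hex(add_bfloat(one, two)), bf16_bits_to_f32(add_bfloat(one, two)))
```

A custom format such as BFloat(5,10) would need different shift amounts and exponent handling, so treat the constants above as specific to the assumed 8-bit exponent and 7-bit mantissa.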
chhzh123 commented 2 years ago

@AlgaPeng @XiangyiZhao For step 2, you can also check @zzzDavid's fixed-point to integer pass. I think it would provide some hints on how to do the bfloat arithmetic. Please feel free to comment here if you have any questions.
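
For readers unfamiliar with the idea behind a fixed-point to integer lowering, the following is a generic Python sketch of the arithmetic, not the pass referenced above. It assumes a fixed-point value is stored as an integer scaled by 2^frac_bits; the function names and the frac_bits parameter are hypothetical. Addition maps to a plain integer add, while multiplication needs an extra shift to restore the scale, which is the same "one custom operation becomes several builtin operations" pattern expected for bfloat.

```python
def to_fixed(x: float, frac_bits: int) -> int:
    """Encode a real value as an integer scaled by 2**frac_bits."""
    return round(x * (1 << frac_bits))


def to_float(a: int, frac_bits: int) -> float:
    """Decode a scaled integer back to a real value."""
    return a / (1 << frac_bits)


def fixed_add(a: int, b: int) -> int:
    """Operands share one scale, so addition is a single integer add."""
    return a + b


def fixed_mul(a: int, b: int, frac_bits: int) -> int:
    """The raw product carries scale 2**(2*frac_bits); shift right to rescale."""
    return (a * b) >> frac_bits


if __name__ == "__main__":
    f = 8  # assumed number of fractional bits
    x, y = to_fixed(1.5, f), to_fixed(2.25, f)
    print(to_float(fixed_add(x, y), f), to_float(fixed_mul(x, y, f), f))
```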