@AlgaPeng @XiangyiZhao Since you have created a template type for bfloat and made it parsable by the optimizer, the next steps would be implementing the actual computation rule for bfloat. The following is a rough plan:
1. Create arithmetic operations for bfloat. You may find that your customized bfloat format is not recognized by arith.add or other operations in the builtin dialects, since they all impose type constraints on their operands. Thus, we first need to create the corresponding operations in our dialect, like hcl.add_bfloat, hcl.mul_bfloat, etc. (Check the fixed-point implementation.) After you add those operations, you can extend your test program to support GEMM with the bfloat format.
2. Lower the hcl dialect to the arith dialect. We need a transformation pass that removes those bfloat operations and implements add_bfloat/mul_bfloat with basic addf/mulf operations. This is where the actual computation comes into play, and one bfloat operation may correspond to multiple lines of builtin operations. You can find a pass example here. This part may be somewhat confusing, but I will provide some tutorials later.
3. Codegen for CPU/FPGA. Once the bfloat types and operations have been removed, this step is mostly about testing.
   - The CPU backend further lowers the code to the llvm dialect. Use hcl-opt -jit to check whether it works on CPU.
   - The FPGA backend directly generates VHLS code. Use hcl-opt -opt with hcl-translate -emit-hlscpp to check whether it generates correct code for Vivado HLS.
4. Provide Python bindings for bfloat types. With these, system programmers no longer need to interact with C/C++ code and can generate operations with bfloat types directly in Python. Check this file to see how to add your types to the Python binding. Basically, you only need to specify a new attribute in the class initialization of those arithmetic operations.
5. Provide a Python binding for your transformation pass. Your pass will also be called from Python, so it needs a Python binding as well. You need to expose the C API to Python. An example can be found here.
6. Integrate with the HeteroCL frontend, which requires working with another repository. After you have tested all the facilities at the MLIR level, you can add new Python APIs in HeteroCL and provide users with an interface such as hcl.BFloat in Python, so that they can call something like hcl.placeholder((10,), dtype=hcl.BFloat(5,10)).
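For the CPU/FPGA codegen check in the plan above, a small Python driver can save some typing. This is only a sketch: the flags are the ones mentioned in the plan, but the input file name is hypothetical, and it assumes that hcl-opt writes its result to stdout and that hcl-translate reads from stdin when no input file is given. Please verify both against the actual tools.

```python
import shutil
import subprocess


def cpu_check_cmd(mlir_file):
    """Command for the CPU check: lower and run through the JIT."""
    return ["hcl-opt", "-jit", mlir_file]


def fpga_check_cmds(mlir_file):
    """Commands for the FPGA check: optimize, then emit HLS C++."""
    opt_cmd = ["hcl-opt", "-opt", mlir_file]
    # Assumption: hcl-translate reads the optimized IR from stdin.
    translate_cmd = ["hcl-translate", "-emit-hlscpp"]
    return opt_cmd, translate_cmd


def tools_available():
    """True only if both tools are on PATH."""
    return all(shutil.which(t) is not None for t in ("hcl-opt", "hcl-translate"))


def run_cpu_check(mlir_file):
    """Run the JIT check and return whatever the tool prints."""
    result = subprocess.run(
        cpu_check_cmd(mlir_file), check=True, capture_output=True, text=True
    )
    return result.stdout


def run_fpga_check(mlir_file):
    """Pipe the output of hcl-opt into hcl-translate and return the HLS code."""
    opt_cmd, translate_cmd = fpga_check_cmds(mlir_file)
    optimized = subprocess.run(
        opt_cmd, check=True, capture_output=True, text=True
    ).stdout
    translated = subprocess.run(
        translate_cmd, input=optimized, check=True, capture_output=True, text=True
    )
    return translated.stdout


if __name__ == "__main__":
    test_file = "gemm_bfloat.mlir"  # hypothetical test program
    if tools_available():
        print(run_cpu_check(test_file))
        print(run_fpga_check(test_file))
    else:
        print("hcl-opt / hcl-translate not found on PATH")
```

Because `check=True` is set, a failing pass or translation raises `subprocess.CalledProcessError`, so this can be dropped into a test script as-is.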
@AlgaPeng @XiangyiZhao For 2, you can also check @zzzDavid's fixed-point to integer pass. I think it would provide some hints on how to do the bfloat arithmetic. Please feel free to comment here if you have any questions.
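To have a reference for what the lowered add_bfloat/mul_bfloat should compute, a small software model in Python may help when checking the GEMM test. This is only a sketch under stated assumptions: it models the standard bfloat16 layout (1 sign, 8 exponent, 7 mantissa bits) rather than the parameterized type discussed here, it rounds to nearest even, and the names `round_to_bf16`, `add_bf16`, `mul_bf16`, and `gemm_bf16` are made up for illustration.

```python
import struct


def round_to_bf16(x):
    """Round a float to the nearest bfloat16 value (ties to even).

    Assumption: standard bfloat16, i.e. the upper 16 bits of an IEEE float32.
    """
    if x != x:  # NaN: pass through unchanged
        return x
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Add a bias so that dropping the low 16 bits rounds to nearest even.
    bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (bits + bias) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def add_bf16(a, b):
    """Model of add_bfloat: widen, add, round back."""
    return round_to_bf16(round_to_bf16(a) + round_to_bf16(b))


def mul_bf16(a, b):
    """Model of mul_bfloat: widen, multiply, round back."""
    return round_to_bf16(round_to_bf16(a) * round_to_bf16(b))


def gemm_bf16(A, B):
    """Reference GEMM on lists of lists, rounding after every operation."""
    rows, inner, cols = len(A), len(B), len(B[0])
    C = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for k in range(inner):
                acc = add_bf16(acc, mul_bf16(A[i][k], B[k][j]))
            C[i][j] = acc
    return C
```

For example, `round_to_bf16(3.14159)` gives `3.140625`, and `add_bf16(1.0, 2**-8)` gives `1.0` because the tie is rounded to the even mantissa. The widen/compute/round pattern mirrors the idea of implementing the bfloat operations with basic addf/mulf; the actual pass may round differently.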