Xilinx / finn

Dataflow compiler for QNN inference on FPGAs
https://xilinx.github.io/finn
BSD 3-Clause "New" or "Revised" License
681 stars 218 forks source link

Introduce support for generic elementwise binary operations #1040

Open iksnagreb opened 2 months ago

iksnagreb commented 2 months ago

This includes a set of HWCustomOp and HLSBackend operator templates which can be specialized in just a few lines of code to implement arbitrary elementwise binary operations, like Add, Mul, Sub, Div, And, Equal, etc., supporting multidirectional broadcasting. Concrete implementations for most of these operators according to standard ONNX is already sketched out. Still missing are specializations for accumulator and weight bit-width minimization and some tricky to implement operators. Also still missing is floating-point support due to HLS-backend limitations, though these seem to be just minor defects regarding flatten and Slice.

Adds unit tests in Python, C++ and RTL simulation for these new operators, though these are probably not exhaustive enough to validate all edge cases.

Proposes a new scheme for registering and importing custom operators into their corresponding module namespace, i.e., the 'custom_op' dictionary used to lookup operators by ONNX domain.

Implementation Progress

fpjentzsch commented 2 months ago

Looks useful!

you define the op with the format (Identifier, Python, C++, RTL), do you have a use for the "RTL" definition or is this just a preparation for the future addition of an RTL backend?

I wonder how this functionality overlaps with the following existing FINN CustomOPs and if it makes any or all of them obsolete:

iksnagreb commented 2 months ago

you define the op with the format (Identifier, Python, C++, RTL), do you have a use for the "RTL" definition or is this just a preparation for the future addition of an RTL backend?

Yes, it is just intended to make this future-proof, or at least to sketch out a potential future implementation of a generic RTL backend as well. But I am not even sure right now, whether I want to keep this format, as for example, for implementing the Mod and BitShift operator this simple string template is not sufficient, as the implementation depends on node attributes. Maybe I will fully transition this to the already present *_op property methods as these could contain some logic handling different node attributes. I am also not sure whether these backend-specific definitions are at the right place/level of the hierarchy here, on the other hand I want to minimize the number of customization points required to add new specializations of the operator template...

I wonder how this functionality overlaps with the following existing FINN CustomOPs and if it makes any or all of them obsolete:

Hm, it depends, I probably know more once I have sketched out the Infer... method for the new ones. Probably the old ones have much more strict assumptions and sometimes this could actually be intended to map to some special cases and get something more efficient. I intend to reach (almost) full ONNX compliance with the new operators, which could indeed render the AddStreams and StreamingEltwise obsolete - that is actually how I ended up implementing this in the first place: Initially I just wanted to add a new operator to handle element-wise operation (actually addition) of a constant tensor and a stream but though "why not generalize this?"... Regarding the ChannelwiseOp, I am not really sure what this one exactly does, but as it has the Thresholding_Batch as backend, it seems to do something slightly more elaborate than simple element-wise arithmetic or logical operations, so it will probably not be obsolete?

iksnagreb commented 2 months ago

This now relies on cherry-picked commits from https://github.com/Xilinx/finn/pull/901 and https://github.com/Xilinx/finn/pull/1030.