This PR enables four previously unsupported scenarios:
When the global input tensor is split directly, the compiler now inserts a DuplicateStreams node at that point when the transformations for conversion to hw are called.
When the second input to a MatMul is not constant, the MatMul is now skipped during conversion and no error is thrown.
Previously, an Add node of two streams was converted to an AddStreams node; it is now also possible to convert it to a StreamingEltwise Add node.
Previously, for per-tensor quantization, the QONNX to FINN-ONNX conversion broadcast the threshold array to the number of channels. This step is now skipped, and the per-tensor quantization is passed to the compiler for further processing.
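
To illustrate the last point, here is a minimal sketch of the difference in threshold shape. It is not the code from this PR: the helper name `broadcast_thresholds`, the example values, and the assumed layout (one row of thresholds per channel) are hypothetical and chosen only for illustration.

```python
# Hypothetical illustration only; names and layout are assumptions,
# not the actual implementation in this PR.

def broadcast_thresholds(thresholds, num_channels):
    """Sketch of the old behaviour: repeat a single per-tensor
    threshold row once for every channel."""
    if len(thresholds) == 1:
        return [list(thresholds[0]) for _ in range(num_channels)]
    # Already per-channel: return a copy unchanged.
    return [list(row) for row in thresholds]


# One row of thresholds shared by the whole tensor (per-tensor quantization).
per_tensor = [[0.5, 1.5, 2.5]]

# Old behaviour: one identical row per channel.
old_style = broadcast_thresholds(per_tensor, num_channels=4)

# New behaviour: the single row is passed on as-is.
new_style = per_tensor

print(len(old_style), len(old_style[0]))
print(len(new_style), len(new_style[0]))
```

In this sketch the old path produces four identical rows, while the new path keeps a single row and leaves any expansion to later compiler steps.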