Description

When looking at automatic type inference, reuse factor setting, stream buffer optimization, and eventual oneAPI implementation with task sequences, it became clear that treating separable convolutions as two layers instead of one was easier. The different layers can have different accumulator precisions, reuse factors, etc.

This optimizer converts a SeparableConv*D layer to a DepthwiseConv*D layer followed by a Conv*D layer for the pointwise convolution. (For backends that have an explicit pointwise implementation, a subsequent optimizer changes the Conv*D to PointwiseConv*D.) Layer-wise configurations are also created for the new depthwise and pointwise convolutions so that type inference can be done on the individual layers. Hence, this optimizer should be run before the automatic type inference. (The qonnx PR #979 adds a number of other optimizers that also need to run before the type inference, so this will be a common feature.)
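The equivalence this rewrite relies on can be checked with a small pure-Python sketch. It is not the hls4ml code: the function names, the 1D "valid"-padding and stride-1 setup, and the Keras-style channel ordering (depthwise output channel `c * depth_multiplier + m`) are all assumptions made for illustration.

```python
def depthwise_conv1d(x, dw, depth_multiplier):
    """x: [width][in_ch]; dw: [kernel][in_ch][depth_multiplier].

    'valid' padding, stride 1. Each input channel is filtered on its own.
    """
    width, in_ch = len(x), len(x[0])
    kernel = len(dw)
    out = []
    for pos in range(width - kernel + 1):
        row = []
        for c in range(in_ch):
            for m in range(depth_multiplier):
                row.append(sum(x[pos + k][c] * dw[k][c][m] for k in range(kernel)))
        out.append(row)
    return out


def pointwise_conv1d(x, pw):
    """x: [width][ch]; pw: [ch][out_ch]. Equivalent to a Conv1D with kernel size 1."""
    out_ch = len(pw[0])
    return [[sum(row[c] * pw[c][f] for c in range(len(row))) for f in range(out_ch)]
            for row in x]


def separable_conv1d(x, dw, pw, depth_multiplier):
    """Fused reference: depthwise and pointwise stages in a single loop nest."""
    width, in_ch = len(x), len(x[0])
    kernel, out_ch = len(dw), len(pw[0])
    out = []
    for pos in range(width - kernel + 1):
        row = []
        for f in range(out_ch):
            acc = 0
            for c in range(in_ch):
                for m in range(depth_multiplier):
                    dw_val = sum(x[pos + k][c] * dw[k][c][m] for k in range(kernel))
                    acc += dw_val * pw[c * depth_multiplier + m][f]
            row.append(acc)
        out.append(row)
    return out


# Integer toy data so the comparison is exact: width 4, 2 input channels,
# kernel size 2, depth_multiplier 2, 3 output channels.
x = [[1, 2], [3, 4], [5, 6], [7, 8]]
dw = [[[1, 0], [2, 1]], [[0, 1], [1, -1]]]
pw = [[1, 0, 2], [0, 1, 1], [1, 1, 0], [2, 0, 1]]

fused = separable_conv1d(x, dw, pw, depth_multiplier=2)
dw_out = depthwise_conv1d(x, dw, depth_multiplier=2)
split = pointwise_conv1d(dw_out, pw)
print(fused == split)
```

Because the two stages are independent functions here, each can carry its own accumulator precision or reuse factor, which is the property the two-layer representation is meant to expose.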

In this PR I added parameters but did not remove any. In particular, the existing reuse factor and accumulator type are now ambiguous and unused in the new implementation, because each is split into separate depthwise and pointwise values. However, if this optimizer is disabled, the old scheme should still work, with care by the user.
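As a rough illustration of the split, the sketch below derives two layer-wise configurations from one separable-layer configuration. Everything in it is hypothetical: the plain-dict representation, the key names (`depthwise_reuse_factor`, `pointwise_accum_t`, and so on), the `_depthwise` / `_pointwise` layer-name suffixes, and the policy of dropping the ambiguous keys are assumptions for the example, not the names or behavior used in hls4ml.

```python
# Keys that are ambiguous once the layer is split (hypothetical names).
AMBIGUOUS_KEYS = ('reuse_factor', 'accum_t')
PARTS = ('depthwise', 'pointwise')


def split_separable_config(name, sep_cfg):
    """Build per-layer configs for the depthwise and pointwise halves.

    Keys prefixed with 'depthwise_' or 'pointwise_' go to the matching layer
    with the prefix stripped; ambiguous keys are dropped; all remaining keys
    are copied to both layers.
    """
    prefixes = tuple(part + '_' for part in PARTS)
    shared = {
        key: value
        for key, value in sep_cfg.items()
        if key not in AMBIGUOUS_KEYS and not key.startswith(prefixes)
    }
    configs = {}
    for part in PARTS:
        prefix = part + '_'
        cfg = dict(shared)
        for key, value in sep_cfg.items():
            if key.startswith(prefix):
                cfg[key[len(prefix):]] = value
        configs[f'{name}_{part}'] = cfg
    return configs


sep_cfg = {
    'strategy': 'latency',
    'reuse_factor': 1,  # ambiguous after the split, so not propagated
    'depthwise_reuse_factor': 2,
    'pointwise_reuse_factor': 8,
    'depthwise_accum_t': 'fixed<24,12>',
    'pointwise_accum_t': 'fixed<32,16>',
}
layer_cfgs = split_separable_config('sep1', sep_cfg)
print(layer_cfgs)
```

With separate entries per layer, a later type-inference pass can treat the depthwise and pointwise convolutions like any other individual layers.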

I believe this PR also adds support for depth multipliers other than 1, but this is untested. It was motivated by #1008.

Type of change

Updated implementation that


[x] Bug fix (non-breaking change that fixes an issue)
[x] Other (Specify) - more maintainable implementation

Tests

This should not change the behavior of standard depth_multiplier=1 separable convolutions that do not use automatic type inference, so the existing default tests should be fine. The automatic type inference will be tested in a following PR that makes auto the default.

Checklist

[x] I have read the guidelines for contributing.
[x] I have commented my code, particularly in hard-to-understand areas.
[ ] I have made corresponding changes to the documentation.
[x] My changes generate no new warnings.
[x] I have installed and run pre-commit on the files I edited or added.
[x] I have added tests that prove my fix is effective or that my feature works.

fastmachinelearning / hls4ml

Add an optimizer to replace SeparableConv by Depthwise + Conv (pointwise) #1022
