fastmachinelearning / hls4ml

Machine learning on FPGAs using HLS
https://fastmachinelearning.org/hls4ml
Apache License 2.0

Support for quantized SeparableConv1D/2D #861

Closed vloncar closed 12 months ago

vloncar commented 1 year ago

Description

This PR adds support for parsing QSeparableConv1D/2D and extends the existing implementation to allow specifying the type of the intermediate result of the depthwise step. Without this, it is tricky to get bit-accurate matching. This is mostly motivated by issues observed in Lindsey's model. I've added tests for 1D and 2D. Supersedes #849.
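To illustrate why pinning the depthwise intermediate type matters, here is a toy fixed-point sketch in plain Python (not hls4ml code, and the bit widths are made up for the demo): the same pointwise computation produces different final bit patterns depending on how many fractional bits the depthwise sum keeps before it is consumed.

```python
import math

def to_fixed(x, frac_bits):
    """Truncate x onto a fixed-point grid with frac_bits fractional bits."""
    return math.floor(x * (1 << frac_bits)) / (1 << frac_bits)

# One depthwise output value (two taps) feeding one pointwise weight.
depthwise_sum = 0.3 * 0.7 + 0.1 * 0.9   # exact value is 0.30
pointwise_w = 0.5

# Keep the intermediate wide (16 fractional bits) vs. narrow (4 fractional bits),
# then quantize the final result to 8 fractional bits in both cases.
wide = to_fixed(to_fixed(depthwise_sum, 16) * pointwise_w, 8)
narrow = to_fixed(to_fixed(depthwise_sum, 4) * pointwise_w, 8)

print(wide, narrow)   # the two results differ: 0.1484375 vs 0.125
```

If the simulated hardware and the reference model don't agree on the intermediate width, the final outputs diverge even when all other types match, which is why the type is now exposed explicitly.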

Type of change

A "bug fix", in the sense that it fixes the issues observed in conversion of SeparableConv1D (whether quantized or not). A "new feature", as it adds support for separable layers from QKeras. A "breaking change", in the sense that the HLS function call now includes the type of the intermediate result of the depthwise step (implemented in both Vivado and Vitis for io_stream; no other implementations exist at the moment).

Tests

Two tests have been added to test_qkeras.py, covering the 1D and 2D cases.

Checklist

lgray commented 1 year ago

Putting this here from the fastml Slack: the following setup doesn't close unless I have at least one integer bit in the depthwise and pointwise quantizers.

import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model
from qkeras import QSeparableConv2D, quantized_bits
import hls4ml

x_in = Input((13, 21, 20))
x = QSeparableConv2D(
    5, 3,
    depthwise_quantizer=quantized_bits(8, 0, 1, alpha=1),
    pointwise_quantizer=quantized_bits(8, 0, 1, alpha=1),
    bias_quantizer=quantized_bits(8, 0, alpha=1),
)(x_in)
model = Model(inputs=x_in, outputs=x)

config = hls4ml.utils.config_from_keras_model(model, granularity='name', default_precision='fixed<65,33>')
# Pin the first (input) layer's result type to match the 8-bit data fed in below
config['LayerName'][list(config["LayerName"].keys())[0]]['Precision']['result'] = 'fixed<8,1>'

print(config)

hls_model = hls4ml.converters.convert_from_keras_model(
    model, hls_config=config, output_dir='minimalrepro_hls4ml/hls4ml_prj', part='xcu250-figd2104-2L-e', io_type="io_stream",
)
hls_model.compile()

data = quantized_bits(8, 0, alpha=1)(np.random.rand(5000,13,21,20)).numpy()

qkeras_out = model.predict(data)
hls_out = hls_model.predict(data)

plt.figure()
plt.scatter(hls_out.flatten(), qkeras_out.flatten(), s=0.2)
min_x = min(np.amin(hls_out), np.amin(qkeras_out))
max_x = max(np.amax(hls_out), np.amax(qkeras_out))
plt.plot([min_x, max_x], [min_x, max_x], c='gray')
plt.xlabel('hls4ml')
plt.ylabel('QKeras');

(attached image: scatter plot of hls4ml vs QKeras outputs, showing the mismatch)

With quantized_bits(8, 1, 1) for the depthwise/pointwise quantizers it closes perfectly. (It would also be nice to know the exact number of bits needed for the accumulator, if there's a reasonable formula!)
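Regarding the accumulator question: a conservative upper bound is the width of one full-precision product plus the bit growth from summing all multiply-accumulates. This is a back-of-the-envelope estimate, not necessarily the rule hls4ml applies internally:

```python
import math

def conservative_accum_bits(input_bits, weight_bits, kernel_size, in_channels):
    """Upper bound on accumulator width: a sum of kernel_size * in_channels
    full-precision products cannot overflow this many bits.
    Back-of-the-envelope estimate, not hls4ml's internal sizing rule."""
    n_macs = kernel_size * in_channels
    product_bits = input_bits + weight_bits              # width of one product
    return product_bits + math.ceil(math.log2(n_macs))   # growth from summation

# Depthwise step above: 8-bit data, 8-bit weights, 3x3 kernel, one channel per filter
print(conservative_accum_bits(8, 8, 9, 1))    # -> 20
# Pointwise step: 1x1 kernel over 20 input channels
print(conservative_accum_bits(8, 8, 1, 20))   # -> 21
```

A tighter bound is possible if the actual weight values are known (sum of absolute weights instead of the worst case), but the formula above is safe for any weights.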

lgray commented 1 year ago
x_in = Input((13,21,20))
x = QSeparableConv2D(
    5,3,
    depthwise_quantizer=quantized_bits(8, 0, alpha=1),
    pointwise_quantizer=quantized_bits(8, 0, alpha=1),
    bias_quantizer=quantized_bits(8, 0, alpha=1),
)(x_in)
model = Model(inputs=x_in, outputs=x)

config = hls4ml.utils.config_from_keras_model(model, granularity='name', default_precision='fixed<65,33>')
config['LayerName'][list(config["LayerName"].keys())[0]]['Precision']['result'] = 'fixed<8,1>'

print(config)

hls_model = hls4ml.converters.convert_from_keras_model(
    model, hls_config=config, output_dir='minimalrepro_hls4ml/hls4ml_prj', part='xcu250-figd2104-2L-e', io_type="io_stream",
)
hls_model.compile()

data = quantized_bits(8, 0, alpha=1)(np.random.rand(5000,13,21,20)).numpy()

qkeras_out = model.predict(data)
hls_out = hls_model.predict(data)

plt.figure()
plt.scatter(hls_out.flatten(), qkeras_out.flatten(), s=0.2)
min_x = min(np.amin(hls_out), np.amin(qkeras_out))
max_x = max(np.amax(hls_out), np.amax(qkeras_out))
plt.plot([min_x, max_x], [min_x, max_x], c='gray')
plt.xlabel('hls4ml')
plt.ylabel('QKeras');

This setup also fails to close.

(attached image: scatter plot of hls4ml vs QKeras outputs, showing the mismatch)