fastmachinelearning / hls4ml

Machine learning on FPGAs using HLS
https://fastmachinelearning.org/hls4ml
Apache License 2.0

Support for quantized SeparableConv1D/2D #861

Closed vloncar closed 12 months ago

vloncar commented 1 year ago

Description

This PR adds support for parsing QSeparableConv1D/2D and extends the existing implementation to allow specifying the type of the intermediate result of the depthwise step. Without this, it is tricky to get bit-accurate matching. This is mostly motivated by issues observed in Lindsey's model. I've added tests for 1D and 2D. Supersedes #849.
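To illustrate why pinning the depthwise intermediate type matters, here is a toy fixed-point sketch in plain Python (not hls4ml code, and the bit widths are made up for the demo): the same pointwise computation produces different final bit patterns depending on how many fractional bits the depthwise sum keeps before it is consumed.

```python
import math

def to_fixed(x, frac_bits):
    """Truncate x onto a fixed-point grid with frac_bits fractional bits."""
    return math.floor(x * (1 << frac_bits)) / (1 << frac_bits)

# One depthwise output value (two taps) feeding one pointwise weight.
depthwise_sum = 0.3 * 0.7 + 0.1 * 0.9   # exact value is 0.30
pointwise_w = 0.5

# Keep the intermediate wide (16 fractional bits) vs. narrow (4 fractional bits),
# then quantize the final result to 8 fractional bits in both cases.
wide = to_fixed(to_fixed(depthwise_sum, 16) * pointwise_w, 8)
narrow = to_fixed(to_fixed(depthwise_sum, 4) * pointwise_w, 8)

print(wide, narrow)   # the two results differ: 0.1484375 vs 0.125
```

If the simulated hardware and the reference model don't agree on the intermediate width, the final outputs diverge even when all other types match, which is why the type is now exposed explicitly.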

Type of change

A "bug fix", in the sense that it fixes the issues observed in conversion of SeparableConv1D (whether quantized or not). A "new feature", as it adds support for separable layers from QKeras. A "breaking change", in the sense that the HLS function call now includes the type of the intermediate result of the depthwise step (implemented in both Vivado and Vitis for io_stream; no other implementations exist at the moment).

Tests

Two tests have been added to test_qkeras.py, covering the 1D and 2D cases.

Checklist

lgray commented 1 year ago

Putting this here from the fastml Slack: the following setup doesn't close unless I have at least one integer bit in the depthwise and pointwise quantizers.

import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model
from qkeras import QSeparableConv2D, quantized_bits
import hls4ml

x_in = Input((13, 21, 20))
x = QSeparableConv2D(
    5, 3,
    depthwise_quantizer=quantized_bits(8, 0, 1, alpha=1),
    pointwise_quantizer=quantized_bits(8, 0, 1, alpha=1),
    bias_quantizer=quantized_bits(8, 0, alpha=1),
)(x_in)
model = Model(inputs=x_in, outputs=x)

config = hls4ml.utils.config_from_keras_model(model, granularity='name', default_precision='fixed<65,33>')
# Pin the first (input) layer's result type to match the 8-bit data fed in below
config['LayerName'][list(config["LayerName"].keys())[0]]['Precision']['result'] = 'fixed<8,1>'

print(config)

hls_model = hls4ml.converters.convert_from_keras_model(
    model, hls_config=config, output_dir='minimalrepro_hls4ml/hls4ml_prj', part='xcu250-figd2104-2L-e', io_type="io_stream",
)
hls_model.compile()

data = quantized_bits(8, 0, alpha=1)(np.random.rand(5000,13,21,20)).numpy()

qkeras_out = model.predict(data)
hls_out = hls_model.predict(data)

plt.figure()
plt.scatter(hls_out.flatten(), qkeras_out.flatten(), s=0.2)
min_x = min(np.amin(hls_out), np.amin(qkeras_out))
max_x = max(np.amax(hls_out), np.amax(qkeras_out))
plt.plot([min_x, max_x], [min_x, max_x], c='gray')
plt.xlabel('hls4ml')
plt.ylabel('QKeras');

(attached image: scatter plot of hls4ml vs QKeras outputs, showing the mismatch)

With quantized_bits(8, 1, 1) for the depthwise/pointwise quantizers it closes perfectly. (It would also be nice to know the exact number of bits needed for the accumulator, if there's a reasonable formula!)
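Regarding the accumulator question: a conservative upper bound is the width of one full-precision product plus the bit growth from summing all multiply-accumulates. This is a back-of-the-envelope estimate, not necessarily the rule hls4ml applies internally:

```python
import math

def conservative_accum_bits(input_bits, weight_bits, kernel_size, in_channels):
    """Upper bound on accumulator width: a sum of kernel_size * in_channels
    full-precision products cannot overflow this many bits.
    Back-of-the-envelope estimate, not hls4ml's internal sizing rule."""
    n_macs = kernel_size * in_channels
    product_bits = input_bits + weight_bits              # width of one product
    return product_bits + math.ceil(math.log2(n_macs))   # growth from summation

# Depthwise step above: 8-bit data, 8-bit weights, 3x3 kernel, one channel per filter
print(conservative_accum_bits(8, 8, 9, 1))    # -> 20
# Pointwise step: 1x1 kernel over 20 input channels
print(conservative_accum_bits(8, 8, 1, 20))   # -> 21
```

A tighter bound is possible if the actual weight values are known (sum of absolute weights instead of the worst case), but the formula above is safe for any weights.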

lgray commented 1 year ago
x_in = Input((13,21,20))
x = QSeparableConv2D(
    5,3,
    depthwise_quantizer=quantized_bits(8, 0, alpha=1),
    pointwise_quantizer=quantized_bits(8, 0, alpha=1),
    bias_quantizer=quantized_bits(8, 0, alpha=1),
)(x_in)
model = Model(inputs=x_in, outputs=x)

config = hls4ml.utils.config_from_keras_model(model, granularity='name', default_precision='fixed<65,33>')
config['LayerName'][list(config["LayerName"].keys())[0]]['Precision']['result'] = 'fixed<8,1>'

print(config)

hls_model = hls4ml.converters.convert_from_keras_model(
    model, hls_config=config, output_dir='minimalrepro_hls4ml/hls4ml_prj', part='xcu250-figd2104-2L-e', io_type="io_stream",
)
hls_model.compile()

data = quantized_bits(8, 0, alpha=1)(np.random.rand(5000,13,21,20)).numpy()

qkeras_out = model.predict(data)
hls_out = hls_model.predict(data)

plt.figure()
plt.scatter(hls_out.flatten(), qkeras_out.flatten(), s=0.2)
min_x = min(np.amin(hls_out), np.amin(qkeras_out))
max_x = max(np.amax(hls_out), np.amax(qkeras_out))
plt.plot([min_x, max_x], [min_x, max_x], c='gray')
plt.xlabel('hls4ml')
plt.ylabel('QKeras');

This setup also fails to close.

(attached image: scatter plot of hls4ml vs QKeras outputs, showing the mismatch)