fastmachinelearning / hls4ml

Machine learning on FPGAs using HLS
https://fastmachinelearning.org/hls4ml
Apache License 2.0

Unsupported layer type: SlicingOpLambda #841

Open rkovachfuentes opened 11 months ago

rkovachfuentes commented 11 months ago

Prerequisites

Please make sure to check off these prerequisites before submitting a bug report.

Quick summary

Please give a brief and concise description of the bug.

An "Unsupported layer type" exception is raised when generating an hls4ml configuration for a QKeras model with hls4ml.utils.config_from_keras_model.

Details

Please add to the following sections to describe the bug as accurately as possible.

Steps to Reproduce

Add what needs to be done to reproduce the bug. Add commented code examples and make sure to include the original model files / code, and the commit hash you are working on.

  1. Clone the hls4ml repository

  2. Check out the master branch at commit dd18adb1d3fb1ac3bf18c2b7feb37f44c10b6262

  3. Reload qkeras model from h5, using: model = qkeras.utils.load_qmodel('/home/rkovachf/hls4ml-tutorial/hls4mltest.h5',custom_objects)

  4. Model architecture is described below:

    import tensorflow as tf
    from tensorflow.keras.layers import Input, Flatten
    from tensorflow.keras.models import Model
    from qkeras import QActivation, QAveragePooling2D, QConv2D, QDense

    def CreateQModel(shape):
        x = x_in = Input(shape)
        x = QConv2D(5,3, activation="quantized_relu(bits = 16, integer = 4, use_sigmoid = 1)", name="conv2d1")(x)
        # Alternative first layer, disabled in the original:
        # QConv2D(5,3,
        #     kernel_quantizer="stochastic_ternary",
        #     bias_quantizer="ternary", name="first_conv2d")(x)
        x = QActivation(activation="quantized_relu(bits = 16, integer = 4, use_sigmoid = 1)",
                        name="relu1")(x)

        x_conv = QConv2D(5,3, activation="quantized_relu(bits = 16, integer = 4, use_sigmoid = 1)", name="conv2d2")(x)
        # Each branch pools one channel of x_conv; these slices become SlicingOpLambda layers.
        x = QAveragePooling2D(pool_size=(9, 3), strides=None, padding="valid", data_format=None)(x_conv[...,:1])
        y = QAveragePooling2D(pool_size=(3, 17), strides=None, padding="valid", data_format=None)(x_conv[...,1:2])
        cota = QAveragePooling2D(pool_size=(2,2), strides=None, padding="valid", data_format=None)(x_conv[...,2:3])
        cotb = QAveragePooling2D(pool_size=(2,2), strides=None, padding="valid", data_format=None)(x_conv[...,3:4])
        cov = QAveragePooling2D(pool_size=(2,2), strides=None, padding="valid", data_format=None)(x_conv[...,4:5])
        cota = QActivation(activation="quantized_relu(bits = 16, integer = 4, use_sigmoid = 1)",
                        name="relucota")(cota)

        cotb = QActivation(activation="quantized_relu(bits = 16, integer = 4, use_sigmoid = 1)",
                        name="relucotb")(cotb)

        cota = Flatten()(cota)
        cotb = Flatten()(cotb)
        x = QActivation(activation="quantized_relu(bits = 16, integer = 4, use_sigmoid = 1)",
                        name="relux")(x)

        y = QActivation(activation="quantized_relu(bits = 16, integer = 4, use_sigmoid = 1)",
                        name="reluy")(y)

        x = Flatten()(x)
        y = Flatten()(y)
        x = QDense(units = 10,
                   kernel_quantizer="quantized_bits(bits = 8,integer = 4,symmetric=0)",
                   use_bias= True,
                   bias_quantizer= "quantized_bits(bits = 8,integer = 4,symmetric=0)",
                   name = "dense10x")(x)
        y = QDense(units = 10,
                   kernel_quantizer="quantized_bits(bits = 8,integer = 4,symmetric=0)",
                   use_bias= True,
                   bias_quantizer= "quantized_bits(bits = 8,integer = 4,symmetric=0)",
                   name = "dense10y")(y)
        cota = QDense(units = 10,
                   kernel_quantizer="quantized_bits(bits = 8,integer = 4,symmetric=0)",
                   use_bias= True,
                   bias_quantizer= "quantized_bits(bits = 8,integer = 4,symmetric=0)",
                   name = "dense10cota")(cota)
        cotb = QDense(units = 10,
                   kernel_quantizer="quantized_bits(bits = 8,integer = 4,symmetric=0)",
                   use_bias= True,
                   bias_quantizer= "quantized_bits(bits = 8,integer = 4,symmetric=0)",
                   name = "dense10cotb")(cotb)
        x = QActivation(activation="quantized_relu(bits = 16, integer = 4, use_sigmoid = 1)",
                        name="relu2x")(x)

        y = QActivation(activation="quantized_relu(bits = 16, integer = 4, use_sigmoid = 1)",
                        name="relu2y")(y)

        cota = QActivation(activation="quantized_relu(bits = 16, integer = 4, use_sigmoid = 1)",
                        name="relu2cota")(cota)

        cotb = QActivation(activation="quantized_relu(bits = 16, integer = 4, use_sigmoid = 1)",
                        name="relu2cotb")(cotb)

        x = QDense(units = 2,
                   kernel_quantizer="quantized_bits(bits = 8,integer = 4,symmetric=0)",
                   use_bias= True,
                   bias_quantizer= "quantized_bits(bits = 8,integer = 4,symmetric=0)",
                   name = "dense2x")(x)
        y = QDense(units = 2,
                   kernel_quantizer="quantized_bits(bits = 8,integer = 4,symmetric=0)",
                   use_bias= True,
                   bias_quantizer= "quantized_bits(bits = 8,integer = 4,symmetric=0)",
                   name = "dense2y")(y)
        cota = QDense(units = 2,
                   kernel_quantizer="quantized_bits(bits = 8,integer = 4,symmetric=0)",
                   use_bias= True,
                   bias_quantizer= "quantized_bits(bits = 8,integer = 4,symmetric=0)",
                   name = "dense2cota")(cota)
        cotb = QDense(units = 2,
                   kernel_quantizer="quantized_bits(bits = 8,integer = 4,symmetric=0)",
                   use_bias= True,
                   bias_quantizer= "quantized_bits(bits = 8,integer = 4,symmetric=0)",
                   name = "dense2cotb")(cotb)
        cov = Flatten()(cov)
        cov = QDense(units = 6,
                   kernel_quantizer="quantized_bits(bits = 8,integer = 4,symmetric=0)",
                   use_bias= True,
                   bias_quantizer= "quantized_bits(bits = 8,integer = 4,symmetric=0)",
                   name = "densecov")(cov)
        # The output interleaves slices of the four 2-unit heads, then appends cov.
        xy = tf.concat([x[...,:1],y[...,:1],cota[...,:1],cotb[...,:1],
                        x[...,1:2],y[...,1:2],cota[...,1:2],cotb[...,1:2],
                       cov], axis=1)
        model = Model(inputs=x_in, outputs=xy)
        return model

  5. Error occurs:

    
    Exception                                 Traceback (most recent call last)
    Cell In[23], line 4
      1 import hls4ml
      2 import plotting
    ----> 4 config = hls4ml.utils.config_from_keras_model(model, granularity='name')
      5 config['LayerName']['softmax']['exp_table_t'] = 'ap_fixed<18,8>'
      6 config['LayerName']['softmax']['inv_table_t'] = 'ap_fixed<18,4>'

    File ~/.conda/envs/hls4ml-tutorial/lib/python3.10/site-packages/hls4ml/utils/config.py:138, in config_from_keras_model(model, granularity, backend, default_precision, default_reuse_factor)
        134 model_arch = json.loads(model.to_json())
        136 reader = hls4ml.converters.KerasModelReader(model)
    --> 138 layer_list, _, _ = hls4ml.converters.parse_keras_model(model_arch, reader)
        140 def make_layer_config(layer):
        141     cls_name = layer['class_name']

    File ~/.conda/envs/hls4ml-tutorial/lib/python3.10/site-packages/hls4ml/converters/keras_to_hls.py:226, in parse_keras_model(model_arch, reader)
        224 for keras_layer in layer_config:
        225     if keras_layer['class_name'] not in supported_layers:
    --> 226         raise Exception('ERROR: Unsupported layer type: {}'.format(keras_layer['class_name']))
        228 output_shapes = {}
        229 output_shape = None

    Exception: ERROR: Unsupported layer type: SlicingOpLambda



Optional

Possible fix

If you already know where the issue stems from, or you have a hint please let us know.

Please add support for SlicingOpLambda.

Additional context

Add any other context about the problem here.

lgray commented 11 months ago

@jmitrevs roping you in on this one :)

jmitrevs commented 11 months ago

I will take a look, but one possibility is to use the Extension API (https://fastmachinelearning.org/hls4ml/advanced/extension.html) to add support for the layer.

lgray commented 11 months ago

Since this is a basic layer in Keras (slicing another layer's output), the Extension API looks like a good workaround but not the correct solution.

jmduarte commented 11 months ago

hi @lgray, @rkovachfuentes, I made a simpler minimal example:

from tensorflow.keras.layers import Input, Concatenate
from tensorflow.keras.models import Model
import hls4ml

def createModel(shape=(3, 3, 2)):
    x = x_in = Input(shape)
    xy = Concatenate(axis=1)([x[...,:1], x[...,1:]])
    model = Model(inputs=x_in, outputs=xy)
    return model

model = createModel()
model.save('model.h5')
config = hls4ml.utils.config_from_keras_model(model, granularity='name')

https://gist.github.com/jmduarte/08eee28b979c60cfac513b0307b41d4d#file-test-py-L1-L13

This simple model gets two SlicingOpLambda operators.
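
One way to check which layers would trip the converter, without calling hls4ml at all, is to walk the architecture dictionary that `json.loads(model.to_json())` returns and compare each `class_name` against a set of supported names, which mirrors the check shown in the traceback. The sketch below is only an illustration: the helper name `find_unsupported_layers`, the hand-written `model_arch` dictionary, the layer names in it, and the `SUPPORTED` set are all assumptions made for the example, not hls4ml's real supported-layer list or the real JSON of this model.

```python
def find_unsupported_layers(model_arch, supported):
    """Return (layer name, class name) pairs whose class is not in `supported`."""
    layers = model_arch["config"]["layers"]
    return [
        (layer["config"].get("name", "?"), layer["class_name"])
        for layer in layers
        if layer["class_name"] not in supported
    ]


# Hypothetical, hand-written stand-in for json.loads(model.to_json()).
model_arch = {
    "class_name": "Functional",
    "config": {
        "layers": [
            {"class_name": "InputLayer", "config": {"name": "input_1"}},
            {"class_name": "SlicingOpLambda", "config": {"name": "slice_a"}},
            {"class_name": "SlicingOpLambda", "config": {"name": "slice_b"}},
            {"class_name": "Concatenate", "config": {"name": "concatenate"}},
        ]
    },
}

# Illustrative subset only; not the converter's actual list.
SUPPORTED = {"InputLayer", "Concatenate"}

unsupported = find_unsupported_layers(model_arch, SUPPORTED)
print(unsupported)
```

With a real model, `model_arch` would come from `json.loads(model.to_json())` and the supported set from whatever the installed hls4ml version accepts.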

Up until now, we haven't tried to handle any Lambda or TFOp layers in Keras models because, in general, they can be quite arbitrarily defined. That being said, I can imagine supporting a subset of slicing operators, especially since they are frequently needed.

The easiest way I can imagine to implement this is to parse these layers and then insert an HLS function that just slices the input to get the required output. This will never be optimal because you just insert some latency to create a new output with fewer values.
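
To make the cost concrete, here is a plain-Python sketch of what such a generic slice amounts to for a channels-last tensor stored as a flat array: a loop that copies the selected channels at every position into a new, smaller array. This is not hls4ml code and not the HLS that would be generated; the function name `slice_last_axis` and the flat-list representation are assumptions for illustration.

```python
def slice_last_axis(flat, shape, start, stop):
    """Copy channels [start, stop) from a flattened channels-last tensor."""
    *outer_dims, channels = shape
    outer = 1
    for dim in outer_dims:
        outer *= dim
    if len(flat) != outer * channels:
        raise ValueError("flat length does not match shape")
    out = []
    for position in range(outer):          # every spatial position
        base = position * channels
        for channel in range(start, stop):  # every kept channel
            out.append(flat[base + channel])
    return out


shape = (3, 3, 2)             # same input shape as the minimal example
flat = list(range(18))        # 3 * 3 * 2 values, channels varying fastest
first = slice_last_axis(flat, shape, 0, 1)   # like x[..., :1]
rest = slice_last_axis(flat, shape, 1, 2)    # like x[..., 1:]
print(first)
print(rest)
```

The copy does no arithmetic, which is why a dedicated slicing stage adds latency without adding any computation.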

Also, if you are actually rearranging the data, as in this example, that can always be handled more efficiently by rewriting your model or by using a custom layer that does the rearrangement precisely the way you want.
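
As one possible kind of rewrite (an assumption about what might apply, not something proposed in this thread): slicing the output of a dense layer gives the same values as a narrower dense layer built from the matching weight columns and bias entries, so a slice that follows a dense layer can in principle be folded into the layer itself. The plain-Python sketch below checks this for unquantized weights; the `dense` helper and the numbers are made up for illustration, and per-tensor quantizers could behave differently once the weights are split.

```python
def dense(x, weights, bias):
    """Keras Dense convention: y[j] = sum_i x[i] * weights[i][j] + bias[j]."""
    return [
        sum(x[i] * weights[i][j] for i in range(len(x))) + bias[j]
        for j in range(len(bias))
    ]


x = [1.0, 2.0, 3.0]
weights = [[0.5, -1.0], [0.25, 2.0], [-0.75, 0.0]]  # 3 inputs, 2 units
bias = [0.125, -0.25]

# Original pattern: compute both units, then slice the first one out.
full = dense(x, weights, bias)
sliced_after = full[:1]

# Rewritten pattern: keep only the first weight column and bias entry.
weights_first = [row[:1] for row in weights]
bias_first = bias[:1]
sliced_before = dense(x, weights_first, bias_first)

print(full, sliced_after, sliced_before)
```

The same argument applies to slicing the filters of a convolution along its output-channel axis.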

So from a hardware-algorithm codesign perspective, there will always be a better way to do it than my simple generic implementation. So it's more complicated than it looks at first glance.

I'm happy to chat about it more, especially to find out more about your specific model's needs. I think there are two separate issues with probably two separate solutions.

(1) Supporting slicing in general (with potentially suboptimal performance)
(2) Supporting and getting good performance for your specific model

Maybe @vloncar has thoughts as well.

lgray commented 11 months ago

Ah this is already super useful! There's definitely the possibility for remediation on the model design. This is the first one we got to train well with a rather small number of parameters.

I'll send you an email to set up a chat with @rkovachfuentes and Jennet. We can walk you through what we're up to.