fastmachinelearning / hls4ml

Machine learning on FPGAs using HLS
https://fastmachinelearning.org/hls4ml
Apache License 2.0

PyTorch: merge layer with a constant input #1082

Open sei-jgwohlbier opened 1 week ago

sei-jgwohlbier commented 1 week ago

## Prerequisites

Please make sure to check off these prerequisites before submitting a bug report.

## Quick summary

Merge layers that have a constant input do not work with the PyTorch frontend.

## Details

### Steps to Reproduce

  1. Clone the hls4ml repository
  2. Check out the master branch at commit afed23b11d03855c0e31e34c4a1f0e3805d00d7f
  3. Run this test code.
    
```python
from pathlib import Path

import numpy as np
import os
import shutil
import torch
import torch.nn as nn
from torchinfo import summary

from hls4ml.converters import convert_from_pytorch_model
from hls4ml.utils.config import config_from_pytorch_model

test_root_path = Path(__file__).parent

class test(nn.Module):
    def __init__(self):
        super().__init__()

        self.downsample = nn.AvgPool1d(kernel_size=1, stride=2)

    def forward(self, x):
        d = self.downsample(x)
        p = torch.mul(d, 4.3)
        return torch.cat((d, p), dim=-1)

n_in = 2
size_in = 8
n_batch = 2

model = test()
io_type='io_stream'
backend='Vitis'
output_dir = str(test_root_path / f'hls4mlprj_mul_{backend}_{io_type}')
if os.path.exists(output_dir):
    print("delete project dir")
    shutil.rmtree(output_dir)

model.eval()
summary(model, input_size=(n_batch, n_in, size_in))

X_input = np.random.rand(n_batch, n_in, size_in)
with torch.no_grad():
    pytorch_prediction = model(torch.Tensor(X_input)).detach().numpy()

# transpose X_input (channels first) to channels-last layout for hls4ml
X_input_hls = np.ascontiguousarray(X_input.transpose(0, 2, 1))

# write tb data
ipf = "./tb_input_features.dat"
if os.path.isfile(ipf):
    os.remove(ipf)
np.savetxt(ipf, X_input_hls.flatten(), newline=" ")
opf = "./tb_output_predictions.dat"
if os.path.isfile(opf):
    os.remove(opf)
with open(opf, "ab") as f:
    for p in pytorch_prediction:
        np.savetxt(f, p.flatten(), newline=" ")

config = config_from_pytorch_model(model,
                                   (None, n_in, size_in),
                                   backend=backend,
                                   default_precision='ap_fixed<16,6>',
                                   channels_last_conversion='internal',
                                   transpose_outputs=False)
config['Model']['Strategy'] = 'Resource'
print(config)
print(output_dir)

hls_model = convert_from_pytorch_model(
    model,
    output_dir=output_dir,
    input_data_tb=ipf,
    output_data_tb=opf,
    backend=backend,
    hls_config=config,
    io_type=io_type,
    part='xcvu9p-flga2104-2-e'
)
hls_model.compile()

print("pytorch_prediction")
print(pytorch_prediction)
print("pytorch_prediction.shape: ", end=" ")
print(pytorch_prediction.shape)

# reshape hls prediction to channels last, then transpose, then reshape
# to match .view
hls_prediction = hls_model.predict(X_input_hls)
#hls_prediction = np.transpose(
#    np.reshape(hls_prediction,
#               (n_batch, int(size_in/2)+size_in, n_out)),
#    (0,2,1)
#)

print("hls_prediction")
print(hls_prediction)
print("hls_prediction.shape: ", end=" ")
print(hls_prediction.shape)

rtol = 1.0e-2
atol = 1.0e-2
assert len(pytorch_prediction) == len(hls_prediction), "length mismatch"
assert pytorch_prediction.shape == hls_prediction.shape, "shape mismatch"
for p, h in zip(pytorch_prediction, hls_prediction):
    np.testing.assert_allclose(p,
                               h,
                               rtol=rtol, atol=atol)

# synthesize
hls_model.build(csim=True, synth=True, cosim=True, validation=True)
```

### Expected behavior
Successful synthesis.

### Actual behavior

```
==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
test                                     [2, 2, 8]                 12
├─AvgPool1d: 1-1                         [2, 2, 4]                 --
==========================================================================================
Total params: 12
Trainable params: 12
Non-trainable params: 0
Total mult-adds (M): 0
==========================================================================================
Input size (MB): 0.00
Forward/backward pass size (MB): 0.00
Params size (MB): 0.00
Estimated Total Size (MB): 0.00
==========================================================================================
{'Model': {'Precision': 'ap_fixed<16,6>', 'ReuseFactor': 1, 'ChannelsLastConversion': 'internal', 'TransposeOutputs': False, 'Strategy': 'Resource'}, 'PytorchModel': test(
  (conv1): Conv1d(2, 2, kernel_size=(3,), stride=(1,), padding=(1,), bias=False)
  (downsample): AvgPool1d(kernel_size=(1,), stride=(2,), padding=(0,))
), 'InputShape': (None, 2, 8)}
/home/hls4ml-user/work/ewstapp_research/isolate/NETWORK/hls4mlprj_mul_Vitis_io_stream
Interpreting Model ...
Topology:
Layer name: downsample, layer type: AveragePooling1D, input shape: [[None, 2, 8]]
Layer name: mul, layer type: Merge, input shape: [[None, 2, 4]]
Layer name: cat, layer type: Concatenate, input shape: [[None, 2, 4], [None, 2, 4]]
Creating HLS model
WARNING: Changing pipeline style to "dataflow".
Traceback (most recent call last):
  File "/home/hls4ml-user/work/ewstapp_research/isolate/NETWORK/test_mul.py", line 81, in <module>
    hls_model = convert_from_pytorch_model(
  File "/home/hls4ml-user/miniconda3/envs/hls4ml/lib/python3.10/site-packages/hls4ml/converters/__init__.py", line 308, in convert_from_pytorch_model
    return pytorch_to_hls(config)
  File "/home/hls4ml-user/miniconda3/envs/hls4ml/lib/python3.10/site-packages/hls4ml/converters/pytorch_to_hls.py", line 374, in pytorch_to_hls
    hls_model = ModelGraph(config, layer_list, inputs=input_layers)
  File "/home/hls4ml-user/miniconda3/envs/hls4ml/lib/python3.10/site-packages/hls4ml/model/graph.py", line 387, in __init__
    self._make_graph(layer_list)
  File "/home/hls4ml-user/miniconda3/envs/hls4ml/lib/python3.10/site-packages/hls4ml/model/graph.py", line 416, in _make_graph
    self.graph[name] = self.make_node(kind, name, layer, inputs, outputs)
  File "/home/hls4ml-user/miniconda3/envs/hls4ml/lib/python3.10/site-packages/hls4ml/model/graph.py", line 503, in make_node
    node = layer_cls(self, name, attributes, inputs, outputs)
  File "/home/hls4ml-user/miniconda3/envs/hls4ml/lib/python3.10/site-packages/hls4ml/model/layers.py", line 117, in __init__
    self.initialize()
  File "/home/hls4ml-user/miniconda3/envs/hls4ml/lib/python3.10/site-packages/hls4ml/model/layers.py", line 950, in initialize
    assert len(self.inputs) == 2
AssertionError
```



## Optional

### Possible fix
I figured out that the main reason this happens is that the constant gets embedded in the `torch.fx` node representing the `mul` (see the trace sketch below). I have been working on a fix that looks for such constants and adds input-like layers to represent them. I have gotten to the point where the constant is represented as a layer, but I haven't yet been able to get the actual constant value through. I have to put this down for a few days, so I thought I'd post it to see if you think I'm on the right track. My [fork](https://github.com/sei-jgwohlbier/hls4ml/tree/pytorch/tensorconstant) is here if someone wants to have a look.
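For anyone reproducing this, here is a minimal standalone trace of the module from the report (plain `torch.fx`, no hls4ml involved). It shows the literal riding along inside the `mul` node's `args` instead of appearing as a `placeholder` input, which is why the converted Merge layer ends up with only one input:

```python
import torch
import torch.nn as nn
from torch.fx import symbolic_trace

class test(nn.Module):
    def __init__(self):
        super().__init__()
        self.downsample = nn.AvgPool1d(kernel_size=1, stride=2)

    def forward(self, x):
        d = self.downsample(x)
        p = torch.mul(d, 4.3)
        return torch.cat((d, p), dim=-1)

# trace the module and dump every node with its op kind and arguments
gm = symbolic_trace(test())
for node in gm.graph.nodes:
    print(node.op, node.name, node.args, node.kwargs)

# Output, roughly:
#   placeholder   x          ()                    {}
#   call_module   downsample (x,)                  {}
#   call_function mul        (downsample, 4.3)     {}   <- 4.3 is a plain literal
#                                                          in args, not a graph input
#   call_function cat        ((downsample, mul),)  {'dim': -1}
#   output        output     (cat,)                {}
```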
JanFSchulte commented 2 days ago

Hi! Thanks for testing this and working on a fix! I think your development is going in a promising direction. From a quick check of your fork, it seems like hls4ml is still treating the new Constant layer like an input layer and is trying to find its tensor at runtime. I'm not quite sure how best to fix it, but maybe @vloncar can advise.
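As a point of reference for that discussion, here is a rough sketch of the alternative: materialize the scalar into the layer dict itself so no runtime tensor lookup is needed. The helper name `handle_constant_arg`, the naming scheme, and the exact layer-dict keys are assumptions for illustration, not hls4ml's confirmed converter API:

```python
import numpy as np

def handle_constant_arg(node, layer_list):
    """Hypothetical helper: for a traced torch.fx node, turn scalar
    literals found in node.args into Constant-style layer dicts
    appended to layer_list, instead of input-like layers."""
    for i, arg in enumerate(node.args):
        if isinstance(arg, (int, float)):
            layer_list.append({
                'name': f'{node.name}_const{i}',  # assumed naming scheme
                'class_name': 'Constant',         # layer type carrying a value
                'value': np.array(arg),           # value stored as an attribute,
                                                  # not fetched at predict() time
            })
```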