fastmachinelearning / hls4ml

Machine learning on FPGAs using HLS
https://fastmachinelearning.org/hls4ml
Apache License 2.0

Compiling PyTorch LSTM with Hidden State Input #1074

Open vishal-chandra opened 1 month ago

vishal-chandra commented 1 month ago

Summary

Getting a C++ redefinition error on the hidden state input when compiling a PyTorch model with an LSTM layer.

Details

I have a model that contains an LSTM layer, and the hidden state for that layer is sometimes passed into model.forward (defaulting to None, matching the torch LSTM layer definition). When I remove this parameter from model.forward, the model compiles fine. With it included, however, the hidden state input variable shows up correctly in the ModelGraph inputs but is defined twice in the resulting C++ code, leading to a compilation error.

Relevant Code Snippet

```python
def forward(self, ..., hidden):
    ...
    lstm_out, hidden_out = self.lstm(lstm_input, hidden)
    ...
```
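
For context, a minimal module with the same structure looks like this (the layer sizes and names are illustrative, not the ones from my actual model):

```python
import torch.nn as nn

class LSTMModel(nn.Module):
    def __init__(self):
        super().__init__()
        # illustrative sizes only
        self.lstm = nn.LSTM(input_size=4, hidden_size=8, batch_first=True)
        self.dense = nn.Linear(8, 2)

    def forward(self, x, hidden=None):  # hidden defaults to None, as in nn.LSTM
        lstm_out, hidden_out = self.lstm(x, hidden)
        return self.dense(lstm_out[:, -1, :])
```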

Resulting Error

```
Writing HLS project
Done
In file included from firmware/myproject.cpp:3:
firmware/myproject.h:14:14: error: redefinition of parameter 'hidden'
    input3_t hidden[N_INPUT_1_3*N_INPUT_2_3*N_INPUT_3_3], result_t layer11_out[N_LAYER_11]
             ^
firmware/myproject.h:13:102: note: previous declaration is here
```

Steps to Reproduce

Using hls4ml 0.9.0.dev65+gafed23b1 and torch 2.6.0: create a simple model with an LSTM layer that takes the hidden state as a parameter to the forward function, then convert and compile the model, as sketched below.
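
A conversion script along these lines triggers the error (using the illustrative module above; the exact placement of the input shapes between config_from_pytorch_model and convert_from_pytorch_model may differ between hls4ml versions):

```python
import hls4ml

model = LSTMModel().eval()

# One shape per model input: the sequence plus the h_0/c_0 hidden-state inputs
# (shapes are illustrative and match the toy module above).
config = hls4ml.utils.config_from_pytorch_model(model, [(10, 4), (1, 8), (1, 8)])
hls_model = hls4ml.converters.convert_from_pytorch_model(
    model,
    hls_config=config,
    output_dir='hls_lstm_repro',
)
hls_model.compile()  # fails here with the 'redefinition of parameter' error
```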

Expected behavior

hidden is listed only once in ModelGraph.inputs and is defined only once in the generated C++ code.

Actual behavior

hidden is listed only once in ModelGraph.inputs but is defined twice in the generated C++ code, leading to a compilation error. I also notice that in the printed network topology, the LSTM layer is listed as having only one input.

JanFSchulte commented 1 month ago

Hi!

Thanks for using hls4ml and reporting this issue.

We currently do not support passing non-zero tensors as the initial hidden state of recurrent layers in PyTorch (see the description of the PR that introduced them: https://github.com/fastmachinelearning/hls4ml/pull/850). Unfortunately, we have not documented this limitation so far; that needs to be improved. Of course, this feature should also be properly supported, but we haven't managed to do that yet.

However, the code should still compile and produce results; those results will just be numerically incorrect. This test (https://github.com/fastmachinelearning/hls4ml/blob/main/test/pytest/test_recurrent_pytorch.py#L49-L91), for example, passes tensors of zeros and works fine. Once I use non-zero tensors the output is incorrect, but I can't reproduce your compilation issue. Could you share a minimal model and the code you use to invoke hls4ml?
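
To illustrate, here is a rough sketch along the lines of that test (the multi-input predict call and shapes are assumptions, not copied from the test, and hls_model is assumed to be converted as in your reproduction):

```python
import numpy as np

X = np.random.rand(1, 10, 4).astype(np.float32)

# Zero initial hidden/cell states match the assumption baked into hls4ml,
# so the HLS model agrees with PyTorch here.
h0 = np.zeros((1, 8), dtype=np.float32)
c0 = np.zeros((1, 8), dtype=np.float32)
hls_out = hls_model.predict([X, h0, c0])

# With non-zero initial states the passed values are effectively ignored,
# so the HLS output diverges from PyTorch.
h0_nz = np.random.rand(1, 8).astype(np.float32)
hls_out_nz = hls_model.predict([X, h0_nz, c0])
```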

vishal-chandra commented 2 weeks ago

Hi Jan,

Sorry I missed your reply. I actually also can't reproduce the compilation issue with a minimal LSTM-only model, so it must be something about how my larger model is constructed. I was just converting with convert_from_pytorch_model() and compiling from there.

I see that #1086 mentions a one-line fix to this issue in the PyTorch parser -- are hidden states now supported with PyTorch?

JanFSchulte commented 3 days ago

Hi! Sorry, I tagged this issue into that PR by mistake.

I just opened a PR that should address this issue by implementing the passing of initial values for the hidden and cell states in LSTMs (and GRUs): https://github.com/fastmachinelearning/hls4ml/pull/1120