vishal-chandra opened this issue 1 month ago (Open)
Hi!
Thanks for using hls4ml and reporting this issue.
We currently do not support passing non-zero tensors as the initial hidden state of recurrent layers in Pytorch (see the description of the PR that introduced them: https://github.com/fastmachinelearning/hls4ml/pull/850). Unfortunately, we have not documented this limitation so far; that needs to be improved. The feature should of course be properly supported as well, but we haven't managed to implement it yet.
However, the code should still compile and produce results; those results will just be numerically incorrect. This test (https://github.com/fastmachinelearning/hls4ml/blob/main/test/pytest/test_recurrent_pytorch.py#L49-L91), for example, passes tensors of zeros and works fine. Once I use non-zero tensors the output is incorrect, but I can't reproduce your compilation issue. Could you share a minimal model and the code you use to invoke hls4ml?
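To make the two cases concrete, here is a small sketch in plain Pytorch (shapes chosen arbitrarily) of what zero vs. non-zero initial states means:

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
x = torch.randn(1, 10, 8)  # (batch, sequence, features)

# Zero initial states: equivalent to passing no states at all.
# This is the case the current Pytorch parser handles correctly.
h0 = torch.zeros(1, 1, 16)  # (num_layers, batch, hidden_size)
c0 = torch.zeros(1, 1, 16)
out_zeros, _ = lstm(x, (h0, c0))
out_default, _ = lstm(x)
assert torch.allclose(out_zeros, out_default)

# Non-zero initial states: the converted model currently ignores
# these, so the hls4ml output diverges from Pytorch here.
out_nonzero, _ = lstm(x, (torch.randn(1, 1, 16), torch.randn(1, 1, 16)))
```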
Hi Jan,
Sorry I missed your reply. I actually also can't reproduce the compilation issue with a minimal LSTM-only model, so it must be something about how my larger model is constructed. I was just converting with convert_from_pytorch_model() and compiling from there.
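For reference, the invocation was roughly along these lines (a minimal sketch; exact signatures and input shapes vary between hls4ml versions and are assumptions here):

```python
import hls4ml

# 'model' is the full Pytorch model; the input shapes include the
# hidden and cell state tensors as extra inputs (placeholders here).
config = hls4ml.utils.config_from_pytorch_model(model, input_shape)
hls_model = hls4ml.converters.convert_from_pytorch_model(
    model, input_shape, hls_config=config, output_dir='hls_proj')
hls_model.compile()  # the C++ redefinition error occurs here
```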
I see that #1086 mentions a one-line fix to this issue in the pytorch parser -- are hidden states now supported with pytorch?
Hi! Sorry, I tagged this issue into that PR by mistake.
I just made a PR that should address this issue and implements the passing of initial values for the hidden and cell states in LSTMs (and GRUs): https://github.com/fastmachinelearning/hls4ml/pull/1120
Summary
Getting a C++ redefinition error on the hidden state input when compiling a pytorch model with an LSTM layer.
Details
I have a model that contains an LSTM layer, with the hidden state for that layer passed into model.forward some of the time (it defaults to None, matching the torch LSTM definition). When I remove this parameter from model.forward, the model compiles fine. With it included, the hidden state input variable shows up fine in the ModelGraph inputs, but it is defined twice in the resulting C++ code, leading to a compilation error.
Relevant Code Snippet
```python
def forward(self, ..., hidden):
    ...
    lstm_out, hidden_out = self.lstm(lstm_input, hidden)
    ...
```
Resulting Error
Steps to Reproduce
Working on hls4ml 0.9.0.dev65+gafed23b1 and torch 2.6.0. Create a simple model with an LSTM layer and the hidden state as a parameter to the forward function and compile the model.
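A minimal sketch of such a reproducer (class and variable names are illustrative, and the conversion signatures and shapes are assumptions that may need adjusting for your hls4ml version):

```python
import torch
import torch.nn as nn
import hls4ml

class LSTMWithState(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)

    def forward(self, x, hidden=None):
        # hidden = (h0, c0); defaults to None, as in torch's nn.LSTM
        lstm_out, hidden_out = self.lstm(x, hidden)
        return lstm_out

model = LSTMWithState().eval()

# The hidden and cell states appear as extra model inputs.
shapes = [(10, 8), (1, 16), (1, 16)]
config = hls4ml.utils.config_from_pytorch_model(model, shapes)
hls_model = hls4ml.converters.convert_from_pytorch_model(
    model, shapes, hls_config=config, output_dir='lstm_repro')
hls_model.compile()  # expected to fail with the redefinition error
```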
Expected behavior
Hidden is listed only once in ModelGraph.inputs and is defined only once in the C++ code.
Actual behavior
Hidden is listed only once in ModelGraph.inputs but is defined twice in the C++ code, leading to a compilation error. I also notice that in the printed network topology, the LSTM layer is listed as having only one input.