Open · nvssynthesis opened this issue 1 week ago
Thanks for bringing this up!
It seems like the simplest thing we could do would be to add an "index" parameter to the torch_helpers::loadGRU() and torch_helpers::loadLSTM() methods. I guess it would also be possible to create a kind of "wrapper" layer that contains several GRU/LSTM layers internally, to help streamline the process of using/loading weights for those types of layers, although that's a little outside the scope of what RTNeural has supported up to this point.
I'll have a think about ways to detect errors of this sort in the model-loading process... I'm sure there's more we can do, but I don't think it would be realistic to expect RTNeural to catch every mistake of this kind.
I'm not sure of the best way to describe this succinctly, but in PyTorch there are modules with a num_layers parameter, e.g. nn.GRU and nn.LSTM. However, the helpers in torch_helpers seem to assume that this parameter is set to 1.
For example, torch_helpers::loadGRU reads weights and biases whose layer suffix is always 0 (e.g. weight_ih_l0). When num_layers > 1, PyTorch adds further parameter sets with incrementing suffixes (weight_ih_l1, weight_ih_l2, ...), and those are never loaded. Handling this seems doable on the RTNeural end, unless there are extra complications I'm not aware of.
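To make the naming concrete, here's a small snippet (illustrative only) showing the parameter names PyTorch generates for a 2-layer GRU; only the _l0 entries match what the loader currently looks for:

```python
import torch

gru = torch.nn.GRU(input_size=1, hidden_size=8, num_layers=2)

# Each layer gets its own "_l<k>" suffix; gate dimension is 3 * hidden_size = 24.
for name, tensor in gru.state_dict().items():
    print(name, tuple(tensor.shape))

# weight_ih_l0 (24, 1)
# weight_hh_l0 (24, 8)
# bias_ih_l0   (24,)
# bias_hh_l0   (24,)
# weight_ih_l1 (24, 8)   <- second layer: suffix increments, input is now the hidden state
# weight_hh_l1 (24, 8)
# bias_ih_l1   (24,)
# bias_hh_l1   (24,)
```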
I believe I can work around this by setting up the network differently in PyTorch, manually stacking single-layer GRU modules if necessary. Still, this seems important to implement: I spent a few hours thinking everything was 'working' even though the model sounded really bad, and it turned out the PyTorch model had a whole layer's worth of weights and biases that had never been loaded into the RTNeural model.
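For reference, here's a rough sketch of that kind of workaround; the module and attribute names (StackedGRU, gru1, gru2) are just placeholders. Note that nn.Sequential can't chain nn.GRU modules directly, because each GRU returns an (output, hidden) tuple, so the sketch uses an explicit forward instead. Since every sub-layer is a single-layer nn.GRU, all of its parameters carry the _l0 suffix and can be exported and loaded one layer at a time:

```python
import torch
import torch.nn as nn

class StackedGRU(nn.Module):
    """Two single-layer GRUs stacked by hand (placeholder names)."""

    def __init__(self, input_size: int = 1, hidden_size: int = 8):
        super().__init__()
        self.gru1 = nn.GRU(input_size, hidden_size, num_layers=1, batch_first=True)
        self.gru2 = nn.GRU(hidden_size, hidden_size, num_layers=1, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # nn.GRU returns (output, hidden); only the output sequence is chained.
        x, _ = self.gru1(x)
        x, _ = self.gru2(x)
        return x

model = StackedGRU()
print(model(torch.randn(4, 100, 1)).shape)  # (batch, time, hidden) -> torch.Size([4, 100, 8])
```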