Closed ross-ca closed 7 months ago
Hello!
For using RTNeural with a specific model like this one, it should be possible to implement the model "by hand":
- [...] state_dict object to a regular JSON file (see here).
Now it would be nicer (though more difficult) to automate this process, so that we can "transform" the state_dict into an RTNeural model file like we're able to get from TensorFlow. The RTNeural model file needs to have the following things:
With that in mind, there are a few things about the state_dict that you shared which raise a few questions:
- The state_dict doesn't seem to show the "type" of each layer? Some layers are labelled "hidden" or "residual", but I'm not sure if those can be translated into things like "Dense" or "Conv1D". Maybe the layer type could be inferred from the shape of the weights?
- [...] state_dict. Maybe this network does not use any activations?
- [...] state_dict. For example, it seems like the input layer is second to last?
It's also worth mentioning that RTNeural currently only supports automated loading for sequential networks. More complex network architectures will require a "by hand" approach like the one mentioned above. If you have suggestions for improving the RTNeural model format, that would be welcome as well!
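As a rough illustration of the "by hand" export step and of the shape-based guess raised above, here is a minimal Python sketch. The helper names (state_dict_to_jsonable, guess_layer_type) are made up for this example and are not part of RTNeural or PyTorch. It assumes each value in the dictionary either has a tolist() method (as torch.Tensor and numpy.ndarray do) or is already a plain list. The rank-based guess is only a heuristic: in PyTorch a Linear weight is 2-D and a Conv1d weight is 3-D, but recurrent layers also store 2-D matrices, so the result would still need checking against the model definition.

```python
import json


def state_dict_to_jsonable(state_dict):
    """Convert a mapping of name -> tensor-like into plain nested lists."""
    out = {}
    for name, value in state_dict.items():
        # torch.Tensor and numpy.ndarray both provide tolist();
        # plain lists and numbers are passed through unchanged.
        out[name] = value.tolist() if hasattr(value, "tolist") else value
    return out


def guess_layer_type(weight):
    """Hypothetical heuristic: guess a layer type from the weight's rank."""
    rank = 0
    probe = weight
    while isinstance(probe, list):
        rank += 1
        probe = probe[0] if probe else None
    # 2-D -> dense, 3-D -> conv1d; anything else is left undecided.
    return {2: "dense", 3: "conv1d"}.get(rank, "unknown")


# Stand-in data so the sketch runs without PyTorch installed;
# with a real model this would be model.state_dict().
example = {"fc.weight": [[0.1, 0.2], [0.3, 0.4]], "fc.bias": [0.0, 0.0]}
exported = state_dict_to_jsonable(example)
text = json.dumps(exported, indent=2)
```

The resulting text can be written to a file and then parsed by a custom loading function on the C++ side.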
Hope this information is helpful!
Thank you so much for your reply Jatin, this is really helpful!
Would you be able to explain a little more about the structure of the RTNeural model JSON? How exactly should it be laid out?
Thanks again!
No problem!
For the RTNeural format, I would definitely suggest checking out this example model file, but the basic format goes:
{
    "in_shape": [
        null,
        null,
        1
    ],
    "layers": [
        {
            "type": "dense", // layer type goes here
            "activation": "tanh", // activation type goes here
            "shape": [
                null,
                null,
                8 // layer output shape goes here
            ],
            "weights": [
                ... // layer weights (including biases) go here
            ]
        },
        ... // the rest of the layers continue on here
    ]
}
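For anyone generating this layout from a script, a minimal Python sketch might look like the following. The field names ("in_shape", "layers", "type", "activation", "shape", "weights") come from the format above; the helper names make_layer and make_model, and the placeholder weight values, are assumptions made for illustration. Python's None is serialized as JSON null.

```python
import json


def make_layer(layer_type, activation, out_size, weights):
    """Build one entry of the "layers" array."""
    return {
        "type": layer_type,
        "activation": activation,
        "shape": [None, None, out_size],  # None becomes null in JSON
        "weights": weights,
    }


def make_model(in_size, layers):
    """Build the top-level model dictionary."""
    return {"in_shape": [None, None, in_size], "layers": layers}


# Placeholder weights for a dense layer with 1 input and 2 outputs,
# assuming the kernel comes first and the bias last.
kernel = [[0.5, -0.5]]
bias = [0.0, 0.0]

model = make_model(1, [make_layer("dense", "tanh", 2, [kernel, bias])])
model_json = json.dumps(model, indent=2)
```

Comparing the output against the example model file is the safest way to confirm that the weight arrays are nested the way the loader expects.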
That's perfect, thank you for your help.
In the example model file, I noticed that the final array of weights for each layer is separated from the rest. Is there a reason for this? Are these the bias values rather than the weights?
Thanks again!
The order of the weights for each layer is whatever the TensorFlow get_weights() method returns for a given layer, but yes, I think for most layers the last array ends up being the layer biases.
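One practical consequence when coming from PyTorch: a Keras Dense layer's get_weights() returns the kernel with shape (inputs, outputs) followed by the bias, whereas torch.nn.Linear stores its weight as (outputs, inputs). A minimal sketch of the reordering, assuming the weights have already been converted to nested Python lists (the function names are made up for this example):

```python
def transpose(matrix):
    """Transpose a 2-D nested list."""
    return [list(column) for column in zip(*matrix)]


def dense_weights_keras_order(torch_weight, torch_bias):
    """Return [kernel, bias] in the order Keras uses for a Dense layer.

    torch_weight is (outputs, inputs); the Keras kernel is (inputs, outputs).
    """
    return [transpose(torch_weight), list(torch_bias)]


# A layer with 3 inputs and 2 outputs, using placeholder values.
weights = dense_weights_keras_order([[1, 2, 3], [4, 5, 6]], [0.1, 0.2])
```

Other layer types (convolutional, recurrent) use different layouts in each framework, so each would need its own mapping.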
What should the JSON activation function be for layers with a gated activation function, as seen in WaveNet, for example?
RTNeural doesn't currently support gated activations, but it would be great to add that support. Are you familiar with how any existing framework implements gated activations? (For example, most of the layers currently implemented are based on the Keras implementation.)
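For reference, the gated activation described for WaveNet multiplies a tanh "filter" branch by a sigmoid "gate" branch, element by element. The sketch below is plain Python showing only that arithmetic. It is not an RTNeural or Keras API, and the function names are made up for this example.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gated_activation(filter_out, gate_out):
    """Element-wise tanh(filter) * sigmoid(gate)."""
    return [math.tanh(f) * sigmoid(g) for f, g in zip(filter_out, gate_out)]


# The two inputs would normally be the outputs of two separate
# convolutions applied to the same signal.
y = gated_activation([0.0, 1.0], [0.0, 0.0])
```

Because the layer needs two sets of weights feeding one output, it does not map onto a single "activation" string in the current JSON layout.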
Hmm, I'm not familiar. Do you have an email where we could speak further? Thanks! :)
Sure thing, feel free to email at jatin@ccrma.stanford.edu
I'm looking for an effective way to export a PyTorch Lightning model to the JSON format that RTNeural accepts. I've seen the previous post about using PyTorch models exported with TorchScript, but can't seem to get this to work.
Instead, the PyTorch state_dict dictionary appears to contain all the information required to build a JSON file in the format that RTNeural expects. I understand that this will involve writing some kind of parser, but I'm not sure how to go about this. Any help would be massively appreciated. An example of the dictionary returned after executing the state_dict() method on a model is attached.
Thanks!
state_dict_export.txt
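As a possible starting point for such a parser: state_dict keys are dotted names such as "module.weight" and "module.bias", so the entries can be grouped per layer by splitting on the last dot. The sketch below is a minimal Python example. The key names in the sample data are hypothetical and not taken from the attached file, and group_by_layer is a made-up helper name.

```python
def group_by_layer(state_dict):
    """Group state_dict entries by layer prefix.

    "hidden.0.weight" -> layer "hidden.0", parameter "weight".
    Insertion order is preserved, which matches the order in which
    the parameters appear in the state_dict.
    """
    layers = {}
    for key, value in state_dict.items():
        prefix, _, param = key.rpartition(".")
        layers.setdefault(prefix, {})[param] = value
    return layers


# Hypothetical keys and placeholder values, for illustration only.
sample = {
    "hidden.0.weight": [[1.0]],
    "hidden.0.bias": [0.0],
    "residual.weight": [[2.0]],
}
grouped = group_by_layer(sample)
```

Note that the order of entries reflects the order in which modules were registered, not necessarily the order in which data flows through the network, so the layer sequence would still need to be checked against the model's forward pass.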