jatinchowdhury18 / RTNeural

Real-time neural network inferencing

Residual Connection Supported? #126

Closed: yongyizang closed this issue 6 months ago

yongyizang commented 6 months ago

Hi! I want to ask: if I define a PyTorch model that contains a residual connection, such as:

residual = input                                        # save the input for the skip connection
input, hidden_states = self.gru(input, hidden_states)   # recurrent step
out = self.dense(input) + residual                      # residual add after the dense layer

Is this supported directly by RTNeural? Thanks.

jatinchowdhury18 commented 6 months ago

Hello! I guess the answer depends on what you mean by "directly".

If you're hoping to export a JSON configuration for your model and have RTNeural automatically load it and run inference, that won't work: RTNeural's automatic model loading only supports purely sequential models.
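For context, the automatic loading path uses the run-time API, roughly like this (a minimal sketch; "model.json" is a placeholder for an exported sequential model):

#include <RTNeural/RTNeural.h>
#include <fstream>

// Sketch of RTNeural's automatic (run-time API) model loading,
// which only handles purely sequential graphs.
std::ifstream jsonStream ("model.json", std::ifstream::binary); // placeholder path
auto model = RTNeural::json_parser::parseJson<float> (jsonStream);
model->reset();
float y = model->forward (&x); // x: a single input sample, assumed defined by the caller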

That said, implementing a model similar to the example you've provided is pretty straightforward in RTNeural. Assuming your model is 1-in/1-out, you could do something like this:

#include <RTNeural/RTNeural.h>

struct ResidualGRUModel
{
    static constexpr int hidden_size = 16;

    // Note: the data type is ModelT's first template argument.
    RTNeural::ModelT<float, 1, 1,
        RTNeural::GRULayerT<float, 1, hidden_size>,
        RTNeural::DenseT<float, hidden_size, 1>> model;

    void load_model (const nlohmann::json& state_dict) // "state_dict" JSON object exported from PyTorch
    {
        RTNeural::torch_helpers::loadGRU<float> (state_dict, "name of gru layer in PyTorch", model.get<0>());
        RTNeural::torch_helpers::loadDense<float> (state_dict, "name of dense layer in PyTorch", model.get<1>());
    }

    float process_sample (float x)
    {
        // Residual connection: add the dry input back onto the model output.
        return model.forward (&x) + x;
    }
};
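And a hedged sketch of how that struct might be used (the file name and audio buffer are placeholders):

std::ifstream jsonStream ("state_dict.json", std::ifstream::binary); // placeholder path
nlohmann::json state_dict;
jsonStream >> state_dict;

ResidualGRUModel res_model;
res_model.load_model (state_dict);
res_model.model.reset(); // clear the GRU state before processing

for (int n = 0; n < num_samples; ++n)                  // buffer/num_samples assumed from the caller
    buffer[n] = res_model.process_sample (buffer[n]);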

If your model has a different number of input/output dimensions, you may need to modify this approach a little (see the sketch below), but the general idea is the same.
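For instance, a rough sketch of a 2-in/2-out case (sizes made up): forward() still runs the whole model, and getOutputs() exposes the full output vector, so the residual add just becomes element-wise:

// model here would be e.g. RTNeural::ModelT<float, 2, 2, ...>
float out[2];
model.forward (input);                        // input: pointer to 2 samples
const float* model_out = model.getOutputs();
for (int i = 0; i < 2; ++i)
    out[i] = model_out[i] + input[i];         // element-wise residual add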

yongyizang commented 6 months ago

Sure! Thanks for your help.

If the architecture gets more complex, is there any way for me to access the intermediate output from a given layer within the model, then run input starting from another position within the model? If this is not possible, can you provide some sort of estimate on the added overhead for me to initialize multiple RTNeural models in order to realize this?

Thanks again!

jatinchowdhury18 commented 6 months ago

Sure thing!

If you're using RTNeural's compile-time API (recommended for most use cases), then each "layer" object has a field called outs which contains the output data for that layer. Please note that the data type of the outs field may differ depending on the backend being used and the size of the layer, so writing generic code against that field will probably take a bit of finesse.
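For example, tapping the GRU output in the model above might look like this (a sketch assuming the STL backend, where outs is a plain float array; with the XSIMD backend it holds SIMD vectors instead):

model.forward (&x);
const auto& gru_outs = model.get<0>().outs; // hidden_size values (STL backend assumed)
float h0 = gru_outs[0];                     // output of the first hidden unit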

If you're using the run-time API, then the layer input/output data is stored in the Model object, in a private member variable. If you need access to this data, we could probably figure out a safe way to expose the data for users of the Model class to access.

Initializing multiple RTNeural models could work as well, and might help to keep things a bit more organized in some cases.

The way I would think about it is that the overhead of using a model is that we need a way to get data in and out of the model. This usually requires some data to be copied into or out of the model's internal data fields. So if you're running 3 or 4 models separately, then you might be copying some data around in multiple places where you could otherwise use the data "in-place". If the amount of data being copied is very small, you likely won't be able to measure any performance difference, but the more data being copied, the more of a performance hit you're likely to take.
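To make the copying concrete, here's a minimal sketch of two compile-time models chained by hand (layer sizes made up); the copy out of one model and into the next is exactly the overhead being described:

RTNeural::ModelT<float, 1, 8, RTNeural::DenseT<float, 1, 8>> model_a;
RTNeural::ModelT<float, 8, 1, RTNeural::DenseT<float, 8, 1>> model_b;

float process_sample (float x)
{
    model_a.forward (&x);                     // data copied into model_a's input buffer
    const float* mid = model_a.getOutputs();  // ...and read back out
    return model_b.forward (mid);             // then copied into model_b
}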