wcshen closed this issue 3 months ago
Our current model latency is 152.27 ms with the 64-dimension LION-Mamba model on the Waymo dataset, using an NVIDIA 3090 GPU. The inference speed of the linear RNN still needs optimization.