robertknight / rten

ONNX neural network inference engine

Optimize copying of non-contiguous tensors with 5+ dimensions #409

Closed: robertknight closed this 2 days ago

robertknight commented 2 days ago

Improve the code path for tensors with 5+ dimensions in TensorBase::init_from. Previously, copying such a tensor fell back to slow element-by-element iteration via TensorBase::iter. Now the copy iterates over the inner 4-dim views of the source and handles each one with the faster code path that already exists for 4-dim tensors.
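As a rough illustration of the idea (not the actual rten code), the sketch below copies a strided view into a contiguous buffer by stepping through the outer dimensions and handing each inner 4-dim view to a nested-loop copy. The names `copy_4d` and `copy_strided` are hypothetical, and the sketch assumes a plain `f32` slice with strides measured in elements; the real `init_from` may differ in how it handles buffers and layouts.

```rust
/// Copy one strided 4-dim view, starting at `offset`, onto the end of `dst`.
/// Hypothetical helper: stands in for the "fast path" for 4-dim views.
fn copy_4d(src: &[f32], offset: usize, shape: [usize; 4], strides: [usize; 4], dst: &mut Vec<f32>) {
    for i0 in 0..shape[0] {
        for i1 in 0..shape[1] {
            for i2 in 0..shape[2] {
                for i3 in 0..shape[3] {
                    let idx = offset
                        + i0 * strides[0]
                        + i1 * strides[1]
                        + i2 * strides[2]
                        + i3 * strides[3];
                    dst.push(src[idx]);
                }
            }
        }
    }
}

/// Copy a strided view with 4 or more dims into a contiguous vector in
/// logical (row-major) order. Hypothetical function, strides are in elements.
fn copy_strided(src: &[f32], shape: &[usize], strides: &[usize]) -> Vec<f32> {
    assert_eq!(shape.len(), strides.len());
    assert!(shape.len() >= 4, "sketch only covers views with 4+ dims");

    let outer = shape.len() - 4;
    let inner_shape = [shape[outer], shape[outer + 1], shape[outer + 2], shape[outer + 3]];
    let inner_strides = [strides[outer], strides[outer + 1], strides[outer + 2], strides[outer + 3]];

    let len: usize = shape.iter().product();
    let mut dst = Vec::with_capacity(len);
    if len == 0 {
        return dst;
    }

    // Index into the outer dims, advanced like an odometer.
    let mut index = vec![0usize; outer];
    loop {
        // Offset of the current inner 4-dim view within `src`.
        let offset: usize = index
            .iter()
            .zip(&strides[..outer])
            .map(|(i, s)| i * s)
            .sum();
        copy_4d(src, offset, inner_shape, inner_strides, &mut dst);

        // Step to the next outer index; return once every view is copied.
        let mut dim = outer;
        loop {
            if dim == 0 {
                return dst;
            }
            dim -= 1;
            index[dim] += 1;
            if index[dim] < shape[dim] {
                break;
            }
            index[dim] = 0;
        }
    }
}
```

For example, calling `copy_strided(&src, &[2, 3, 1, 1, 2], &[1, 2, 2, 2, 6])` on a 12-element buffer treats it as a transposed 5-dim view: the single outer dimension selects two inner 4-dim views, and each is copied by `copy_4d`. The per-element index arithmetic for the outer dims runs once per inner view rather than once per element, which is where the saving over a generic N-dim iterator would be expected to come from.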

This was encountered while testing https://huggingface.co/briaai/RMBG-2.0.