ROCm / MIOpen

AMD's Machine Intelligence Library
https://rocm.docs.amd.com/projects/MIOpen/en/latest/

Forbid inefficient TensorDescriptor initialization #3393

Open CAHEK7 opened 1 week ago

CAHEK7 commented 1 week ago

Initializing TensorDescriptor from std::vector<int> is very inefficient: because std::vector<size_t> is used internally, every such initialization incurs extra checks and creates multiple intermediate vectors.

Changed all initializations to use the native size_t, removed the constructors taking std::vector<int>, and added workarounds for legacy descriptor initializations that still use ints.

This improved performance of the current RNN implementation by a few percent.
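To illustrate where the overhead comes from, here is a minimal sketch of the two initialization paths. It is not MIOpen's actual code: `SketchDescriptor` and `ToSizeT` are hypothetical names made up for this example, and the negative-value check is only an assumption about the kind of extra check an int-based path needs.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for a tensor descriptor; not MIOpen's actual class.
struct SketchDescriptor
{
    std::vector<std::size_t> lens;

    // Native path: the caller's vector is moved straight into storage,
    // with no per-element conversion.
    explicit SketchDescriptor(std::vector<std::size_t> lens_in) : lens(std::move(lens_in)) {}
};

// Legacy path (hypothetical helper): every int is range-checked and copied
// into a newly allocated std::vector<std::size_t> before the descriptor
// can be constructed.
inline std::vector<std::size_t> ToSizeT(const std::vector<int>& in)
{
    std::vector<std::size_t> out;
    out.reserve(in.size());
    for(int v : in)
    {
        if(v < 0)
            throw std::invalid_argument("negative tensor dimension");
        out.push_back(static_cast<std::size_t>(v));
    }
    return out;
}
```

With this shape, `SketchDescriptor(std::vector<std::size_t>{2, 3, 4})` needs no conversion, while a caller holding ints has to go through `SketchDescriptor(ToSizeT(dims))` and pay for the check and the extra allocation on every call. Keeping the conversion in an explicit helper makes that cost visible at the call site instead of hiding it in a constructor overload.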

CAHEK7 commented 1 week ago

The majority of tests rely heavily on vector<int>