hujinshui / bcslib

Automatically exported from code.google.com/p/bcslib

Fast matrix transposition #2

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Replace the current naive matrix transposition with a faster implementation.

- Specialize on fixed-size small matrices (up to 4 x 4)
- Use cache-oblivious block decomposition
- Should not affect the user-facing interface (except for providing an additional API for advanced users)
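
A minimal sketch of how the first two points could fit together, assuming dense row-major storage. Every name here (`sketch::transpose`, `transpose_rec`, `transpose_fixed`, `kBlockThreshold`) is hypothetical and not part of bcslib's actual API, and the cut-off of 16 is an untuned guess. The recursion halves the larger dimension until a block is small enough to transpose directly, and a 4 x 4 block is dispatched to a kernel with compile-time sizes:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {  // hypothetical namespace, not bcslib's

// Assumed cut-off below which a block is transposed directly; tune per platform.
constexpr std::size_t kBlockThreshold = 16;

// Fixed-size kernel: M and N are compile-time constants, so the compiler
// is free to fully unroll both loops.
template <typename T, std::size_t M, std::size_t N>
inline void transpose_fixed(const T* src, std::size_t src_ld,
                            T* dst, std::size_t dst_ld) {
    for (std::size_t i = 0; i < M; ++i)
        for (std::size_t j = 0; j < N; ++j)
            dst[j * dst_ld + i] = src[i * src_ld + j];
}

// Transposes a rows x cols row-major block of src (leading dimension src_ld)
// into a cols x rows row-major block of dst (leading dimension dst_ld).
template <typename T>
void transpose_rec(const T* src, std::size_t src_ld,
                   T* dst, std::size_t dst_ld,
                   std::size_t rows, std::size_t cols) {
    if (rows <= kBlockThreshold && cols <= kBlockThreshold) {
        if (rows == 4 && cols == 4) {
            transpose_fixed<T, 4, 4>(src, src_ld, dst, dst_ld);
            return;
        }
        for (std::size_t i = 0; i < rows; ++i)
            for (std::size_t j = 0; j < cols; ++j)
                dst[j * dst_ld + i] = src[i * src_ld + j];
        return;
    }
    if (rows >= cols) {
        // Split the rows: the lower half of src maps to the right half of dst.
        const std::size_t half = rows / 2;
        transpose_rec(src, src_ld, dst, dst_ld, half, cols);
        transpose_rec(src + half * src_ld, src_ld, dst + half, dst_ld,
                      rows - half, cols);
    } else {
        // Split the columns: the right half of src maps to the lower half of dst.
        const std::size_t half = cols / 2;
        transpose_rec(src, src_ld, dst, dst_ld, rows, half);
        transpose_rec(src + half, src_ld, dst + half * dst_ld, dst_ld,
                      rows, cols - half);
    }
}

// Convenience wrapper: the caller sees one function and no blocking details.
template <typename T>
std::vector<T> transpose(const std::vector<T>& a,
                         std::size_t rows, std::size_t cols) {
    assert(a.size() == rows * cols);
    std::vector<T> out(a.size());
    transpose_rec(a.data(), cols, out.data(), rows, rows, cols);
    return out;
}

}  // namespace sketch
```

Under these assumptions, a caller would only use `sketch::transpose(a, rows, cols)`, while `transpose_rec` with explicit leading dimensions is the kind of lower-level entry point that could be exposed to advanced users. A strictly cache-oblivious version would recurse down to very small blocks; the threshold here is a common practical compromise to limit recursion overhead.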

Original issue reported on code.google.com by linda...@gmail.com on 1 May 2012 at 2:43