Closed swdee closed 2 months ago
On the RK3588, using a lookup table to convert float16 to float32 cuts conversion time by about 35% (roughly 1.5x faster).
BenchmarkF16toF32NormalConversion-8 150 7872802 ns/op 1720348 B/op 1 allocs/op
BenchmarkF16toF32LookupConversion-8 218 5123550 ns/op 1720342 B/op 1 allocs/op
On a Threadripper workstation, the speedup is about 3.3x.
BenchmarkF16toF32NormalConversion-20 1302 916041 ns/op 1720322 B/op 1 allocs/op
BenchmarkF16toF32LookupConversion-20 3919 275437 ns/op 1720335 B/op 1 allocs/op