ermig1979 / Synet

A small framework to infer neural network
MIT License

Win32 VS x64 performance #14

Closed edward9112 closed 4 years ago

edward9112 commented 4 years ago

After some testing, I found that the x64 build performs substantially better than the Win32 build (about a 50% difference). The input was a standard 720p video, downscaled 4x.

Is this expected behavior? Is there any way to improve the Win32 performance?

ermig1979 commented 4 years ago

Only 8 vector registers (SSE, AVX, AVX-512) are available in 32-bit mode, instead of 16 (32 for AVX-512) in 64-bit mode. 8 registers are not enough to use the full performance of a modern CPU.
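To make the register limit concrete, here is a rough sketch of a register budget for a tiled multiply-accumulate micro-kernel. It is an illustration only and is not taken from Synet's code: the names `RegistersNeeded` and `MaxRows` are hypothetical, and the assumed layout (a `rows * cols` tile of accumulators, `cols` registers for loaded weights, and 1 register for the broadcast input) is a common pattern, not necessarily the one Synet uses.

```cpp
// Hypothetical register budget for a tiled micro-kernel (an assumption for
// illustration, not Synet's actual kernel layout):
//   rows * cols  accumulator registers,
//   cols         registers holding loaded weights,
//   1            register holding the broadcast input value.
constexpr int RegistersNeeded(int rows, int cols)
{
    return rows * cols + cols + 1;
}

// Largest number of accumulator rows that fits into `available` vector
// registers for a tile that is `cols` registers wide (0 if nothing fits).
constexpr int MaxRows(int available, int cols)
{
    return available > cols + 1 ? (available - cols - 1) / cols : 0;
}

// With a tile 2 registers wide:
static_assert(MaxRows(8, 2) == 2, "32-bit mode: 2x2 tile");
static_assert(MaxRows(16, 2) == 6, "64-bit mode, SSE/AVX: 6x2 tile");
static_assert(MaxRows(32, 2) == 14, "64-bit mode, AVX-512: 14x2 tile");
```

Under these assumptions, each inner-loop step of a 2x2 tile does 4 multiply-adds for 4 loads (2 weights, 2 inputs), while a 6x2 tile does 12 multiply-adds for 8 loads. The smaller tile forced by 8 registers spends relatively more time on memory traffic, which is consistent with the slower 32-bit build.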

edward9112 commented 4 years ago

Got it. Thank you for the explanation!