Today I tested NNPACK on an ARMv8 machine and found that when I increase the number of threads, the run time increases as well.
I am confused and not sure what the problem is. The script looks like this:
My machine has a NUMA architecture, but I am sure the 32 threads all run on the same node, so there should be no remote NUMA access.
How can I reduce the run time? Thanks!