Open CFAndy opened 10 years ago
I don't believe the OpenGL GEMM is fully complete, at least for non-float buffers like those in jetpac.ntwk. There is an alternative implementation, written in WebGL's slightly different shader language, in the JavaScript section.
If I force the buffers to load as fp32 and add the missing program->clearInputBuffers(); call, it works. It gives the same detection results as MKL on my HSW desktop. That's great.
Hi Pete,

Is the OpenGL acceleration path in libjpcnn workable? I tried it with jetpac.ntwk and dog.jpg, and it failed with the following error message. Can the issue be fixed?

Calling gl_gemm_fixed() m=96, n=3000, inputK=363
Compiling 16-bit shader
aDims=(96, 24, 1), bDims=(3000, 24, 1)
aDims=(96, 24, 1), bDims=(3000, 24, 1)
aDims=(96, 24, 1), bDims=(3000, 24, 1)
aDims=(96, 24, 1), bDims=(3000, 24, 1)
aDims=(96, 24, 1), bDims=(3000, 24, 1)
aDims=(96, 24, 1), bDims=(3000, 24, 1)
aDims=(96, 24, 1), bDims=(3000, 24, 1)
aDims=(96, 24, 1), bDims=(3000, 24, 1)
aDims=(96, 24, 1), bDims=(3000, 24, 1)
aDims=(96, 24, 1), bDims=(3000, 24, 1)
aDims=(96, 24, 1), bDims=(3000, 24, 1)
aDims=(96, 24, 1), bDims=(3000, 24, 1)
aDims=(96, 24, 1), bDims=(3000, 24, 1)
aDims=(96, 24, 1), bDims=(3000, 24, 1)
aDims=(96, 24, 1), bDims=(3000, 24, 1)
aDims=(96, 3, 1), bDims=(3000, 3, 1)
Assertion failed: (virtualWidth % elementsPerPixel) == 0, file c:\deepbeliefsdk-gh-pages\source\src\lib\opengl\glgemm.cpp, line 1005

The hardware is HSW integrated graphics.

Thanks!
-Andy
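
For anyone reading the log: the aDims lines look like inputK=363 being processed in slices of 24, which leaves a final slice of 3 (15 * 24 + 3 = 363), and the assertion fires right after that last slice. The sketch below only reproduces that arithmetic. The names splitK and packsEvenly are hypothetical, the slice size of 24 is inferred from the log rather than read from glgemm.cpp, and treating the slice width as the value being checked against elementsPerPixel is an assumption, not a confirmed diagnosis.

```cpp
#include <cassert>
#include <vector>

// Split the shared GEMM dimension k into slices of at most maxChunk elements.
// ASSUMPTION: maxChunk = 24 is inferred from the log, not taken from glgemm.cpp.
std::vector<int> splitK(int k, int maxChunk) {
  assert(k >= 0 && maxChunk > 0);
  std::vector<int> chunks;
  for (int start = 0; start < k; start += maxChunk) {
    const int remaining = k - start;
    chunks.push_back(remaining < maxChunk ? remaining : maxChunk);
  }
  return chunks;
}

// Hypothetical stand-in for the failing check: a slice packs cleanly into
// pixels only if its width is a whole multiple of elementsPerPixel.
// ASSUMPTION: the real code may test a different quantity (virtualWidth).
bool packsEvenly(int width, int elementsPerPixel) {
  assert(elementsPerPixel > 0);
  return (width % elementsPerPixel) == 0;
}
```

With these helpers, splitK(363, 24) yields fifteen slices of 24 followed by one slice of 3, matching the sixteen aDims lines in the log. A width of 3 is not a multiple of 2 or 4, so if elementsPerPixel is either of those values for 16-bit data, a divisibility check of this kind would reject the last slice while accepting the others.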