Open Aixile opened 6 years ago
Sorry for the late reply. I will investigate it.
This bug also occurs in train_mnist_chainer.py with constant_encoder_name="eightbit". The offset of a (non-constant) variable in WebGL is "-1", which causes an error in the encoder. Continuing to debug.
https://github.com/mil-tokyo/webdnn/blob/e6ab747b13d8ed6f9da2e78385bce10812f2be28/src/graph_transpiler/webdnn/encoder/constant_encoder_eightbit.py#L66
It broke in commit 56113b24.
Since this commit, train_mnist_chainer.py with constant_encoder_name="eightbit" raises an error in generate_descriptor on the WebGL backend.
There seem to be three different bugs! I solved one and found a workaround for another.
Problems:
constant_encoder_name="eightbit"
On WebGL, the size of a texture can differ from the size of the original variable because a texture has to be rectangular. The texture size is calculated as height * width, and both must be integers. Therefore, the texture size is rounded up, which can make the texture larger than the original variable. However, this is not taken into account in constant_encoder_eightbit.py.
Also, the classification of constants and variables was wrong. I pushed a temporary fix to the fix-816 branch (a686df1ec), so please try it to avoid this problem.
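To make the rounding concrete, here is a minimal sketch of how a flat variable could be padded when packed into a rectangular texture. The layout rule (fill rows up to a maximum width, then round the row count up) and the names texture_shape and padded_size are assumptions for illustration only, not WebDNN's actual allocation code.

```python
from typing import Tuple


def texture_shape(num_elements: int, max_texture_width: int) -> Tuple[int, int]:
    """Return (height, width) of a rectangular texture that can hold num_elements.

    Hypothetical layout: rows are at most max_texture_width wide and the
    number of rows is rounded up to an integer.
    """
    if num_elements <= 0 or max_texture_width <= 0:
        raise ValueError("sizes must be positive")
    width = min(num_elements, max_texture_width)
    # Integer ceiling division: height must be a whole number of rows.
    height = (num_elements + width - 1) // width
    return height, width


def padded_size(num_elements: int, max_texture_width: int) -> int:
    """Number of elements the texture actually occupies (height * width)."""
    height, width = texture_shape(num_elements, max_texture_width)
    return height * width


original = 10000
height, width = texture_shape(original, 4096)
padding = padded_size(original, 4096) - original
print(height, width, padding)
```

With these assumed numbers, a variable of 10000 elements needs 3 rows of 4096, so the texture holds 12288 elements and 2288 of them are padding. An encoder that assumes the buffer length equals the variable size would need to account for that difference.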
The weight file (weight_webgl_4096.bin) is unnaturally small.
$ ls -l models/resnet256
total 1206976
-rw-r--r-- 1 hidaka staff 37301 4 30 21:14 graph_webassembly.json
-rw-r--r-- 1 hidaka staff 3476587 4 30 21:14 graph_webgl_16384.json
-rw-r--r-- 1 hidaka staff 6124614 4 30 21:14 graph_webgl_4096.json
-rw-r--r-- 1 hidaka staff 4214513 4 30 21:14 graph_webgl_8192.json
-rw-r--r-- 1 hidaka staff 296498 4 30 21:02 graph_webgpu.json
-rw-r--r-- 1 hidaka wheel 106503 4 30 21:14 kernels_asmjs.js
-rw-r--r-- 1 hidaka staff 9748 4 30 21:14 kernels_asmjs.js.mem
-rw-r--r-- 1 hidaka staff 51407 4 30 21:14 kernels_webassembly.cpp
-rw-r--r-- 1 hidaka wheel 24125 4 30 21:14 kernels_webassembly.js
-rw-r--r-- 1 hidaka staff 56040 4 30 21:14 kernels_webassembly.wasm
-rw-r--r-- 1 hidaka staff 65574 4 30 21:02 kernels_webgpu.metal
-rw-r--r-- 1 hidaka staff 184662028 4 30 21:14 weight_webassembly.bin
-rw-r--r-- 1 hidaka staff 184662028 4 30 21:14 weight_webgl_16384.bin
-rw-r--r-- 1 hidaka staff 14792716 4 30 21:14 weight_webgl_4096.bin
-rw-r--r-- 1 hidaka staff 33667084 4 30 21:14 weight_webgl_8192.bin
-rw-r--r-- 1 hidaka staff 184662028 4 30 21:02 weight_webgpu.bin
I found that the graph descriptor for size 16384 works correctly. Currently, all devices load size 4096, so the workaround is
cp weight_webgl_16384.bin weight_webgl_4096.bin
cp graph_webgl_16384.json graph_webgl_4096.json
Of course, this does not work on devices that do not support texture size 16384.
With these two workarounds, I managed to get the WebGL + 8bit compression model to work on Chrome.
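The size discrepancy in the listing above can also be spotted programmatically. The following is a rough sketch, assuming that weight files for different backends should be of comparable size; the function name find_undersized_weights and the 0.5 threshold are arbitrary choices for illustration, not part of WebDNN.

```python
from typing import Dict, List


def find_undersized_weights(sizes: Dict[str, int], min_ratio: float = 0.5) -> List[str]:
    """Return names of files smaller than min_ratio times the largest file.

    Heuristic only: the threshold is an assumption, not a WebDNN rule.
    """
    if not sizes:
        return []
    largest = max(sizes.values())
    return sorted(name for name, size in sizes.items() if size < min_ratio * largest)


# Sizes in bytes, copied from the `ls -l models/resnet256` output above.
listing = {
    "weight_webassembly.bin": 184662028,
    "weight_webgl_16384.bin": 184662028,
    "weight_webgl_4096.bin": 14792716,
    "weight_webgl_8192.bin": 33667084,
    "weight_webgpu.bin": 184662028,
}

suspicious = find_undersized_weights(listing)
print(suspicious)
```

With this threshold the check flags weight_webgl_4096.bin and also weight_webgl_8192.bin. Whether the 8192 file is affected by the same problem is not established in this thread.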
I started to track these two problems in #820 and #821.
@milhidaka I re-implemented your patch in e06f903, with some extra comments. Please review it.
Code and the model for reproducing this can be found here. I am using webdnn at commit
f403a30da36b6741bc857c21c3ca1e65af8fbac9
For model conversion, please use
python convert_webdnn.py --chainer_model_path SmoothedGenerator_40000.npz --out models/resnet256
Also, there is a web interface in webcode/webdnn.
When I try to convert to WebGL with 8bit compression, I got
Expected:
Got:
Safari 11.0.3
This repo also contains a speed comparison with tensorflow.js. webdnn with WebGL is 1.5~2x faster than tfjs on my computer, except that it gives a wrong answer.