Add Unit Tests - Comprehensive

waylonflinn commented 8 years ago

More extensive set of unit tests, with a larger upper bound on runtime (< 5 min)

large matrices (> 1024)
larger range of numbers (0.0000001 - 1000000.0)

rreusser commented 8 years ago

This is great. I know it's just getting off the ground, but regarding the range of numbers, is double precision in the realm of possibility? I know it's fraught with caveats and platform dependence, but I'm curious whether it's possible. 7/10 tests succeed for me, so I'm also wondering whether the failures are a matter of precision, size limits, relation to powers of two, or something else.

waylonflinn commented 8 years ago

@rreusser thanks! I haven't found a way to do double precision yet, but I'm open to suggestions and PRs. Hopefully something that makes this easy will make it into the next version of WebGL.

Regarding the existing unit tests, it's hard to tell from the output right now when precision is the problem. The best way to test that, at the moment, is adding the RTOL parameter to allclose: https://github.com/waylonflinn/weblas/blob/master/test/gemmfloatcalculator.js#L56

It defaults to 1e-05. I'd try 1e-04 then 1e-03 to see if those values make a difference. Hopefully I can turn allclose into a real assertion function soon, so pinpointing things will be easier.
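For reference, here is a minimal sketch of what a NumPy-style closeness check with an RTOL parameter might look like. It is not the weblas test helper: the function signature, the `atol` default, and the returned `{ result, index }` shape are all assumptions made for illustration. Only the 1e-05 RTOL default comes from the comment above.

```javascript
// Hypothetical NumPy-style closeness check; NOT the weblas implementation.
// The rtol default (1e-5) is from the thread; the atol default is an assumption.
function allclose(actual, expected, rtol = 1e-5, atol = 1e-8) {
  if (actual.length !== expected.length) {
    return { result: false, index: -1 };
  }
  for (let i = 0; i < actual.length; i++) {
    // Allowed error grows with the magnitude of the expected value.
    const tolerance = atol + rtol * Math.abs(expected[i]);
    // Writing the check as a negated <= also rejects NaN entries.
    if (!(Math.abs(actual[i] - expected[i]) <= tolerance)) {
      return { result: false, index: i }; // first failing element
    }
  }
  return { result: true, index: -1 };
}

// Values copied from the failing test output posted later in this thread.
const expectedValues = [128.99998474121094, 128.34950256347656, 132.12049865722656];
const actualValues = [128.00001525878906, 128.3494873046875, 132.1204071044922];

// Try progressively looser relative tolerances.
for (const rtol of [1e-5, 1e-4, 1e-3]) {
  console.log(rtol, allclose(actualValues, expectedValues, rtol));
}
```

If a failure disappears as RTOL is loosened, precision is the likely cause. If it persists even at 1e-03, as the first element here does, something other than ordinary roundoff is probably involved.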

waylonflinn commented 8 years ago

@rreusser I've added a notice to the unit tests that appears when the hardware doesn't support 32-bit precision. I also created an issue for emulating higher precision where needed: #8

Looks like it should be possible to get 64-bit precision (even on hardware that only supports 16-bit!). Check it out and give it a :+1: , if it looks good to you.

waylonflinn commented 8 years ago

@rreusser I made some updates to the unit tests, to give more information about failures. Would you be willing to pull the latest and post your results here?

Thanks again for taking the time to check out my work! It's nice to know there are other people out there who might be interested. :smile: :tada:

rreusser commented 8 years ago

Ah, that's very helpful debug output! Yeah, it definitely looks like the limits of precision. All of the failed tests have about seven digits of precision, which is about what you'd expect from single precision. Computing the roundoff error probably comes down to something akin to num_ops * require('almost-equal').FLT_EPSILON, but I'm not sure that route is worth the trouble. I mean, if it has six digits of precision, it's unlikely something went horribly wrong…
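As a rough sketch of that estimate: the snippet below defines the single-precision epsilon directly (2^-23) rather than importing it, and counts operations per output element as k multiplies plus k - 1 adds for an inner dimension k. The function name and the operation count are assumptions for illustration, and the linear bound is a pessimistic rule of thumb, not a rigorous error analysis.

```javascript
// Machine epsilon for IEEE 754 single precision (2^-23, about 1.19e-7).
const FLT_EPSILON = Math.pow(2, -23);

// Hypothetical helper: rough worst-case relative roundoff for one element
// of an (m x k) times (k x n) product, assuming k multiplies and k - 1 adds.
function roundoffBound(k) {
  const numOps = 2 * k - 1;
  return numOps * FLT_EPSILON;
}

// Inner dimension used by several of the tests in this thread.
console.log(roundoffBound(513));
```

For k = 513 this comes out to roughly 1.2e-4, which is larger than a 1e-05 relative tolerance, so a few borderline failures from accumulated roundoff alone would not be surprising.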

My results:

TAP version 13
# 64x64 times 64x64
ok 1 should be allclose
# 256x256 times 256x256
ok 2 should be allclose
# 512x512 times 512x512
ok 3 should be allclose
# 1024x1024 times 1024x1024
ok 4 should be allclose
# 512x1024 times 1024x512
ok 5 should be allclose
# 247x513 times 513x831
ok 6 should be allclose
# 981x513 times 513x652
not ok 7 should be allclose at 609395
  ---
    operator: allclose
    expected: |-
      '[..., 128.99998474121094, 128.34950256347656, 132.12049865722656, 128.6906280517578, ...]'
    actual: |-
      '[..., 128.00001525878906, 128.3494873046875, 132.1204071044922, 128.690673828125, ...]'
    at: Test.assert [as _assert] (http://localhost:54056/__testling?show=true:14005:17)
  ...
# 962x513 times 513x218
ok 8 should be allclose
# 432x513 times 513x363
not ok 9 should be allclose at 152408
  ---
    operator: allclose
    expected: |-
      '[..., 128.99998474121094, 125.43488311767578, 131.1151885986328, 132.3745880126953, ...]'
    actual: |-
      '[..., 127.99996185302734, 125.43492126464844, 131.1151580810547, 132.3745574951172, ...]'
    at: Test.assert [as _assert] (http://localhost:54056/__testling?show=true:14005:17)
  ...
# 1024x513 times 513x1024
not ok 10 should be allclose at 896917
  ---
    operator: allclose
    expected: |-
      '[..., 128.99998474121094, 127.80162811279297, 128.7520751953125, 123.52313232421875, ...]'
    actual: |-
      '[..., 127.99992370605469, 127.80169677734375, 128.75204467773438, 123.52313995361328, ...]'
    at: Test.assert [as _assert] (http://localhost:54056/__testling?show=true:14005:17)
  ...
# 1567x513 times 513x522
ok 11 should be allclose

1..11
# tests 11
# pass  8
# fail  3

And an additional console warning for every test:

WebGL: drawElements: texture bound to texture unit 2 is not renderable. It maybe non-power-of-2 and have incompatible texture filtering or is not 'texture complete'. Or the texture is Float or Half Float type with linear filtering while OES_float_linear or OES_half_float_linear extension is not enabled.
__testling?show=true:109 ok 11 should be allclose

rreusser commented 8 years ago

(And yes, I was curious whether this could be relatively simply plugged into ndarray. I still need to sit down and think about whether the blas strides are complete enough to permit a generic ndarray.)

waylonflinn commented 8 years ago

Thanks for the test output!

I've been able to reproduce this bug in Chrome on OSX. It looks like most numbers are getting 5-7 digits of precision (as expected), but numbers near 129 are getting only 2-3 (128.99998474121094 vs 128.00001525878906 in the first failed test). The statistical structure of the test data makes this more likely to happen when the shared matrix dimension in the multiply is near 512. Generating the data again should give you a slightly different (but likely non-empty) set of failed tests.
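One way to put numbers on "digits of precision" is the negative base-10 logarithm of the relative error. The helper below is a hypothetical illustration, not part of the test suite, applied to two pairs of values from the first failed test.

```javascript
// Hypothetical helper: approximate number of matching significant digits,
// computed as -log10 of the relative error.
function matchingDigits(actual, expected) {
  const relativeError = Math.abs(actual - expected) / Math.abs(expected);
  if (relativeError === 0) return Infinity; // identical values
  return -Math.log10(relativeError);
}

// A typical pair from the failed test: close to 7 digits.
console.log(matchingDigits(128.3494873046875, 128.34950256347656));

// The suspicious pair near 129: only about 2 digits.
console.log(matchingDigits(128.00001525878906, 128.99998474121094));
```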

There's a special bit of code that encodes numbers into bytes on the GPU, for shipping back into JavaScript. This feels like a bug in that code. I'll work on that today and hopefully have a fix by tomorrow.
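To make the byte encoding concrete, here is a CPU-side sketch of the target layout, assuming the bytes follow the IEEE 754 single-precision format. It is not the shader code, which has to do the equivalent arithmetic on the GPU. The function names and the big-endian byte order (chosen for readability) are assumptions.

```javascript
// Hypothetical helpers showing the IEEE 754 single-precision byte layout.
// Big-endian order is used so the sign/exponent byte comes first.
function floatToBytes(value) {
  const bytes = new Uint8Array(4);
  new DataView(bytes.buffer).setFloat32(0, value, false);
  return bytes;
}

function bytesToFloat(bytes) {
  return new DataView(bytes.buffer, bytes.byteOffset, 4).getFloat32(0, false);
}

// The expected and actual values from the first failed test.
console.log(floatToBytes(128.99998474121094)); // bytes 67, 0, 255, 255
console.log(floatToBytes(128.00001525878906)); // bytes 67, 0, 0, 1

// Round trip: decoding recovers the single-precision value.
console.log(bytesToFloat(floatToBytes(128.99998474121094)));
```

Under this layout the two values share the sign/exponent byte and differ only in the lower mantissa bytes, which would be consistent with an error in how the mantissa is split into bytes, though that is only a guess from the outside.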

Thanks again for the test output! :+1:

waylonflinn commented 8 years ago

@rreusser Bug is fixed in my test environment. Can you run the tests with the latest master and let me know what happens?

Also, I'm happy to help integrate this with the scijs ndarray. Then I don't have to build a tensor library myself. :smile:

waylonflinn commented 8 years ago

More tests are almost always better.

1..369
# tests 369
# pass  369

# ok

This is enough for now. :sweat_smile:

waylonflinn / weblas

Add Unit Tests - Comprehensive #4