explosion / thinc

🔮 A refreshing functional take on deep learning, compatible with your favorite libraries
https://thinc.ai
MIT License
2.82k stars 275 forks source link

Hash function not working for Windows GPU #128

Closed svlandeg closed 4 years ago

svlandeg commented 4 years ago

Triggered by reported GPU regressions on Windows for spaCy, (see https://github.com/explosion/spaCy/issues/4816 and https://github.com/explosion/spaCy/issues/4758), I did some detailed analyses with thinc.

When I run the test suite on Windows with a GPU (CUDA 10.1) on either the latest pip release (7.3.1) or the current version from master (7.4.0.dev0), the hashing test test_hash_gives_distinct_keys[CupyOps] fails on assert array(0, dtype=uint32) != 0. It doesn't fail on Linux according to @adrianeboyd. On CPU (Windows), this test also just works and takes ids [0 0 0 0 0] and turns it into keys, e.g.

[[3428303819  685728695 3303603794 4065689084]
 [3428303819  685728695 3303603794 4065689084]
 [3428303819  685728695 3303603794 4065689084]
 [3428303819  685728695 3303603794 4065689084]
 [3428303819  685728695 3303603794 4065689084]]

but on GPU the generated keys are just

 [0 0 0 0]
 [0 0 0 0]
 [0 0 0 0]
 [0 0 0 0]]

This seems to be caused by https://github.com/explosion/thinc/blob/master/thinc/neural/_murmur3.cu and could be related to the changes from PR https://github.com/explosion/thinc/pull/117. I tried running with the version from thinc_gpu_ops (https://github.com/explosion/thinc_gpu_ops/blob/master/include/_murmur3.cu), but that gave the same results.

Then I wanted to test with the original code from https://github.com/aappleby/smhasher/blob/master/src/MurmurHash3.cpp which has a section for Microsoft Visual Studio support. But I couldn't get that to work because it needs to #include <stdlib.h> to get the implementation for _rotl64.

So now I'm wondering whether the GPU support for Windows ever did fully work, if that Windows-specific part was always missing from the Murmurhash implementation ? I want to point out that the regressions I got on Windows GPU, are when parsing with en_core_web_sm, but things seem to work fine with en_core_web_md. So maybe this specific case hadn't been tested before ?

@honnibal : you mention that

Including stdint.h is a pain in cupy, so just put the declarations in

But can it be done? So that we can also include stdlib.h for Windows ?

honnibal commented 4 years ago

Hmm! I've looked at the file and it's hard to see what could be happening that's Windows-specific. Odd.

For the stdint thing, I would just remove the include and put in the various type-declarations needed.

svlandeg commented 4 years ago

I think the main Windows-specific bit is the ROTL64 implementation. I tried all other "Windows-specific" code yesterday but nothing seemed to make a difference. But I couldn't get the actual _rotl64 from Visual Studio because I can't include the stdlib.h...

For the stdint thing, I would just remove the include and put in the various type-declarations needed.

Yep, that's what I did and it works. But I also tried making it work with the include just to see whether I could get include to work, if that makes sense :|

svlandeg commented 4 years ago

Hmm! I've looked at the file and it's hard to see what could be happening that's Windows-specific. Odd.

Finally found it: the int64_t and uint64_t are platform-specific :-)

Fixed in PR #149 :-)