etaler / Etaler

A flexible HTM (Hierarchical Temporal Memory) framework with full GPU support.
BSD 3-Clause "New" or "Revised" License
89 stars 14 forks

OpenCL backend producing different results from the CPU. #67

Closed marty1885 closed 4 years ago

marty1885 commented 5 years ago

As much as I tried to make sure OpenCL is doing the right thing in the unit tests, it seems that something in the OpenCL code is wrong and is generating weird-ish results (though it still works).

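marty1885 commented 5 years ago

// Header paths are assumed from the Etaler source tree layout.
#include <Etaler/Etaler.hpp>
#include <Etaler/Backends/OpenCLBackend.hpp>
#include <Etaler/Algorithms/SpatialPooler.hpp>
#include <Etaler/Encoders/Scalar.hpp>
#include <iostream>
#include <memory>
using namespace et;

int main()
{
    auto gpu = std::make_shared<OpenCLBackend>();
    SpatialPooler sp({128}, {32});
    // Copy the same Spatial Pooler to the OpenCL backend
    SpatialPooler gpu_sp = sp.to(gpu.get());

    // Encode the value 0.1 into a 128-bit SDR with 12 active bits
    Tensor x = encoder::scalar(0.1, 0, 1, 128, 12);

    std::cout << "CPU\n" << sp.compute(x) << std::endl;
    std::cout << "GPU\n" << gpu_sp.compute(x.to(gpu.get())) << std::endl;
}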

prints

CPU
{ 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0}
GPU
{ 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
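To make the difference between the two outputs easier to read, here is a minimal sketch that lists the indices where two SDRs disagree. It is plain C++ and does not use Etaler; `mismatched_bits` and `report_quoted_outputs` are hypothetical names introduced only for illustration, and the two vectors are copied from the output quoted above.

```cpp
#include <cstddef>
#include <iostream>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of Etaler): return the indices where two SDRs differ.
std::vector<std::size_t> mismatched_bits(const std::vector<int>& a, const std::vector<int>& b)
{
    if (a.size() != b.size())
        throw std::invalid_argument("SDRs must have the same length");

    std::vector<std::size_t> diff;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (a[i] != b[i])
            diff.push_back(i);
    return diff;
}

// Print every differing bit of the CPU and GPU outputs quoted in this issue.
void report_quoted_outputs()
{
    const std::vector<int> cpu = { 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0};
    const std::vector<int> gpu = { 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};

    for (std::size_t i : mismatched_bits(cpu, gpu))
        std::cout << "bit " << i << ": CPU=" << cpu[i] << " GPU=" << gpu[i] << "\n";
}
```

Calling `report_quoted_outputs()` from `main` prints one line per differing bit.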

I guess something is wrong in the global inhibition code?
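For context on that guess: the CPU output above has six active bits while the GPU output has only two, and global inhibition is the step that decides how many columns stay active. Below is a rough sketch of the general idea, assuming global inhibition simply keeps the `k` columns with the highest overlap scores. It is not Etaler's implementation; `global_inhibition`, the fixed `k`, and the tie-breaking rule (lower index wins) are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not Etaler's API): keep the k columns with the highest overlap.
std::vector<int> global_inhibition(const std::vector<int>& overlap, std::size_t k)
{
    std::vector<std::size_t> order(overlap.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    k = std::min(k, order.size());

    // Sort only the first k positions. Ties are broken by the lower index
    // (an assumption) so that the result is deterministic.
    std::partial_sort(order.begin(), order.begin() + k, order.end(),
        [&](std::size_t a, std::size_t b) {
            if (overlap[a] != overlap[b])
                return overlap[a] > overlap[b];
            return a < b;
        });

    // Mark the winners as active; columns with zero overlap never win.
    std::vector<int> active(overlap.size(), 0);
    for (std::size_t i = 0; i < k; ++i)
        if (overlap[order[i]] > 0)
            active[order[i]] = 1;
    return active;
}
```

With a deterministic rule like this, two backends fed the same overlap scores should produce the same set of active columns, so a difference in the number of active bits would point at either the overlap scores or the selection step.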

marty1885 commented 5 years ago

Hang on... There are more problems. I ran some of my research code on OpenCL, and it is giving me weird results.

Furthermore, when OpenCL is enabled in the API notebook, TM generates a curiously stable pattern. [image]

marty1885 commented 5 years ago

There's also the occasional problem that SP with OpenCL generates weird results:

[image: OpenCL]

Looks like a race condition, but I don't get why.

marty1885 commented 5 years ago

I'm totally unsure whether it is a GPU/Nvidia bug or it's in Etaler. I can't reliably trigger this behavior. But I'm sure the problem is in the cellActivity function.
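Since the behavior cannot be triggered reliably, one way to estimate how often it occurs is to repeat the same computation many times and count the runs that diverge from a known-good result. The sketch below is plain C++ and makes no Etaler calls; `count_divergent_runs` is a hypothetical helper, and `run` stands in for whatever produces the output under test (for example, a wrapper around the GPU computation whose result has been copied into a `std::vector<int>`).

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: call run() `trials` times and count how many results
// differ from the reference output.
std::size_t count_divergent_runs(const std::vector<int>& reference,
                                 const std::function<std::vector<int>()>& run,
                                 std::size_t trials)
{
    std::size_t divergent = 0;
    for (std::size_t i = 0; i < trials; ++i)
        if (run() != reference)
            ++divergent;
    return divergent;
}
```

A non-zero count for a computation that should be deterministic is consistent with a race condition, although it does not by itself tell a driver bug apart from a bug in the kernel.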

marty1885 commented 5 years ago

I have been trying for hours. The SP problem seems to happen only on my GTX 780Ti; I can't replicate it on my GTX 1050 or Intel iGPU.

From now on, GTX 700 series and earlier cards are officially unsupported.

marty1885 commented 4 years ago

OpenCL on my Intel iGPU shows no sign of the problem. Closing. Feel free to re-open the issue if you are seeing the behavior somewhere else.