ARM-software / ComputeLibrary

The Compute Library is a set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologies.
MIT License

How to change NEON to CL and an error about CL configure #156

Closed: ppplinday closed this issue 7 years ago

ppplinday commented 7 years ago

Hi,

I wrote a NEON program and everything works fine, so now I want to convert it to a CL program. I changed Tensor to CLTensor, NEConvolutionLayer to CLConvolutionLayer, and so on. Is that right?
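For reference, a minimal sketch of that mapping (assuming the library's public headers; the configure() arguments stay the same, only the tensor and function types change):

#include "arm_compute/runtime/Tensor.h"
#include "arm_compute/runtime/NEON/NEFunctions.h"
#include "arm_compute/runtime/CL/CLTensor.h"
#include "arm_compute/runtime/CL/CLFunctions.h"
#include "arm_compute/core/Types.h"

using namespace arm_compute;

// NEON (CPU) version ...
void configure_neon(Tensor &src, Tensor &weights, Tensor &biases, Tensor &dst)
{
    NEConvolutionLayer conv;
    conv.configure(&src, &weights, &biases, &dst, PadStrideInfo(1, 1, 1, 1));
}

// ... and the CL (GPU) counterpart: same call, different types.
void configure_cl(CLTensor &src, CLTensor &weights, CLTensor &biases, CLTensor &dst)
{
    CLConvolutionLayer conv;
    conv.configure(&src, &weights, &biases, &dst, PadStrideInfo(1, 1, 1, 1));
}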

And I have an issue. After making these changes the program builds but does not run. I checked and traced the problem to the configure function: because of it, my program cannot run. Something here confuses me, as you can see in the pictures.

In my IDE the hint (cl1) shows that the parameter is an ICLTensor, which is fine. After that, however, the parameters are shown as unknown (cl2). I do not know what is wrong in this situation; with NEON everything is OK. I think the library itself is fine because it builds successfully. Any help or hints are appreciated!!!

mpflanzer commented 7 years ago

Your overall approach sounds right. Can you please post the code that you are trying to run? The exact error message or console output would also be helpful.

I have no idea about the problem with your IDE, though I doubt it is related to the actual problem when running the program.

ppplinday commented 7 years ago

Hi @mpflanzer, this is my code:

#include "arm_compute/runtime/NEON/NEFunctions.h"
#include "arm_compute/runtime/CL/CLFunctions.h"

#include "arm_compute/core/Types.h"
#include "test_helpers/Utils.h"
#include <iostream>
#include <sstream>
#include <fstream>
#include <ostream>

using namespace arm_compute;
using namespace test_helpers;

static float StringToFloat(const std::string & str){
    std::istringstream iss(str);
    float number;
    iss >> number;
    return number;
}

void main1_neon_dnn(int argc, const char **argv)
{
    std::cout << "start!!!" << std::endl;

    unsigned int number5 = 5;
    unsigned int number3 = 3;
    unsigned int number64 = 64;
    TensorShape ishape(number5, number5, number3);
    TensorShape fcshape(number5, number5, number64);

    CLTensor itensor, fctensor;
    itensor.allocator()->init(TensorInfo(ishape, 1, DataType::F32));
    fctensor.allocator()->init(TensorInfo(fcshape, 1, DataType::F32));

    TensorShape weightsshape(number3, number3, number3, number64);
    TensorShape biasesshape(number64);
    CLTensor weights, biases;
    weights.allocator()->init(TensorInfo(weightsshape, 1, DataType::F32));
    biases.allocator()->init(TensorInfo(biasesshape, 1, DataType::F32));

    CLConvolutionLayer fc;
    std::cout << "here is fine!" << std::endl;
    fc.configure(&itensor, &weights, &biases, &fctensor, PadStrideInfo(1, 1, 1, 1));
    std::cout << "here is fine too!" << std::endl;

    itensor.allocator()->allocate();
    fctensor.allocator()->allocate();
    weights.allocator()->allocate();
    biases.allocator()->allocate();

    std::cout << "Fine!!!!!" << std::endl;
}
int main(int argc, const char **argv)
{
    return test_helpers::run_example(argc, argv, main1_neon_dnn);
}

It is a very simple example, and its output should look like this:

start!!!
here is fine!
here is fine too!
Fine!!!!!

However, when I run this program on my phone, the output is just:

start!!!
here is fine!

There is no error at all. As a result, I think something is wrong in my configure call; everything is fine when I write it with NEON.

ppplinday commented 7 years ago

Have you seen an issue like this before?

AnthonyBarbier commented 7 years ago

Could you please use a build with debug=1 or asserts=1 in order to have a more detailed error message?
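For reference, the library is built with scons, so a debug build would look something like scons debug=1 asserts=1 neon=1 opencl=1 os=android arch=armv7a; the os and arch values here are illustrative and depend on your target device.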

ppplinday commented 7 years ago

@AnthonyARM I already built with debug=1 and asserts=1; it still builds successfully but there is no error message at all. I have no idea. Could you try running my code on your machine? Thank you very much!

mpflanzer commented 7 years ago

Have you been able to run any of the CL examples we provide?

ppplinday commented 7 years ago

@mpflanzer I ran them and they work fine. That is what confuses me.

ppplinday commented 7 years ago

The CPU/GPU of my Android phone is a Qualcomm Snapdragon 625. Hope this helps.

AnthonyBarbier commented 7 years ago

That's probably where the issue comes from then: maybe your Qualcomm GPU doesn't support cl_arm_non_uniform_work_group_size.

GeorgeARM commented 7 years ago

@ppplinday check if your GPU supports OpenCL 2.0. If it does, pass "-cl-std=CL2.0" as a compile flag to the kernels; this should enable support for non-uniform work-groups.
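A quick way to check both things from code, as a hedged sketch using the standard OpenCL C++ bindings (the header name depends on your SDK; this is not part of the Compute Library API):

#include <CL/cl.hpp> // or <CL/cl2.hpp>, depending on the OpenCL SDK
#include <iostream>
#include <string>

int main()
{
    // Query the default OpenCL device for its version string and extension list.
    cl::Device device = cl::Device::getDefault();
    std::string version    = device.getInfo<CL_DEVICE_VERSION>();
    std::string extensions = device.getInfo<CL_DEVICE_EXTENSIONS>();

    std::cout << version << std::endl;
    std::cout << "cl_arm_non_uniform_work_group_size: "
              << (extensions.find("cl_arm_non_uniform_work_group_size") != std::string::npos ? "supported" : "not supported")
              << std::endl;
    return 0;
}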

ppplinday commented 7 years ago

@AnthonyARM @GeorgeARM Thanks for your help. My phone's GPU supports OpenCL 2.0, but I do not know how to pass "-cl-std=CL2.0" as a compile flag to the kernels. Could you tell me how to do it? Thank you very much!!!!

ppplinday commented 7 years ago

Were you able to run my code successfully on your phone?

AnthonyBarbier commented 7 years ago

Try replacing -cl-arm-non-uniform-work-group-size here.

ppplinday commented 7 years ago

@AnthonyARM I tried it and it does not work. Did you try running my code? Perhaps there is a bug in this library.

mpflanzer commented 7 years ago

You need to call CLScheduler::get().default_init() at the beginning. That will set up the CL runtime.
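Applied to the example above, a minimal sketch of the fix (only the default_init() call and its header are new; everything else is unchanged from the code posted earlier):

#include "arm_compute/runtime/CL/CLFunctions.h"
#include "arm_compute/runtime/CL/CLScheduler.h"
#include "arm_compute/core/Types.h"

using namespace arm_compute;

void main1_cl_dnn(int, const char **)
{
    // Initialise the OpenCL context, device and command queue before any
    // CL tensor or function is used; without this the configure() call fails.
    CLScheduler::get().default_init();

    CLTensor itensor, fctensor, weights, biases;
    itensor.allocator()->init(TensorInfo(TensorShape(5U, 5U, 3U), 1, DataType::F32));
    fctensor.allocator()->init(TensorInfo(TensorShape(5U, 5U, 64U), 1, DataType::F32));
    weights.allocator()->init(TensorInfo(TensorShape(3U, 3U, 3U, 64U), 1, DataType::F32));
    biases.allocator()->init(TensorInfo(TensorShape(64U), 1, DataType::F32));

    CLConvolutionLayer fc;
    fc.configure(&itensor, &weights, &biases, &fctensor, PadStrideInfo(1, 1, 1, 1));

    itensor.allocator()->allocate();
    fctensor.allocator()->allocate();
    weights.allocator()->allocate();
    biases.allocator()->allocate();
}

The rest of the program (the main entry via test_helpers::run_example) stays exactly as in the original post.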

ppplinday commented 7 years ago

@AnthonyARM @mpflanzer @GeorgeARM Thank you very much, it finally works!!!! Finally, I want to ask one more question. When I run another program with CL, it hits an error. The terminal shows:

terminate called after throwing an instance of 'cl::Error'
  what():  clCreateBuffer

And I am quite sure the problem is in conv_1_2.configure (I added some debug output: there is output before conv_1_2 but none after it):

conv_1_1.configure(&input, &weights_1_1, &biases_1_1, &out_1_1, PadStrideInfo(1, 1, 1, 1));
Nact_1_1.configure(&out_1_1, &act_1_1, ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::RELU));
conv_1_2.configure(&act_1_1, &weights_1_2, &biases_1_2, &out_1_2, PadStrideInfo(1, 1, 1, 1));
Nact_1_2.configure(&out_1_2, &act_1_2, ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::RELU));
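To get more detail out of that exception, one option is a small helper that reports the OpenCL error code; a hedged sketch using the standard OpenCL C++ bindings (report_cl_error is a hypothetical name, not a library function):

#define __CL_ENABLE_EXCEPTIONS // cl::Error needs exceptions enabled (cl2.hpp uses CL_HPP_ENABLE_EXCEPTIONS instead)
#include <CL/cl.hpp>
#include <iostream>

// Run one configure/allocate step and report the OpenCL error code if the
// C++ bindings throw; e.err() narrows down why clCreateBuffer failed.
template <typename Step>
void report_cl_error(Step &&step)
{
    try
    {
        step();
    }
    catch(const cl::Error &e)
    {
        std::cerr << e.what() << " failed with OpenCL error code " << e.err() << std::endl;
        throw;
    }
}

Wrapping the failing call, e.g. report_cl_error([&]{ conv_1_2.configure(&act_1_1, &weights_1_2, &biases_1_2, &out_1_2, PadStrideInfo(1, 1, 1, 1)); });, would print the numeric error, which distinguishes, say, an invalid buffer size from an out-of-memory condition.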

Any help or hints are appreciated!!! And thanks for your patience!!!!!