Closed: Realtyxxx closed this issue 1 year ago
Hi @Realtyxxx
This is a partially implemented feature which is not currently working.
We are going to look into removing it.
Hope this helps.
Seems like a cool feature. I used it to add an SVE kernel for the floor op. So although it is not finished, it should work? 😂
Hi @Realtyxxx
Good to hear you found it useful.
What I meant is that this is not really used in the library code as shown below:
user@host:~/work/acl/ComputeLibrary$ rgrep Preferred src/ ./arm_compute
src/cpu/ICpuKernel.h: Preferred, /**< Retrieve the best implementation available for the given Cpu ISA, ignoring the build flags */
src/cpu/ICpuKernel.h: if(uk.is_selected(selector) && (selection_type == KernelSelectionType::Preferred || uk.ukernel != nullptr))
This is a mechanism to test the kernel selection logic as it's only used in the cpu validation tests:
user@host:~/work/acl/ComputeLibrary$ rgrep Preferred tests/
tests/validation/NEON/ActivationLayer.cpp: const auto *selected_impl = CpuActivationKernel::get_implementation(ActivationDataTypeISASelectorData{data_type, cpu_isa,ActivationLayerInfo::ActivationFunction::BOUNDED_RELU}, cpu::KernelSelectionType::Preferred);
tests/validation/NEON/Floor.cpp: const auto *selected_impl = CpuFloorKernel::get_implementation(DataTypeISASelectorData{data_type, cpu_isa}, cpu::KernelSelectionType::Preferred);
tests/validation/NEON/GEMM.cpp: const auto *selected_impl_mul = CpuGemmMatrixMultiplyKernel::get_implementation(DataTypeISASelectorData{ data_type, cpu_isa }, cpu::KernelSelectionType::Preferred);
tests/validation/NEON/GEMM.cpp: const auto *selected_impl_add = CpuGemmMatrixAdditionKernel::get_implementation(DataTypeISASelectorData{ data_type, cpu_isa }, cpu::KernelSelectionType::Preferred);
tests/validation/NEON/DepthwiseConvolutionLayerNative.cpp: cpu::KernelSelectionType::Preferred );
Hope this helps.
Thank you !!! @morgolock
This helps a lot. Before yesterday I thought the kernel implementations only existed in src/cpu/kernels, but now I found that there are also implementations in core/NEON/kernels, so I may have to change my understanding of ACL's kernel implementations.
I also found find_implementation; is that the function that does this selection? Am I right?
So I have another question: what is the difference between the implementations in core/NEON/kernels and those in src/cpu/kernels?
By the way, are there any documents describing how to add an optimized kernel to ACL, or any docs to help understand ACL's call flow? I'm trying to understand how oneDNN works with ACL.
Thank you very much again!! It's an honour to have your help.
Hi @Realtyxxx
So I have another question: what is the difference between the implementations in core/NEON/kernels and those in src/cpu/kernels?
Originally in ACL, all the Neon kernels were in src/core/NEON/kernels. As part of an improvement of the API, most of the kernels were ported to src/cpu/kernels. Similarly, most of the old functions in src/runtime/NEON/functions/ call the new ones in src/cpu/operators/.
Are there any documents describing how to add an optimized kernel to ACL, or any docs to help understand ACL's call flow?
We have documentation about how to add a new operator; please refer to https://arm-software.github.io/ComputeLibrary/latest/adding_operator.xhtml
Just adding a new kernel to an existing operator is much simpler than creating a new operator. Please take a look at this patch adding new SVE kernels for the hardswish activation: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7729
Let us know if you have any questions.
Hope this helps.
Just adding a new kernel to an existing operator is much simpler than creating a new operator. Please take a look at this patch https://review.mlplatform.org/c/ml/ComputeLibrary/+/7729 adding new SVE kernels for the hardswish activation.
This really helps a lot. I think my problem is fixed! It's an honour to have your help.
Forgot one thing: I have done a little job adding the SVE kernel for the floor op. I know it is an easy job and also "learning code", so I'd like to know two things:
Hi @Realtyxxx
How can I test the accuracy of my kernel via ACL?
You can use arm_compute_validation to test the correctness of your new kernel. This program includes all the tests we use to check for correctness. The tests you need to run are in https://github.com/ARM-software/ComputeLibrary/blob/main/tests/validation/NEON/Floor.cpp . You must build the library with SVE enabled and validation_tests=1 test_filter=Floor.cpp. Then you just upload the binary arm_compute_validation to the device and execute it. Your device must support SVE, and you must make sure that NEFloor::get_implementation() is actually selecting your new kernel.
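The build-and-run flow described above might look like this on the command line. The validation_tests=1 and test_filter=Floor.cpp options come from the reply; the other scons flags and the device paths are assumptions to be adjusted for your toolchain and target:

```
# Build the library and the validation suite, compiling only the Floor tests
# (arch value is an assumption; pick one matching your SVE-capable target).
scons neon=1 arch=armv8.2-a-sve validation_tests=1 test_filter=Floor.cpp -j8

# Copy the test binary to the device and run it there.
scp build/tests/arm_compute_validation user@device:
ssh user@device ./arm_compute_validation
```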
Could I submit a pull request to get it merged?
We would be delighted to accept your patch contribution to ACL. Please note that we do not accept pull requests on GitHub, because development happens elsewhere; on GitHub we just publish snapshots of the development repository for each release.
For more information on how to submit a patch please read https://arm-software.github.io/ComputeLibrary/latest/contribution_guidelines.xhtml#S5_2_how_to_submit_a_patch
You need to submit your patch on https://review.mlplatform.org/ml/ComputeLibrary ; the patch will be reviewed by the team and merged to the main branch.
Let us know if you have any questions or require help.
Thank you very much, @morgolock!!! I'll let you know if I have any other questions!
https://github.com/ARM-software/ComputeLibrary/blob/aabef6c0584f06f4c0f4b61fb787d80374240619/src/cpu/ICpuKernel.h#L59