ARM-software / armnn

Arm NN ML Software. The code here is a read-only mirror of https://review.mlplatform.org/admin/repos/ml/armnn
https://developer.arm.com/products/processors/machine-learning/arm-nn
MIT License

ARMNN backend options usage #669

Closed abhajaswal closed 1 year ago

abhajaswal commented 1 year ago

I am using Arm NN 21.02.
I am trying to set the CreationOptions as below; I want to save the clcache.bin file:

    armnn::IRuntime::CreationOptions creation_options;
    std::string const filePathString = "/usr/share/dann/armnn_clcahae.bin";
    creation_options.m_BackendOptions.emplace_back(
        armnn::BackendOptions
        {
            "GpuAcc",
            {
                { "FastMathEnabled", true },
                { "SaveCachedNetwork", true },
                { "CachedNetworkFilePath", filePathString }
            }
        }
    );

When I pass this to Arm NN, the SaveCachedNetwork option does not work. However, when I hardcoded the values inside the Arm NN code in src/backends/cl/ClBackendModelContext.cpp:

    ClBackendModelContext::ClBackendModelContext(const ModelOptions& modelOptions)
        : m_CachedNetworkFilePath("/usr/share/dann/armnn_clcahae.bin"),
          m_IsFastMathEnabled(true),
          m_SaveCachedNetwork(true)

it saved the cache.bin.

Did I use creation_options.m_BackendOptions incorrectly?

catcor01 commented 1 year ago

Hello @abhajaswal,

It is difficult to know for sure from the small snippet of code you have provided why SaveCachedNetwork did not work. The way you have created the CreationOptions looks correct. However, without seeing the full code, I suspect that how you passed CreationOptions to Arm NN may be the problem. I would refer to this simple sample as a guideline on how to handle CreationOptions, specifically the following lines, where CreationOptions is passed into IRuntime::Create():

    // Create ArmNN runtime
    IRuntime::CreationOptions options; // default options
    IRuntimePtr run = IRuntime::Create(options);

Also, if there is no specific reason you are using Arm NN v21.02, I would recommend updating to the latest release, v22.05.

Kind Regards, Cathal.

abhajaswal commented 1 year ago

I use

    std::vector<armnn::BackendId> mAccelType;
    armnn::IRuntime::CreationOptions creation_options;
    std::string const filePathString = "/usr/share/dann/armnn_clcahae.bin";
    creation_options.m_BackendOptions.emplace_back(
        armnn::BackendOptions
        {
            "GpuAcc",
            {
                { "FastMathEnabled", true },
                { "SaveCachedNetwork", true },
                { "CachedNetworkFilePath", filePathString }
            }
        }
    );

    armnn::IRuntime *InferenceARMNN::sRuntime(nullptr);

    sRuntime = armnn::IRuntime::CreateRaw(creation_options);

    ret = CreateNetwork(model_paths, model_format);

    mAccelType.push_back(armnn::Compute::CpuRef);
    bool graph_debug = false;

    armnn::OptimizerOptions optimizerOptions(true, graph_debug);

    // Optimize the network for a specific runtime compute device, e.g. CpuAcc, GpuAcc
    armnn::IOptimizedNetworkPtr optimizedNet = armnn::Optimize(
        *mNetwork, mAccelType, sRuntime->GetDeviceSpec(), optimizerOptions);

Does optimizerOptions override the CreationOptions?

I saw that with this path, only this constructor for ClBackendModelContext is called. When I hardcoded values there, the code worked:

    ClBackendModelContext::ClBackendModelContext(const ModelOptions& modelOptions)
        : m_CachedNetworkFilePath("/usr/share/dann/armnn_clcahae.bin"),
          m_IsFastMathEnabled(true),
          m_SaveCachedNetwork(false)
    {
        printf("Before check case ABHA - %s %s %d modelOptions.empty() \n", __FILE__, __func__, modelOptions.empty());
        if (!modelOptions.empty())
        {
            ParseOptions(modelOptions, "GpuAcc", [&](std::string name, const BackendOptions::Var& value)
            {
                if (name == "FastMathEnabled")
                {
                    printf("ABHA - %s %s %d FastMathEnabled \n", __FILE__, __func__, __LINE__);
                    m_IsFastMathEnabled |= ParseBool(value, false);
                    std::cout << m_IsFastMathEnabled << std::endl;
                }
                if (name == "SaveCachedNetwork")
                {
                    printf("ABHA - %s %s %d SaveCachedNetwork \n", __FILE__, __func__, __LINE__);
                    m_SaveCachedNetwork |= ParseBool(value, false);
                    std::cout << m_SaveCachedNetwork << std::endl;
                }
                if (name == "CachedNetworkFilePath")
                {
                    printf("ABHA - %s %s %d CachedNetworkFilePath \n", __FILE__, __func__, __LINE__);
                    m_CachedNetworkFilePath = ParseFile(value, "");
                    std::cout << m_CachedNetworkFilePath << std::endl;
                }
            });
        }
        else
        {
            printf("Else case ABHA - %s %s %d modelOptions.empty() \n", __FILE__, __func__, modelOptions.empty());
        }
    }

abhajaswal commented 1 year ago

I added a few logs and the code flow looked like this:

    ClBackendContext called 0
    /home/abuild/rpmbuild/BUILD/armnn-21.02/include/armnn/BackendOptions.hpp ParseOptions 292 FastMathEnabled
    /home/abuild/rpmbuild/BUILD/armnn-21.02/include/armnn/BackendOptions.hpp ParseOptions 292 SaveCachedNetwork
    /home/abuild/rpmbuild/BUILD/armnn-21.02/include/armnn/BackendOptions.hpp ParseOptions 292 CachedNetworkFilePath
    Info: Initialization time: 10.61 ms

    Before check case ABHA - /home/abuild/rpmbuild/BUILD/armnn-21.02/src/backends/cl/ClBackendModelContext.cpp ClBackendModelContext 1 modelOptions.empty()
    Else case ABHA - /home/abuild/rpmbuild/BUILD/armnn-21.02/src/backends/cl/ClBackendModelContext.cpp ClBackendModelContext 1 modelOptions.empty()
    Warning: WARNING: Layer of type DetectionPostProcess is not supported on requested backend GpuAcc for input data type Float16 and output data type Float32 (reason: IsDetectionPostProcessSupported is not implemented [/home/abuild/rpmbuild/BUILD/armnn-21.02/src/backends/backendsCommon/LayerSupportBase.cpp:195]), falling back to the next backend.
    Warning: WARNING: Layer of type DetectionPostProcess is not supported on requested backend GpuAcc for input data type Float16 and output data type Float32 (reason: IsDetectionPostProcessSupported is not implemented [/home/abuild/rpmbuild/BUILD/armnn-21.02/src/backends/backendsCommon/LayerSupportBase.cpp:195]), falling back to the next backend.
    Before check case ABHA - /home/abuild/rpmbuild/BUILD/armnn-21.02/src/backends/cl/ClBackendModelContext.cpp ClBackendModelContext 1 modelOptions.empty()
    Else case ABHA - /home/abuild/rpmbuild/BUILD/armnn-21.02/src/backends/cl/ClBackendModelContext.cpp ClBackendModelContext 1 modelOptions.empty()
    Before check case ABHA - /home/abuild/rpmbuild/BUILD/armnn-21.02/src/backends/cl/ClBackendModelContext.cpp ClBackendModelContext 1 modelOptions.empty()
    Else case ABHA - /home/abuild/rpmbuild/BUILD/armnn-21.02/src/backends/cl/ClBackendModelContext.cpp ClBackendModelContext 1 modelOptions.empty()
    ABHA /home/abuild/rpmbuild/BUILD/armnn-21.02/src/backends/cl/ClWorkloadFactory.cpp InitializeCLCompileContext 129
    ABHA /home/abuild/rpmbuild/BUILD/armnn-21.02/src/backends/cl/ClWorkloadFactory.cpp AfterWorkloadsCreated 66
    ABHA /home/abuild/rpmbuild/BUILD/armnn-21.02/src/backends/cl/ClWorkloadFactory.cpp 70 check modelOptions->SaveCachedNetwork() 0

abhajaswal commented 1 year ago

I confirm that by using optimizer options I am able to save and load the CL cache files:

    std::vector<armnn::BackendId> mAccelType;
    armnn::IRuntime::CreationOptions creation_options;
    std::string const filePathString = "/usr/share/dann/armnn_clcahae.bin";
    creation_options.m_BackendOptions.emplace_back(
        armnn::BackendOptions
        {
            "GpuAcc",
            {
                { "FastMathEnabled", true },
                { "SaveCachedNetwork", true },
                { "CachedNetworkFilePath", filePathString }
            }
        }
    );

    armnn::IRuntime *InferenceARMNN::sRuntime(nullptr);

    sRuntime = armnn::IRuntime::CreateRaw(creation_options);

    ret = CreateNetwork(model_paths, model_format);

    mAccelType.push_back(armnn::Compute::CpuRef);
    bool graph_debug = false;

    armnn::OptimizerOptions optimizerOptions(true, graph_debug);

    armnn::BackendOptions gpuAcc("GpuAcc",
    {
        { "FastMathEnabled", true },
        { "SaveCachedNetwork", savefile },
        { "CachedNetworkFilePath", clcache_file_path }
    });
    // enable the GPU-specific CL cache save and load option
    optimizerOptions.m_ModelOptions.push_back(gpuAcc);

    armnn::IOptimizedNetworkPtr optimizedNet = armnn::Optimize(
        *mNetwork, mAccelType, sRuntime->GetDeviceSpec(), optimizerOptions);

If this is the way the options need to be set, then what is the use of armnn::IRuntime::CreationOptions creation_options?

abhajaswal commented 1 year ago

Apart from this, can you help with the difference between the CLTuner file and the files saved via SaveCachedNetwork?

keidav01 commented 1 year ago

Hey @abhajaswal

In short

CLTuner: Tuning the GPU kernels in the network from scratch can take a long time and considerably affect the execution time of the first run of your network. If you wish to tune the network, you can specify within BackendOptions the level of your tuning (Rapid (1), Normal (2), Exhaustive (3)) and the path you want the file to be output to.

When you want to run the network again with your CL tuning file, just specify the file and not the tuning level.

SaveCachedNetwork: This will save the network after the initial compilation of the OpenCL kernels. Using CachedNetworkFilePath afterwards removes the initial compilation of the OpenCL kernels and speeds up the first execution of the network. This does not optimize any CL parameters; it just removes a stage from the initialization flow.

Further information:

The OpenCL tuner, a.k.a. CLTuner, is a module of Arm Compute Library that can improve the performance of the OpenCL kernels by tuning the Local-Workgroup-Size (LWS). The optimal LWS for each unique OpenCL kernel configuration is stored in a table. This table can be imported from or exported to a file.

The OpenCL tuner uses a brute-force approach. It is recommended to keep the GPU frequency constant and to disable power management for the whole duration of this process, to avoid obtaining incorrect tuning results.

If you wish to know more about LWS and its important role in improving GPU cache utilization, we suggest having a look at the presentation "Even Faster CNNs: Exploring the New Class of Winograd Algorithms", available at the following link: https://www.embedded-vision.com/platinum-members/arm/embedded-vision-training/videos/pages/may-2018-embedded-vision-summit-iodice

In terms of SaveCachedNetwork: ACL used to have a single cache for everything, which made updating specific models very difficult and time-consuming. ACL has now decoupled this mechanism and added the ability for the user to specify the cache to use, so different caches can be used for different models. Arm NN exposes this mechanism and allows the user to pass in the cache to be used.

I hope this helps, I will close this ticket on Monday 10th October unless you need anything else in relation to this. Thank you!