Maratyszcza / NNPACK

Acceleration package for neural networks on multi-core CPUs
BSD 2-Clause "Simplified" License

How to use nnp_convolution in latest version? #75

Closed Igamegum closed 7 years ago

Igamegum commented 7 years ago

It's great to see NNPACK gaining so many commits; it looks like performance will improve. However, I noticed that some functions have changed, and I don't know the meaning of their parameters. Is there documentation somewhere?

I tried to use nnp_convolution_inference, but I cannot get the right result. I believe I am using it the wrong way, so I want to know how to use the new parameters and what they mean:

  1. transform_strategy
  2. output_subsampling
  3. workspace

By the way, why does the function have no stride_size parameter? Does nnp_convolution_inference only support a stride of one?

I hope you can help. Thanks!

Maratyszcza commented 7 years ago

Output subsampling is what most deep learning frameworks call convolution stride. Note that nnp_convolution_inference supports output subsampling only with the implicit GEMM algorithm (but you can specify the auto algorithm too).
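For example, here is a hedged sketch of how a stride-2 convolution would be expressed (struct and enum names as in nnpack.h; the values are placeholders):

```c
#include <nnpack.h>

/* A convolution with stride 2 in both dimensions is expressed through
 * output_subsampling. Pair it with nnp_convolution_algorithm_implicit_gemm
 * or nnp_convolution_algorithm_auto; the other algorithms only handle
 * subsampling of 1. */
const struct nnp_size output_subsampling = { .width = 2, .height = 2 };
```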

Maratyszcza commented 7 years ago

The workspace arguments let you pre-allocate the scratch buffers that the convolution implementation uses, avoiding the overhead of allocating and deallocating these buffers inside nnp_convolution_inference. This helps improve performance, especially for small convolutions. To do the pre-allocation (see the sketch after this list):

  1. Call nnp_convolution_inference with the same parameters that will later be used to compute the convolution (the input, kernel, output, and bias pointers may be NULL), with workspace_buffer = NULL and a non-NULL workspace_size. nnp_convolution_inference should return nnp_status_success and write the size of the scratch memory to *workspace_size.

  2. Allocate *workspace_size bytes of scratch memory with at least 64-byte alignment.

  3. Call nnp_convolution_inference again with the same parameters, but set workspace_buffer to the pointer to the allocated scratch buffer and *workspace_size to the size of the allocated scratch buffer.
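For concreteness, a minimal end-to-end sketch of this two-call pattern. The layer shape is a placeholder, and the call assumes the nnp_convolution_inference signature and enum names from a recent nnpack.h, so check them against your header version:

```c
#include <stdio.h>
#include <stdlib.h>

#include <nnpack.h>

int main(void) {
    if (nnp_initialize() != nnp_status_success) {
        fprintf(stderr, "NNPACK initialization failed\n");
        return EXIT_FAILURE;
    }

    /* Placeholder layer shape: 3x3 convolution with padding 1 and stride 1,
     * so the output plane has the same size as the input plane. */
    const size_t input_channels = 16;
    const size_t output_channels = 32;
    const struct nnp_size input_size = { .width = 56, .height = 56 };
    const struct nnp_padding input_padding = { .top = 1, .right = 1, .bottom = 1, .left = 1 };
    const struct nnp_size kernel_size = { .width = 3, .height = 3 };
    const struct nnp_size output_subsampling = { .width = 1, .height = 1 };
    const struct nnp_size output_size = input_size;

    /* Step 1: query the required scratch size. With workspace_buffer == NULL
     * and a valid workspace_size pointer, nothing is computed, and the data
     * pointers may all be NULL. */
    size_t workspace_size = 0;
    enum nnp_status status = nnp_convolution_inference(
        nnp_convolution_algorithm_auto,
        nnp_convolution_transform_strategy_compute,
        input_channels, output_channels,
        input_size, input_padding, kernel_size, output_subsampling,
        NULL, NULL, NULL, NULL,  /* input, kernel, bias, output */
        NULL, &workspace_size,   /* workspace_buffer, workspace_size */
        nnp_activation_identity, NULL,
        NULL,   /* threadpool: NULL runs single-threaded */
        NULL);  /* profile */
    if (status != nnp_status_success) {
        fprintf(stderr, "workspace size query failed: %d\n", (int) status);
        return EXIT_FAILURE;
    }
    printf("required workspace: %zu bytes\n", workspace_size);

    /* Step 2: allocate *workspace_size bytes, at least 64-byte aligned. */
    void* workspace_buffer = NULL;
    if (workspace_size != 0 &&
            posix_memalign(&workspace_buffer, 64, workspace_size) != 0) {
        return EXIT_FAILURE;
    }

    /* Step 3: run the convolution with the same parameters, now passing
     * real data pointers and the pre-allocated buffer. */
    float* input = calloc(input_channels * input_size.height * input_size.width, sizeof(float));
    float* kernel = calloc(output_channels * input_channels * kernel_size.height * kernel_size.width, sizeof(float));
    float* bias = calloc(output_channels, sizeof(float));
    float* output = calloc(output_channels * output_size.height * output_size.width, sizeof(float));
    status = nnp_convolution_inference(
        nnp_convolution_algorithm_auto,
        nnp_convolution_transform_strategy_compute,
        input_channels, output_channels,
        input_size, input_padding, kernel_size, output_subsampling,
        input, kernel, bias, output,
        workspace_buffer, &workspace_size,
        nnp_activation_identity, NULL,
        NULL, NULL);
    printf("convolution status: %d\n", (int) status);

    free(input);
    free(kernel);
    free(bias);
    free(output);
    free(workspace_buffer);
    return status == nnp_status_success ? EXIT_SUCCESS : EXIT_FAILURE;
}
```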

Igamegum commented 7 years ago

If I want to use workspace_buffer, how do I calculate workspace_size? For example, would workspace_size = input_size.width * input_size.height be right?

Maratyszcza commented 7 years ago

The first call to nnp_convolution_inference (with workspace_buffer = NULL) will write the workspace size to *workspace_size; you don't calculate it yourself.

Igamegum commented 7 years ago

It works, and it is really very useful. Thank you @Maratyszcza

Igamegum commented 7 years ago

I have seen the source code below:

```c
if (workspace_buffer == NULL) {
    if (workspace_size == NULL) {
        memory_block = allocate_memory(memory_size);
        if (memory_block == NULL) {
            return nnp_status_out_of_memory;
        }
    } else {
        *workspace_size = memory_size;
        return nnp_status_success;
    }
} else {
    if (*workspace_size < memory_size) {
        return nnp_status_insufficient_buffer;
    }
    memory_block = workspace_buffer;
}
```

It seems that only if I call nnp_convolution_inference with workspace_buffer = NULL and workspace_size = NULL will it allocate memory automatically, but then it will not write the workspace size to *workspace_size.

If I call nnp_convolution_inference with workspace_buffer = NULL and workspace_size != NULL, the program crashes.

What should I do to get the workspace size at run time?

Maratyszcza commented 7 years ago

Calling with workspace_buffer = NULL and workspace_size != NULL is exactly what gets you the workspace size; see https://github.com/Maratyszcza/NNPACK/blob/d02da48f0648f86906b6f0f515656dec7731b3aa/src/convolution-inference.c#L540. Note that:

  1. workspace_size must be a valid pointer to a size_t variable, which will be overwritten with the buffer size.
  2. nnp_convolution_inference will not compute anything in this call; it will only write the required buffer size to *workspace_size.
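A hedged sketch of a query helper that satisfies both points, again assuming the nnp_convolution_inference signature from a recent nnpack.h. The crash in the question is what typically happens when workspace_size does not point at a real size_t object:

```c
#include <stddef.h>

#include <nnpack.h>

/* Returns the scratch size NNPACK needs for the given (hypothetical)
 * convolution parameters, or 0 on failure. The key detail: workspace_size
 * is a real size_t variable, so &workspace_size is a valid pointer for
 * NNPACK to overwrite. */
static size_t query_workspace_size(
    enum nnp_convolution_algorithm algorithm,
    enum nnp_convolution_transform_strategy transform_strategy,
    size_t input_channels, size_t output_channels,
    struct nnp_size input_size, struct nnp_padding input_padding,
    struct nnp_size kernel_size, struct nnp_size output_subsampling)
{
    size_t workspace_size = 0;
    enum nnp_status status = nnp_convolution_inference(
        algorithm, transform_strategy,
        input_channels, output_channels,
        input_size, input_padding, kernel_size, output_subsampling,
        NULL, NULL, NULL, NULL, /* data pointers unused in the query call */
        NULL, &workspace_size,  /* buffer = NULL, size pointer valid */
        nnp_activation_identity, NULL,
        NULL, NULL);            /* threadpool, profile */
    return status == nnp_status_success ? workspace_size : 0;
}
```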

Igamegum commented 7 years ago

I followed your way, but why do I get *workspace_size = 0? I made sure input_size and output_size are greater than zero, and kernel_size.width = kernel_size.height = 1.

Igamegum commented 7 years ago

By the way, is nnp_convolution_inference not supported on Android?

Maratyszcza commented 7 years ago

If kernel_size.width = kernel_size.height = 1 and you specify nnp_convolution_algorithm_auto, NNPACK will use the direct convolution algorithm, which doesn't need any scratch memory. So *workspace_size = 0 is not a bug.

nnp_convolution_inference, like all other NNPACK functions, works on all supported platforms, including Android.