Closed Igamegum closed 7 years ago
Output subsampling is what most deep learning frameworks call convolution stride. Note that nnp_convolution_inference supports output subsampling only with the implicit GEMM algorithm (though you can also specify the auto algorithm).
The workspace arguments let you pre-allocate the scratch buffers used by the convolution implementation, avoiding the overhead of allocating/deallocating these buffers inside the nnp_convolution_inference function. This helps improve performance, especially for small convolutions. To do the pre-allocation:
1. Call nnp_convolution_inference with the same parameters that would later be used to compute the convolution (the input, kernel, output, and bias pointers can be NULL), with workspace_buffer = NULL and a non-NULL workspace_size. nnp_convolution_inference should return nnp_status_ok and write the size of the scratch memory to *workspace_size.
2. Allocate *workspace_size bytes of scratch memory with at least 64-byte alignment.
3. Call nnp_convolution_inference again with the same parameters, but set workspace_buffer to the allocated scratch buffer pointer and *workspace_size to the size of the allocated scratch buffer.
If I want to use workspace_buffer, how do I calculate workspace_size? For example, is it workspace_size = input_size.width * input_size.height?
The first call to nnp_convolution_inference (with workspace_buffer = NULL) will write the workspace size to *workspace_size.
It works, and it is really very useful, thank you @Maratyszcza
I have seen the source code below:
    if (workspace_buffer == NULL) {
        if (workspace_size == NULL) {
            memory_block = allocate_memory(memory_size);
            if (memory_block == NULL) {
                return nnp_status_out_of_memory;
            }
        } else {
            *workspace_size = memory_size;
            return nnp_status_success;
        }
    } else {
        if (*workspace_size < memory_size) {
            return nnp_status_insufficient_buffer;
        }
        memory_block = workspace_buffer;
    }
It seems that only when I call nnp_convolution_inference with workspace_buffer = NULL and workspace_size = NULL will it allocate memory automatically, and it will not write the workspace size to *workspace_size. If I call nnp_convolution_inference with workspace_buffer = NULL and workspace_size != NULL, the program crashes. What should I do to get the workspace size at runtime?
Calling with workspace_buffer = NULL and workspace_size != NULL is exactly what gets you the workspace size; see https://github.com/Maratyszcza/NNPACK/blob/d02da48f0648f86906b6f0f515656dec7731b3aa/src/convolution-inference.c#L540. Note that:

- workspace_size must be a valid pointer to a size_t variable, which will be overwritten with the buffer size.
- nnp_convolution_inference will not compute anything in this call; it only writes the required buffer size to *workspace_size.

I followed your way, but why do I get *workspace_size = 0? I made sure that input_size and output_size are greater than zero, and kernel_size.width = kernel_size.height = 1
By the way, does nnp_convolution_inference not support Android?
If kernel_size.width = kernel_size.height = 1 and you specify nnp_convolution_algorithm_auto, NNPACK will use the direct convolution algorithm, which doesn't need any scratch memory. So *workspace_size = 0 is not a bug.
nnp_convolution_inference, and all other NNPACK functions, work on all supported platforms, including Android.
Hmm, I am glad to see that NNPACK is getting many more commits; it seems performance will be better. But I notice that some functions have changed, and I don't know the meaning of their parameters. Is there documentation somewhere?
I tried to use nnp_convolution_inference, but I cannot get the right result. I believe I am using it the wrong way, so I want to know how to use the new parameters and what they mean.
I hope for your help, thanks!