Maratyszcza / NNPACK

Acceleration package for neural networks on multi-core CPUs
BSD 2-Clause "Simplified" License

NNPACK randomly crashes in specific case #153

Closed wnagchenghku closed 6 years ago

wnagchenghku commented 6 years ago

Dear NNPACK author,

We are trying to deploy NNPACK in our production environment, but it segfaults in the simple case below. Interestingly, the crash is intermittent: sometimes the program segfaults, and sometimes it runs without any error.

#include <nnpack.h>
#include <string.h>
#include <stdlib.h>
#include <malloc.h>
#include <stdio.h>
#include <assert.h>
#include <sys/time.h>

void conv_1(float *T0, float *T1, char * workspaceBuffer, size_t * workspaceSize) {
        float *bias = (float[64]){0};
        float *kernelData = (float[9408]){0};
        enum nnp_activation activation = nnp_activation_identity;
        enum nnp_convolution_algorithm algorithm = nnp_convolution_algorithm_auto;
        enum nnp_convolution_transform_strategy strategy = nnp_convolution_transform_strategy_compute;
        enum nnp_status status = nnp_convolution_inference(algorithm, strategy, 3, 64, (struct nnp_size){.width = 224, .height = 224}, (struct nnp_padding){.top = 3, .right = 3, .bottom = 3, .left = 3}, (struct nnp_size){.width = 7, .height = 7}, (struct nnp_size){.width = 2, .height = 2}, T0, kernelData, bias, T1, workspaceBuffer, workspaceSize, activation, NULL, NULL, NULL);
}

void forward(float *T0) {
        int sizes[4] = {1, 3, 224, 224};

        char workspaceBuffer[39096320];
        size_t workspaceSize = 39096320;
        float T1[802816];
        conv_1(T0, T1, workspaceBuffer, &workspaceSize);
        float T4[1 * 64 * 56 * 56];
        return;
}

int main(int argc, char const *argv[]) {
        enum nnp_status status = nnp_initialize();
        assert(status == nnp_status_success);
        float T0[1 * 3 * 224 * 224];
        for(int i = 0; i < 1 * 3 * 224 * 224; i++)
                T0[i] = 1;
        forward(T0);
        status = nnp_deinitialize();
        assert(status == nnp_status_success);
        return 0;
}

Some information about our environment:

We ran it under gdb, and it segfaults in nnp_sgemm_only_4x24__fma3(). Since the PeachPy-generated object files contain no debug symbols, we could not dig any further.

To run this simple example, we set ulimit -s unlimited to remove the stack size limit, because the arrays are allocated on the stack.

Please let us know if you need any more information to fix it.

Many thanks for your time.

wnagchenghku commented 6 years ago

Update:

I tested this code with NNPACK on Debian 9 (gcc 6.3, kernel 4.9) and still get the same error. Is anything wrong with this code?

wnagchenghku commented 6 years ago

I think I have found the problem: the allocated workspace size is not aligned.
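
For readers who hit the same crash, here is a minimal sketch of one way to get an aligned workspace. It assumes the requirement is 64-byte alignment of the workspace buffer, which is how I recall the comment on workspace_buffer in nnpack.h; please check the header of your NNPACK version. Note that 39096320 is already a multiple of 64, so one plausible reading is that the address of the stack array is what ends up misaligned, which would also explain why the crash is intermittent. The names WORKSPACE_ALIGNMENT, round_up_to_alignment, alloc_workspace and is_workspace_aligned are hypothetical helpers for illustration and are not part of NNPACK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: NNPACK wants the workspace buffer aligned on 64 bytes. */
#define WORKSPACE_ALIGNMENT 64

/* Hypothetical helper: round size up to the next multiple of the alignment. */
size_t round_up_to_alignment(size_t size) {
    return (size + WORKSPACE_ALIGNMENT - 1) / WORKSPACE_ALIGNMENT * WORKSPACE_ALIGNMENT;
}

/* Hypothetical helper: returns nonzero if the address is 64-byte aligned. */
int is_workspace_aligned(const void *pointer) {
    return ((uintptr_t)pointer % WORKSPACE_ALIGNMENT) == 0;
}

/*
 * Hypothetical helper: allocate the workspace on the heap with an aligned
 * address instead of using a plain char array on the stack.
 * C11 aligned_alloc expects the size to be a multiple of the alignment,
 * so the size is rounded up first. Returns NULL on failure; release the
 * buffer with free().
 */
void *alloc_workspace(size_t size) {
    void *buffer = aligned_alloc(WORKSPACE_ALIGNMENT, round_up_to_alignment(size));
    if (buffer != NULL) {
        assert(is_workspace_aligned(buffer));
    }
    return buffer;
}
```

In the example above, forward() would then call alloc_workspace(workspaceSize), pass the returned pointer as workspaceBuffer to conv_1(), and free() it afterwards. As far as I recall from the header comments, passing NULL for both the workspace buffer and the workspace size lets NNPACK allocate and release its own scratch memory, which avoids the question entirely at the cost of an allocation per call.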