AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

problems of the training of classification #3682

Open lunasdejavu opened 5 years ago

lunasdejavu commented 5 years ago

My environment: Ubuntu 16.04, CUDA 10.0, GPU NVIDIA GTX 1080 Ti.

        total   used   free   shared   buff/cache   available
Mem:    125G    8.5G   13G    827M     104G         115G
Swap:   975M    62M    913M

I am using darknet19 to classify cropped license plate images into 8 format classes; for example, ab-1234 is class 2-4 and ABC-123 is class 3-3 (sample: https://imgur.com/zladffy). YOLOv2 detects these perfectly when I treat the whole image as the bounding box, but classification with ./darknet classifier train fails. I used darknet19.cfg with flip=0 and filters=8, and I modified the code slightly to read the class from the txt label file:

void fill_truth_class(char *path, char **labels, int k, float *truth)
{
    int id;
    //1.generate txt path
    char classpath[4096];
    find_replace(path, ".jpg", ".txt", classpath);
    //2.open txt file
    FILE *file = fopen(classpath, "r");
    if(!file) file_error(classpath);
    float x, y, h, w;
    while (fscanf(file, "%d %f %f %f %f", &id, &x, &y, &h, &w) == 5) {
        //3.read the class of the license plate
    }
    fclose(file);
    memset(truth, 0, k*sizeof(float));
    truth[id] = 1;
}
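One detail in the function above is worth ruling out: if fscanf never matches a full line, id stays uninitialized, and truth[id] can then write outside the array. A minimal defensive sketch follows, assuming each label file holds a single "id x y h w" line. The helper names read_class_id and fill_one_hot are hypothetical and not part of darknet, and unlike the loop above, this version reads only the first line.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of darknet): read the class id from the
 * first "id x y h w" line of a label file.
 * Returns the id, or -1 if the file is missing, malformed, or the id is
 * outside [0, k). */
static int read_class_id(const char *label_path, int k)
{
    FILE *file = fopen(label_path, "r");
    if (!file) return -1;

    int id = -1;
    float x, y, h, w;
    int fields = fscanf(file, "%d %f %f %f %f", &id, &x, &y, &h, &w);
    fclose(file);

    if (fields != 5 || id < 0 || id >= k) return -1;
    return id;
}

/* Fill a one-hot vector of length k. Returns 0 on success, -1 otherwise.
 * On failure the vector is left all zeros instead of writing out of bounds. */
static int fill_one_hot(const char *label_path, int k, float *truth)
{
    memset(truth, 0, k * sizeof(float));
    int id = read_class_id(label_path, k);
    if (id < 0) return -1;
    truth[id] = 1.0f;
    return 0;
}
```

Calling fill_one_hot(classpath, k, truth) and logging the path whenever it returns -1 would show whether any label file is empty, malformed, or carries an id outside the 8 classes.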

I printed the truth variable and it looks correct, but during training the loss drops slightly and then stops decreasing almost immediately. Lowering the learning rate makes no difference. The results are all wrong: the predicted class id is always 1. Since the labels seem fine, I tried disabling the augmentation in this function:

matrix load_image_augment_paths(char **paths, int n, int use_flip, int min, int max, int size, float angle, float aspect, float hue, float saturation, float exposure)
{
    int i;
    matrix X;
    X.rows = n;
    X.vals = (float**)calloc(X.rows, sizeof(float*));
    X.cols = 0;

    for(i = 0; i < n; ++i){
        printf("path[%d]:%s\n",i,paths[i]);
        fflush(stdout);
        image im = load_image_color(paths[i], 0, 0);
        image crop = random_augment_image(im, angle, aspect, min, max, size);
        //image crop = copy_image(im);
        int flip = use_flip ? random_gen() % 2 : 0;
        if (flip)
            flip_image(crop);
        random_distort_image(crop, hue, saturation, exposure);

        /*
        show_image(im, "orig");
        show_image(crop, "crop");
        cvWaitKey(0);

        */

The augmentation parameters passed to random_augment_image(im, angle, aspect, min, max, size) are: use_flip: 0, min: 256, max: 512, size: 256, angle: 0.000000, aspect: 1.000000.

I still tried bypassing it, either by using just image crop = load_image_color(paths[i], 0, 0); or by returning im directly from random_augment_image(im, angle, aspect, min, max, size); in both cases training stopped with a Segmentation fault after a few iterations.

I added print statements:

matrix load_image_augment_paths(char **paths, int n, int use_flip, int min, int max, int size, float angle, float aspect, float hue, float saturation, float exposure)
{
    int i;
    matrix X;
    X.rows = n;
    X.vals = (float**)calloc(X.rows, sizeof(float*));
    X.cols = 0;
    for(i = 0; i < n; ++i){
        printf("path[%d]:%s\n",i,paths[i]);
        fflush(stdout);
        image im = load_image_color(paths[i], 0, 0);
        image crop = random_augment_image(im, angle, aspect, min, max, size);
        //image crop = copy_image(im);
        int flip = use_flip ? random_gen() % 2 : 0;
        if (flip)
            flip_image(crop);
        random_distort_image(crop, hue, saturation, exposure);

        /*
        show_image(im, "orig");
        show_image(crop, "crop");
        cvWaitKey(0);

        */
        printf("Ocrop.h:%d\n",crop.h);

        fflush(stdout);
        free_image(im);
        X.vals[i] = crop.data;
        printf("Acrop.h:%d\n",crop.h);
        fflush(stdout);
        X.cols = crop.h*crop.w*crop.c;
    }
    return X;
}

The output was:

Loading weights from darknet19_448.conv.23...
 seen 32
Done!
Learning Rate: 0.0005, Momentum: 0.9, Decay: 0.0005
path[0]:/data/datasets/fake_plate/ChePai_new_train/CU595.jpg
path[0]:/data/datasets/fake_plate/ChePai_new_train/VGV1677.jpg
.
.
.
.
path[0]:/data/datasets/fake_plate/ChePai_old_train/9RY6.jpg
Ocrop.h:84
Acrop.h:84
path[1]:/data/datasets/fake_plate/ChePai_new_train/AQG503.jpg
Ocrop.h:84
Acrop.h:84
path[1]:/data/datasets/fake_plate/ChePai_new_train/QJL696.jpg
path[0]:/data/datasets/fake_plate/ChePai_new_train/VZY5790.jpg
path[0]:/data/datasets/fake_plate/ChePai_old_train/T18W.jpg
path[0]:/data/datasets/fake_plate/ChePai_new_train/LHW265.jpg
Ocrop.h:84
Acrop.h:84
path[1]:/data/datasets/fake_plate/ChePai_old_train/E81EY.jpg
Ocrop.h:84
Acrop.h:84
path[1]:/data/datasets/fake_plate/ChePai_new_train/DF004.jpg
Segmentation fault

Can someone who is familiar with the classification source code help?
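The listing and log above suggest two things that may be worth checking; neither is confirmed in this thread. If random_augment_image is changed to return im itself, crop.data and im.data are the same pointer, so free_image(im) would release the buffer that X.vals[i] still points to. The log also shows crop.h of 84 while size is 256, so an unresized crop may hold fewer values than the network input expects. The sketch below shows the idea of always handing back a fixed-size, independently owned copy. The img struct and resize_copy helper are hypothetical stand-ins written for this example, not darknet's API, and nearest-neighbour sampling is used only to keep it short.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an image type: planar floats,
 * indexed as data[c*h*w + y*w + x]. Not darknet's own struct. */
typedef struct {
    int w, h, c;
    float *data;
} img;

/* Return a newly allocated w x h copy of src using nearest-neighbour
 * sampling. The result owns its own buffer, so freeing src afterwards
 * does not invalidate it. On failure the returned data pointer is NULL. */
static img resize_copy(img src, int w, int h)
{
    img out = { w, h, src.c, NULL };
    if (w <= 0 || h <= 0 || src.w <= 0 || src.h <= 0 || src.c <= 0 || !src.data)
        return out;

    out.data = malloc((size_t)w * h * src.c * sizeof(float));
    if (!out.data) return out;

    for (int k = 0; k < src.c; ++k) {
        for (int y = 0; y < h; ++y) {
            int sy = (int)((long)y * src.h / h);      /* nearest source row */
            for (int x = 0; x < w; ++x) {
                int sx = (int)((long)x * src.w / w);  /* nearest source column */
                out.data[(k * h + y) * w + x] =
                    src.data[(k * src.h + sy) * src.w + sx];
            }
        }
    }
    return out;
}
```

With a copy like this, every row stored in X.vals would have the same length w*h*c, and releasing the original image would leave the stored row untouched.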

EhsanVahab commented 4 years ago

I suspect your GPU memory is not enough for training. What are your config file parameters?