NVIDIA-AI-IOT / face-mask-detection

Face Mask Detection using NVIDIA Transfer Learning Toolkit (TLT) and DeepStream for COVID-19
MIT License

Recommendations for improving precision #5

Closed hectormdom closed 3 years ago

hectormdom commented 4 years ago

With all the default values, the precision after training and evaluating the model looks rather poor to me.

Validation cost: 0.000314
Mean average_precision (in %): 55.9855

class name      average precision (in %)
------------  --------------------------
mask                             87.7952
no-mask                          24.1759

Are there any recommendations for improving this precision? The no-mask class in particular scored very poorly. I'm quite new to AI. What I'm currently trying (from a completely fresh Docker container) is raising the number of epochs from 120 to 270, changing the max learning rate to 1e-5, and setting the batch size to 32, as per this source: https://developer.nvidia.com/blog/accelerating-video-analytics-tlt/ I'll report back with the results.

This is my train file:

random_seed: 42
dataset_config {
  data_sources {
    tfrecords_path: "/workspace/tlt-ds-face_mask_detect/face-mask-detection-data/KITTI_gitcheck/tfrecords/kitti_trainval/*"
    image_directory_path: "/workspace/tlt-ds-face_mask_detect/face-mask-detection-data/KITTI_gitcheck/train"
  }
  image_extension: "jpg"
  target_class_mapping {
    key: "mask"
    value: "mask"
  }
  target_class_mapping {
    key: "no-mask"
    value: "no-mask"
  }
  validation_fold: 0
  #validation_data_source: {
    #tfrecords_path: "/workspace/tlt-ds-face_mask_detect/face-mask-detection-data/KITTI_gitcheck/tfrecords/kitti_trainval/*"
    #image_directory_path: "/workspace/tlt-ds-face_mask_detect/face-mask-detection-data/KITTI_gitcheck/test"
  #}
}

augmentation_config {
  preprocessing {
    output_image_width: 960
    output_image_height: 544
    min_bbox_width: 1.0
    min_bbox_height: 1.0
    output_image_channel: 3
  }
  spatial_augmentation {
    hflip_probability: 0.5
    vflip_probability: 0.0
    zoom_min: 1.0
    zoom_max: 1.0
    translate_max_x: 8.0
    translate_max_y: 8.0
  }
  color_augmentation {
    hue_rotation_max: 25.0
    saturation_shift_max: 0.20000000298
    contrast_scale_max: 0.10000000149
    contrast_center: 0.5
  }
}

postprocessing_config {
  target_class_config {
    key: "mask"
    value {
      clustering_config {
        coverage_threshold: 0.00499999988824
        dbscan_eps: 0.20000000298
        dbscan_min_samples: 0.0500000007451
        minimum_bounding_box_height: 20
      }
    }
  }
  target_class_config {
    key: "no-mask"
    value {
      clustering_config {
        coverage_threshold: 0.00499999988824
        dbscan_eps: 0.15000000596
        dbscan_min_samples: 0.0500000007451
        minimum_bounding_box_height: 20
      }
    }
  }
}

model_config {
  pretrained_model_file: "/workspace/tlt-ds-face_mask_detect/detectnet_v2/pretrained_resnet18/tlt_pretrained_detectnet_v2_vresnet18/resnet18.hdf5"
  num_layers: 18
  use_batch_norm: true
  objective_set {
    bbox {
      scale: 35.0
      offset: 0.5
    }
    cov {
    }
  }
  training_precision {
    backend_floatx: FLOAT32
  }
  arch: "resnet"
}

evaluation_config {
  validation_period_during_training: 10
  first_validation_epoch: 10
  minimum_detection_ground_truth_overlap {
    key: "mask"
    value: 0.5
  }
  minimum_detection_ground_truth_overlap {
    key: "no-mask"
    value: 0.5
  }
  evaluation_box_config {
    key: "mask"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  evaluation_box_config {
    key: "no-mask"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  average_precision_mode: INTEGRATE
}

cost_function_config {
  target_classes {
    name: "mask"
    class_weight: 1.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 10.0
    }
  }
  target_classes {
    name: "no-mask"
    class_weight: 8.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 1.0
    }
  }
  enable_autoweighting: true
  max_objective_weight: 0.999899983406
  min_objective_weight: 9.99999974738e-05
}

training_config {
  batch_size_per_gpu: 24
  num_epochs: 120
  learning_rate {
    soft_start_annealing_schedule {
      min_learning_rate: 5e-06
      max_learning_rate: 5e-04
      soft_start: 0.10000000149
      annealing: 0.699999988079
    }
  }
  regularizer {
    type: L1
    weight: 3.00000002618e-09
  }
  optimizer {
    adam {
      epsilon: 9.99999993923e-09
      beta1: 0.899999976158
      beta2: 0.999000012875
    }
  }
  cost_scaling {
    initial_exponent: 20.0
    increment: 0.005
    decrement: 1.0
  }
  checkpoint_interval: 10
}

bbox_rasterizer_config {
  target_class_config {
    key: "mask"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 0.40000000596
      cov_radius_y: 0.40000000596
      bbox_min_radius: 1.0
    }
  }
  target_class_config {
    key: "no-mask"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 1.0
      cov_radius_y: 1.0
      bbox_min_radius: 1.0
    }
  }
  deadzone_radius: 0.400000154972
}

hectormdom commented 4 years ago

The changes mentioned above did not achieve the expected results. The result was slightly better, presumably because of the higher epoch count, but not by much:

Validation cost: 0.000340
Mean average_precision (in %): 56.0311

class name      average precision (in %)
------------  --------------------------
mask                             81.023
no-mask                          31.0392

Median Inference Time: 0.011910
2020-08-31 06:45:34,728 [INFO] iva.detectnet_v2.scripts.evaluate: Evaluation complete.
Time taken to run iva.detectnet_v2.scripts.evaluate:main: 0:00:23.821871.

It took far longer than the last run on a single Quadro P5000: Time taken to run iva.detectnet_v2.scripts.train:main: 7:25:55.792716.

hectormdom commented 4 years ago

I think I found part of the problem. The output of tlt-dataset-convert -d $SPECS_DIR/detectnet_v2_tfrecords_kitti_train.txt -o $DATA_DOWNLOAD_DIR/tfrecords/kitti_trainval/kitti_trainval in the Jupyter notebook is as follows:

Converting Tfrecords for kitti trainval dataset
2020-08-31 07:57:30.715631: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
Using TensorFlow backend.
2020-08-31 07:57:33,560 - iva.detectnet_v2.dataio.build_converter - INFO - Instantiating a kitti converter
2020-08-31 07:57:33,573 - iva.detectnet_v2.dataio.kitti_converter_lib - INFO - Num images in
Train: 2552 Val: 637
2020-08-31 07:57:33,573 - iva.detectnet_v2.dataio.kitti_converter_lib - INFO - Validation data in partition 0. Hence, while choosing the validationset during training choose validation_fold 0.
2020-08-31 07:57:33,576 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 0
WARNING:tensorflow:From /home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/dataio/dataset_converter_lib.py:142: The name tf.python_io.TFRecordWriter is deprecated. Please use tf.io.TFRecordWriter instead.

2020-08-31 07:57:33,576 - tensorflow - WARNING - From /home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/dataio/dataset_converter_lib.py:142: The name tf.python_io.TFRecordWriter is deprecated. Please use tf.io.TFRecordWriter instead.

/usr/local/lib/python3.6/dist-packages/iva/detectnet_v2/dataio/kitti_converter_lib.py:273: VisibleDeprecationWarning: Reading unicode strings without specifying the encoding argument is deprecated. Set the encoding, use None for the system default.
2020-08-31 07:57:33,650 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 1
2020-08-31 07:57:33,715 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 2
2020-08-31 07:57:33,781 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 3
2020-08-31 07:57:33,847 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 4
2020-08-31 07:57:33,912 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 5
2020-08-31 07:57:33,976 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 6
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/tlt-ds-face_mask_detect/face-mask-detection-data/KITTI_gitcheck/train/labels/train_00000526.txt. 
Coordinates: x1 = 560, x2 = 31, y1: 161, y2: 397
Skipping this object
2020-08-31 07:57:34,040 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 7
2020-08-31 07:57:34,105 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 8
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/tlt-ds-face_mask_detect/face-mask-detection-data/KITTI_gitcheck/train/labels/train_00005401.txt. 
Coordinates: x1 = 340, x2 = 72, y1: 175, y2: 360
Skipping this object
2020-08-31 07:57:34,169 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 9
2020-08-31 07:57:34,241 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - 
Wrote the following numbers of objects:
b'mask': 646
b'no-mask': 30

2020-08-31 07:57:34,241 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 0
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/tlt-ds-face_mask_detect/face-mask-detection-data/KITTI_gitcheck/train/labels/train_00005727.txt. 
Coordinates: x1 = 194, x2 = 78, y1: 180, y2: 105
Skipping this object
2020-08-31 07:57:34,501 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 1
2020-08-31 07:57:34,757 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 2
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/tlt-ds-face_mask_detect/face-mask-detection-data/KITTI_gitcheck/train/labels/train_00005772.txt. 
Coordinates: x1 = 121, x2 = 12, y1: 165, y2: 98
Skipping this object
2020-08-31 07:57:35,018 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 3
2020-08-31 07:57:35,284 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 4
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/tlt-ds-face_mask_detect/face-mask-detection-data/KITTI_gitcheck/train/labels/train_00000304.txt. 
Coordinates: x1 = 434, x2 = 60, y1: 96, y2: 464
Skipping this object
2020-08-31 07:57:35,559 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 5
2020-08-31 07:57:35,834 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 6
2020-08-31 07:57:36,101 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 7
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/tlt-ds-face_mask_detect/face-mask-detection-data/KITTI_gitcheck/train/labels/train_00000650.txt. 
Coordinates: x1 = 412, x2 = 50, y1: 133, y2: 509
Skipping this object
2020-08-31 07:57:36,357 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 8
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/tlt-ds-face_mask_detect/face-mask-detection-data/KITTI_gitcheck/train/labels/train_00006294.txt. 
Coordinates: x1 = 305, x2 = 222, y1: 127, y2: 72
Skipping this object
2020-08-31 07:57:36,619 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 9
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/tlt-ds-face_mask_detect/face-mask-detection-data/KITTI_gitcheck/train/labels/train_00005676.txt. 
Coordinates: x1 = 218, x2 = 402, y1: 192, y2: 4
Skipping this object
Top left coordinate must be less than bottom right.Error in object 0 of label_file /workspace/tlt-ds-face_mask_detect/face-mask-detection-data/KITTI_gitcheck/train/labels/train_00000572.txt. 
Coordinates: x1 = 639, x2 = 57, y1: 117, y2: 307
Skipping this object
2020-08-31 07:57:36,884 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - 
Wrote the following numbers of objects:
b'mask': 2661
b'no-mask': 109

2020-08-31 07:57:36,885 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Cumulative object statistics
2020-08-31 07:57:36,885 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - 
Wrote the following numbers of objects:
b'mask': 3307
b'no-mask': 139

2020-08-31 07:57:36,885 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Class map. 
Label in GT: Label in tfrecords file 
b'Mask': b'mask'
b'No-Mask': b'no-mask'
For the dataset_config in the experiment_spec, please use labels in the tfrecords file, while writing the classmap.

2020-08-31 07:57:36,885 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Tfrecords generation complete.

As one can see, the no-mask class has significantly fewer objects (139, versus 3307 for mask). Let me know if you have any leads as to why this occurs. This issue is perhaps related to data2kitti.py reporting 0 images for the Kaggle and FDDB datasets: https://github.com/NVIDIA-AI-IOT/face-mask-detection/issues/4
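One way to check the label files directly, before running the converter, is to scan them for the two things the log above reports: the per-class object counts and boxes whose corners are inverted. The sketch below is only an illustration. It assumes the standard KITTI label layout, in which the first field is the class name and fields 5 to 8 are left, top, right, bottom, and the function name audit_kitti_labels is hypothetical. Whether the converter uses a strict or non-strict comparison is also an assumption.

```python
from collections import Counter
from pathlib import Path


def audit_kitti_labels(labels_dir):
    """Count valid objects per class and list boxes with inverted corners."""
    counts = Counter()
    inverted = []
    for label_file in sorted(Path(labels_dir).glob("*.txt")):
        lines = label_file.read_text().splitlines()
        for line_no, line in enumerate(lines, 1):
            fields = line.split()
            if len(fields) < 8:
                continue  # not a full KITTI row; skip it
            # KITTI fields 5-8 (0-based 4..7): left, top, right, bottom.
            x1, y1, x2, y2 = (float(v) for v in fields[4:8])
            # Assumption: a box is rejected when top-left is not strictly
            # less than bottom-right, matching the converter's message.
            if x1 >= x2 or y1 >= y2:
                inverted.append((label_file.name, line_no, (x1, y1, x2, y2)))
            else:
                counts[fields[0].lower()] += 1
    return counts, inverted
```

Calling audit_kitti_labels on the train/labels directory returns the class counts and a list of (file name, line number, coordinates) tuples for the boxes the converter would skip, which makes it easier to see whether the imbalance comes from the labels themselves or from skipped objects.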

hectormdom commented 4 years ago

Also, I can't run kaggle2kitti.py as a stand-alone script.

I changed the paths to the ones in my system:

def main():
    images_dir = r'/workspace/tlt-ds-face_mask_detect/face-mask-detection-data/Kaggle_Medical_Mask_Dataset/images'
    labels_dir = r'/workspace/tlt-ds-face_mask_detect/face-mask-detection-data/Kaggle_Medical_Mask_Dataset/labels'
    kitti_base_dir = r'/workspace/tlt-ds-face_mask_detect/face-mask-detection-data/Kaggle_Medical_Mask_Dataset/KITTI_gitcheck'

And I get this error: FileNotFoundError: [Errno 2] No such file or directory: '/workspace/tlt-ds-face_mask_detect/face-mask-detection-data/Kaggle_Medical_Mask_Dataset/KITTI_gitcheck/train/images/012420_coronoa_masks_web.jpg' I think it wants to test the labels, but it skips generating those files entirely.

ak-nv commented 4 years ago

OK, I could solve the issue with Kaggle. They updated the class names to mask and none instead of good and bad. I committed my changes; please pull again. I get the following output when I run the data2kitti script:

/face-mask-detection$ python3 data2kitti.py --kaggle-dataset-path /home/nvidia/face-mask-detection/datasets/medical-masks-dataset --mafa-dataset-path /home/nvidia/face-mask-detection/datasets/mafa --fddb-dataset-path  /home/nvidia/face-mask-detection/datasets/fddb --widerface-dataset-path /home/nvidia/face-mask-detection/datasets/widerface --kitti-base-path /home/nvidia/face-mask-detection/datasets/KITTI_1024 --category-limit 6000 --tlt-input-dims_width 960 --tlt-input-dims_height 544 --train
Kaggle Dataset: Total Mask faces: 4154 and No-Mask faces:790
Total Mask Labelled:4154 and No-Mask Labelled:790
Directory Already Exists
Directory Already Exists
/home/nvidia/face-mask-detection/data_utils/mafa2kitti.py:51: RuntimeWarning: overflow encountered in ubyte_scalars
  bbox = [_bbox_label[0], _bbox_label[1], _bbox_label[0]+_bbox_label[2], _bbox_label[1]+_bbox_label[3]]
MAFA Dataset: Total Mask faces: 1846 and No-Mask faces:232
Total Mask Labelled:6000 and No-Mask Labelled:1022
Directory Already Exists
Directory Already Exists
FDDB Dataset: Mask Labelled:0 and No-Mask Labelled:0
Total Mask Labelled:6000 and No-Mask Labelled:1022
WideFace: Total Mask Labelled:0 and No-Mask Labelled:4978
----------------------------
Final: Total Mask Labelled:6000
Total No-Mask Labelled:6000
----------------------------

Let me know if you have any issues. Thank you for looking into this so keenly.
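The running totals in the log above are consistent with each class being filled up to the --category-limit of 6000 as the datasets are processed in order. The sketch below only checks that arithmetic with the numbers printed in the log. It is not the code of data2kitti.py, and the helper add_with_cap is a hypothetical name introduced for illustration.

```python
def add_with_cap(total, available, limit):
    """Add up to `available` items to `total` without exceeding `limit`."""
    taken = min(available, max(limit - total, 0))
    return total + taken, taken


LIMIT = 6000  # --category-limit from the command above

# Kaggle: 4154 mask faces, 790 no-mask faces.
mask, _ = add_with_cap(0, 4154, LIMIT)
no_mask, _ = add_with_cap(0, 790, LIMIT)

# MAFA: 1846 mask faces, 232 no-mask faces.
mask, _ = add_with_cap(mask, 1846, LIMIT)
no_mask, _ = add_with_cap(no_mask, 232, LIMIT)
after_mafa = (mask, no_mask)

# WiderFace: 4978 no-mask faces.
no_mask, _ = add_with_cap(no_mask, 4978, LIMIT)

print(after_mafa)     # (6000, 1022)
print(mask, no_mask)  # 6000 6000
```

Read this way, a dataset that contributes 0 objects (as Kaggle did before the fix) leaves a gap that later datasets may or may not be able to fill, which would explain the small no-mask count seen earlier.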

hectormdom commented 4 years ago

Amazing! Thank you so much. I got the exact same output as you, so the train part is solved. For now I'm going to try to train the whole thing. Would you recommend the default values, the ones you have in place inside the detectnet_v2_train_resnet18_kitti.txt file? By the way, I tried running your exact same command but with the --val flag, and it behaved exactly as before your fix:

root@545faf169507:/workspace/tlt-ds-face_mask_detect# python3 data2kitti.py --kaggle-dataset-path /workspace/tlt-ds-face_mask_detect/face-mask-detection-data/Kaggle_Medical_Mask_Dataset --mafa-dataset-path /workspace/tlt-ds-face_mask_detect/face-mask-detection-data/MAFA_Dataset --fddb-dataset-path /workspace/tlt-ds-face_mask_detect/face-mask-detection-data/FDDB_Dataset --widerface-dataset-path /workspace/tlt-ds-face_mask_detect/face-mask-detection-data/Wider_Face_Dataset --kitti-base-path /workspace/tlt-ds-face_mask_detect/face-mask-detection-data/KITTI_gitcheck --category-limit 6000 --tlt-input-dims_width 960 --tlt-input-dims_height 544 --val
Total Mask Labelled:0 and No-Mask Labelled:0
Directory Already Exists
Directory Already Exists
/workspace/tlt-ds-face_mask_detect/data_utils/mafa2kitti.py:81: RuntimeWarning: overflow encountered in ubyte_scalars
  bbox = [_bbox_label[0], _bbox_label[1], _bbox_label[0] + _bbox_label[2], _bbox_label[1] + _bbox_label[3]]
MAFA Dataset: Total Mask faces: 5002 and No-Mask faces:0
Total Mask Labelled:5002 and No-Mask Labelled:0
WideFace: Total Mask Labelled:0 and No-Mask Labelled:4858
----------------------------
Final: Total Mask Labelled:5002
Total No-Mask Labelled:4858

ak-nv commented 4 years ago

for now I'm going to try to train the whole thing, would you recommend default values?

Yes, that should do. My only concern is that the batch size might not suit your GPU memory. Please change it if needed, though that might impact accuracy a little.

By the way, I tried running your exact same command but with the --val flag, and it behaved exactly as before your fix:

I will push my changes for that. 👍 For now, you might not need val data for the default config, though.

ak-nv commented 4 years ago

By the way, I tried running your exact same command but with the --val flag, and it behaved exactly as before your fix:

I could get val data as well. The following is my log:

python3 data2kitti.py --kaggle-dataset-path /home/nvidia/face-mask-detection/datasets/medical-masks-dataset --mafa-dataset-path /home/nvidia/face-mask-detection/datasets/mafa --fddb-dataset-path  /home/nvidia/face-mask-detection/datasets/fddb --widerface-dataset-path /home/nvidia/face-mask-detection/datasets/widerface --kitti-base-path /home/nvidia/face-mask-detection/datasets/KITTI_1024 --category-limit 6000 --tlt-input-dims_width 960 --tlt-input-dims_height 544 --val
Total Mask Labelled:0 and No-Mask Labelled:0
Directory Already Exists
Directory Already Exists
/home/nvidia/face-mask-detection/data_utils/mafa2kitti.py:81: RuntimeWarning: overflow encountered in ubyte_scalars
  bbox = [_bbox_label[0], _bbox_label[1], _bbox_label[0] + _bbox_label[2], _bbox_label[1] + _bbox_label[3]]
MAFA Dataset: Total Mask faces: 5002 and No-Mask faces:0
Total Mask Labelled:5002 and No-Mask Labelled:0
WideFace: Total Mask Labelled:0 and No-Mask Labelled:4858
----------------------------
Final: Total Mask Labelled:5002
Total No-Mask Labelled:4858
----------------------------

What particular error do you get with the val dataset?

hectormdom commented 4 years ago

From the line Total Mask Labelled:0 and No-Mask Labelled:0, I thought it had the same issue of not pulling anything from the Kaggle dataset, unless it's supposed to go like that? By the way, I do get your exact same output.

ak-nv commented 4 years ago

Yes, that line is printed after the Kaggle dataset step. Since the Kaggle medical mask dataset does not have a val split, it prints 0, 0. The val dataset will only contain MAFA and WiderFace.

hectormdom commented 4 years ago

I see, thanks for all your help. By the way, a heads-up: I don't know exactly why yet (I'll try to find the reason), but I started completely from scratch to benchmark with batch size 12 this time, and after calling the data2kitti.py script I only got 2026 items, whereas right when you told me to pull again, I got 3046 items in the kitti folder. I replaced all the newly pulled files with those in my backup, and everything worked again with the 3046 items.

hectormdom commented 3 years ago

Or perhaps in this last one you were using the cherry-picked 4K images mentioned in the post to achieve that great accuracy percentage?

hectormdom commented 3 years ago

The training results with a single Quadro P5000 are as follows, using all defaults and varying only the batch size:

Batch size 24
=========================

Validation cost: 0.000311
Mean average_precision (in %): 75.7608

class name      average precision (in %)
------------  --------------------------
mask                             84.8004
no-mask                          66.7212

Median Inference Time: 0.012529
2020-08-31 23:17:59,607 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 27.975
Time taken to run iva.detectnet_v2.scripts.train:main: 3:00:17.004748.

Batch size 12
=========================

Validation cost: 0.000313
Mean average_precision (in %): 77.3551

class name      average precision (in %)
------------  --------------------------
mask                             86.4036
no-mask                          68.3065

Median Inference Time: 0.012845
2020-09-01 03:33:07,570 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 27.130
Time taken to run iva.detectnet_v2.scripts.train:main: 3:06:50.549653.

Next I'll try the newest defaults with batch size 12 and see how it goes

hectormdom commented 3 years ago

Pulled on 31st August @ 23:00. Running everything on default except the batch size, which I changed to 12. Training results after 120 epochs:

Mean average_precision (in %): 56.8767

class name      average precision (in %)
------------  --------------------------
mask                             79.2101
no-mask                          34.5432

(2026 images and 2026 labels in the training folder)
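For comparing the runs reported in this thread, it may help to note that the logged mean average precision matches the unweighted mean of the two per-class values, so a weak no-mask class pulls the mean down directly. The sketch below checks that arithmetic with the numbers copied from the comments above. The run names are informal labels added here, not names from the tool.

```python
# (mask AP, no-mask AP, logged mean AP), copied from the comments above.
reported = {
    "defaults, first run": (87.7952, 24.1759, 55.9855),
    "270 epochs, batch 32": (81.023, 31.0392, 56.0311),
    "regenerated data, batch 24": (84.8004, 66.7212, 75.7608),
    "regenerated data, batch 12": (86.4036, 68.3065, 77.3551),
    "31 Aug pull, batch 12": (79.2101, 34.5432, 56.8767),
}


def mean_ap(per_class_ap):
    """Unweighted mean of the per-class average precision values."""
    return sum(per_class_ap) / len(per_class_ap)


for name, (mask_ap, no_mask_ap, logged) in reported.items():
    computed = mean_ap([mask_ap, no_mask_ap])
    gap = mask_ap - no_mask_ap
    print(f"{name:28s} computed={computed:.4f} logged={logged:.4f} gap={gap:.1f}")
```

The two runs with the regenerated data are the only ones where the gap between the classes drops below 20 points, which points at the dataset contents rather than the batch size as the main factor.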

hectormdom commented 3 years ago

I only got 2026 items, whereas at exactly the moment you told me to pull again, I got 3046 items in the kitti folder.

I narrowed it down to widerface2kitti.py line 33: file_image = os.path.splitext(image_name)[0].split('\\')[1] prevents it from processing any images. Perhaps it's a Linux/Windows thing. I changed it to file_image = os.path.splitext(os.path.split(image_name)[1])[0] and now I get the 3046 items, with which I get better accuracy (still not the accuracy in the blog, but closer).
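The difference between the two expressions can be reproduced in isolation. The sketch below wraps each one in a small function and runs it on a forward-slash path. The image names are made-up examples, the function names are hypothetical, and portable_stem is an extra variant added here as a suggestion, not something taken from the repository.

```python
import os
import posixpath


def original_stem(image_name):
    # Expression quoted from widerface2kitti.py line 33 in the comment above.
    return os.path.splitext(image_name)[0].split('\\')[1]


def fixed_stem(image_name):
    # Replacement expression from the comment above.
    return os.path.splitext(os.path.split(image_name)[1])[0]


def portable_stem(image_name):
    # Assumption: accept either separator by normalising to '/' first.
    normalized = image_name.replace('\\', '/')
    return posixpath.splitext(posixpath.basename(normalized))[0]


posix_name = "0--Parade/example_image_1.jpg"      # hypothetical file name
windows_name = "0--Parade\\example_image_1.jpg"   # same name, backslash form

try:
    original_stem(posix_name)
except IndexError:
    # No backslash in the string, so split('\\') yields a single element.
    print("original expression fails on a forward-slash path")

print(fixed_stem(posix_name))
print(portable_stem(windows_name))
```

If the script catches exceptions around this line, an IndexError on every image would be silently skipped, which would match the symptom of no WiderFace images being processed on Linux.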

ak-nv commented 3 years ago

We have not checked accuracy with batch size 12. You might need to adjust other hyperparameters as well.

ak-nv commented 3 years ago

Closing as there has been no update for 10 days; please re-open if you have any questions.