facebookresearch / SlowFast

PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
Apache License 2.0
6.52k stars 1.2k forks

How to modify number of classes properly #449

Open YNawal opened 3 years ago

YNawal commented 3 years ago

Hello all

I need to modify the number of classes in the yaml file, but the top-1 error is very bad. I tried modifying the learning rate, but I think I missed something. How can I modify the number of classes properly? Thanks for your help.

YNawal commented 3 years ago

I am working on action recognition with custom data (7 classes), which I arranged in the Kinetics dataset format. I started with the I3D_8x8_R50 model, and the top-1 error is around 68 to 75%, which is poor accuracy. Where am I going wrong?

yaml file

TRAIN:
  ENABLE: True
  DATASET: mydata
  BATCH_SIZE: 64
  EVAL_PERIOD: 10
  CHECKPOINT_PERIOD: 1
  AUTO_RESUME: True
DATA:
  NUM_FRAMES: 8
  SAMPLING_RATE: 8
  TRAIN_JITTER_SCALES: [256, 320]
  TRAIN_CROP_SIZE: 224
  TEST_CROP_SIZE: 256
  INPUT_CHANNEL_NUM: [3]
RESNET:
  ZERO_INIT_FINAL_BN: True
  WIDTH_PER_GROUP: 64
  NUM_GROUPS: 1
  DEPTH: 50
  TRANS_FUNC: bottleneck_transform
  STRIDE_1X1: False
  NUM_BLOCK_TEMP_KERNEL: [[3], [4], [6], [3]]
NONLOCAL:
  LOCATION: [[[]], [[]], [[]], [[]]]
  GROUP: [[1], [1], [1], [1]]
  INSTANTIATION: softmax
BN:
  USE_PRECISE_STATS: True
  NUM_BATCHES_PRECISE: 200
SOLVER:
  BASE_LR: 0.001
  LR_POLICY: cosine
  MAX_EPOCH: 196
  MOMENTUM: 0.9
  WEIGHT_DECAY: 1e-4
  WARMUP_EPOCHS: 34.0
  WARMUP_START_LR: 0.00001
  OPTIMIZING_METHOD: sgd
MODEL:
  NUM_CLASSES: 7
  ARCH: i3d
  MODEL_NAME: ResNet
  LOSS_FUNC: cross_entropy
  DROPOUT_RATE: 0.5
TEST:
  ENABLE: True
  DATASET: mydata
  BATCH_SIZE: 64
DATA_LOADER:
  NUM_WORKERS: 2
  PIN_MEMORY: True
NUM_GPUS: 2
NUM_SHARDS: 1
RNG_SEED: 0
OUTPUT_DIR: .

TENSORBOARD:
  ENABLE: True
  LOG_DIR: # Leave empty to use cfg.OUTPUT_DIR/runs-{cfg.TRAIN.DATASET} as path.
  CLASS_NAMES_PATH: # Path to json file providing class_name - id mapping.
  CONFUSION_MATRIX:
    ENABLE: True
    SUBSET_PATH: # Path to txt file contains class names separated by newline characters.
                 # Only classes in this file will be visualized in the confusion matrix.
  HISTOGRAM:
    ENABLE: True
    TOP_K: 10 # Top-k most frequently predicted classes for each class in the dataset.
    SUBSET_PATH: # Path to txt file contains class names separated by newline characters.
                 # Only classes in this file will be visualized with histograms.

TENSORBOARD:
  ENABLE: True
  MODEL_VIS:
    ENABLE: True
    MODEL_WEIGHTS: # Set to True to visualize model weights.
    ACTIVATIONS: # Set to True to visualize feature maps.
    INPUT_VIDEO: # Set to True to visualize the input video(s) for the corresponding feature maps.
    LAYER_LIST: # List of layer names to visualize weights and activations for.
    GRAD_CAM:
      ENABLE: True
      LAYER_LIST: # List of CNN layers to use for Grad-CAM visualization method.
                  # The number of layer must be equal to the number of pathway(s).

DEMO:
  ENABLE: True
  LABEL_FILE_PATH: # Path to json file providing class_name - id mapping.
  INPUT_VIDEO: # Path to input video file.
  OUTPUT_FILE: # Path to output video file to write results to.
               # Leave an empty string if you would like to display results to a window.
  THREAD_ENABLE: # Run video reader/writer in the background with multi-threading.
  NUM_VIS_INSTANCES: # Number of CPU(s)/processes use to run video visualizer.
  NUM_CLIPS_SKIP: # Number of clips to skip prediction/visualization
                  # (mostly to smoothen/improve display quality with webcam input).

kailliang commented 3 years ago

Similar problem here. I have 3 classes.
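For anyone debugging a setup like the two described in this thread, one thing that may be worth ruling out is a mismatch between MODEL.NUM_CLASSES and the label ids in the annotation files. The sketch below is only an illustration, not part of PySlowFast: it assumes each annotation line has the form `path label` separated by a single space and that labels are meant to be 0-indexed. The helper names `check_labels` and `chance_top1_error` are hypothetical. It also prints the top-1 error that random guessing would give, as a baseline for reading the reported numbers.

```python
from collections import Counter


def check_labels(csv_lines, num_classes, separator=" "):
    """Count samples per label and collect labels outside [0, num_classes).

    Assumes every non-empty line looks like "<path><separator><label>".
    """
    counts = Counter()
    out_of_range = []
    for line_no, line in enumerate(csv_lines, start=1):
        line = line.strip()
        if not line:
            continue
        # Split from the right so a path containing the separator still parses.
        path, label_text = line.rsplit(separator, 1)
        label = int(label_text)
        counts[label] += 1
        if not 0 <= label < num_classes:
            out_of_range.append((line_no, path, label))
    return counts, out_of_range


def chance_top1_error(num_classes):
    """Top-1 error (in percent) of uniform random guessing over balanced classes."""
    return 100.0 * (1.0 - 1.0 / num_classes)


# Made-up sample lines; in practice, read them from your own train.csv.
sample = ["videos/a.mp4 0", "videos/b.mp4 3", "videos/c.mp4 7"]
counts, out_of_range = check_labels(sample, num_classes=7)
print(dict(counts))                     # samples seen per label id
print(out_of_range)                     # label 7 is invalid when there are 7 classes
print(round(chance_top1_error(7), 1))   # baseline error for 7 balanced classes
```

If `out_of_range` is non-empty, the labels are probably 1-indexed or NUM_CLASSES is too small. If the per-label counts are heavily skewed, the chance baseline above understates how well a trivial majority-class predictor would do.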