facebookresearch / SlowFast

PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
Apache License 2.0

Grad-CAM does not write to file #377

Open Fritskee opened 3 years ago

Fritskee commented 3 years ago

I managed to train a SlowFast model (8x8) on the Kinetics data, and I am now trying to run the demo for this model. The goal is to write the Grad-CAM results for one video to a file.

As suggested in the documentation, I run the following command:

python tools/demo_net.py \
        --cfg configs/Kinetics/SLOWFAST_8x8_R50_DEMO.yaml \
        NUM_GPUS 1 \
        TRAIN.ENABLE False \
        DATA.PATH_TO_DATA_DIR /work/pe432442/slowfast/slowfast/data/Kinetics \
        TEST.CHECKPOINT_FILE_PATH /work/pe432442/slowfast/slowfast/checkpoints/SLOWFAST_8X8_R50_epoch_00014.pyth
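
The trailing arguments in the command above are alternating KEY VALUE pairs that override entries in the YAML config. The snippet below is only a rough, stdlib-only illustration of how such dotted keys map onto a nested config. It is not the library's implementation: `apply_overrides` is a hypothetical helper, `/data/Kinetics` is a placeholder path, and unlike a real config node this sketch silently creates keys that do not exist yet.

```python
import ast


def apply_overrides(cfg, opts):
    """Apply alternating KEY VALUE pairs (dotted keys) to a nested dict."""
    if len(opts) % 2 != 0:
        raise ValueError("overrides must come in KEY VALUE pairs")
    for key, raw in zip(opts[0::2], opts[1::2]):
        node = cfg
        *parents, leaf = key.split(".")
        for name in parents:
            # Walk down into the nested section, creating it if missing.
            node = node.setdefault(name, {})
        try:
            value = ast.literal_eval(raw)  # "1" -> 1, "False" -> False
        except (ValueError, SyntaxError):
            value = raw  # paths and other plain strings stay as text
        node[leaf] = value
    return cfg


cfg = {"NUM_GPUS": 8, "TRAIN": {"ENABLE": True}}
apply_overrides(
    cfg,
    ["NUM_GPUS", "1", "TRAIN.ENABLE", "False",
     "DATA.PATH_TO_DATA_DIR", "/data/Kinetics"],
)
print(cfg)
```

Printing the merged result like this before a run is a cheap way to confirm which values the job will actually use.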

In addition, I have added a "DEMO" section to the .yaml config file. It looks as follows:

OUTPUT_DIR: /work/pe432442/slowfast/slowfast/output_8x8

TENSORBOARD:
  ENABLE: True
  CONFUSION_MATRIX:
    ENABLE: True
  HISTOGRAM:
    ENABLE: False
  MODEL_VIS:
    ENABLE: True
    MODEL_WEIGHTS: True
    ACTIVATIONS: True
    INPUT_VIDEO: True
    GRAD_CAM:
      ENABLE: True

DEMO:
  ENABLE: True
  LABEL_FILE_PATH: /work/pe432442/slowfast/slowfast/data/Kinetics/classids.json
  INPUT_VIDEO: /home/pe432442/Desktop/some_video.mp4
  OUTPUT_FILE: /home/pe432442/Desktop/sf-out
  THREAD_ENABLE: True
  NUM_VIS_INSTANCES: 1
  NUM_CLIPS_SKIP: 2
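
According to the comments in the defaults file, LABEL_FILE_PATH should point to a JSON file of the form {"class_name1": id1, "class_name2": id2, ...}. A minimal sketch for sanity-checking such a file before starting a job is shown below. `load_label_map` is a hypothetical helper, the two class names are sample data, and the uniqueness check is an assumption about what a sensible label file looks like rather than something the codebase is known to enforce.

```python
import json


def load_label_map(text):
    """Parse a class_name -> id mapping and check its basic shape."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("label file must contain a JSON object")
    for name, idx in data.items():
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(idx, bool) or not isinstance(idx, int):
            raise ValueError(f"id for {name!r} must be an integer, got {idx!r}")
    if len(set(data.values())) != len(data):
        raise ValueError("class ids must be unique")
    return data


# Sample content; in practice, read the text from the file at LABEL_FILE_PATH.
sample = '{"abseiling": 0, "air drumming": 1}'
labels = load_label_map(sample)
print(labels)
```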

The TensorBoard visualization and demo sections of slowfast/defaults.py look as follows:

# -----------------------------------------------------------------------------
# Tensorboard Visualization Options
# -----------------------------------------------------------------------------
_C.TENSORBOARD = CfgNode()

# Log to summary writer; this will automatically
# log loss, lr and metrics during train/eval.
_C.TENSORBOARD.ENABLE = True
# Provide path to prediction results for visualization.
# This is a pickle file of [prediction_tensor, label_tensor]
_C.TENSORBOARD.PREDICTIONS_PATH = "/work/pe432442/slowfast/slowfast/output_8x8/TENSOR"
# Path to directory for tensorboard logs.
# Defaults to cfg.OUTPUT_DIR/runs-{cfg.TRAIN.DATASET}.
_C.TENSORBOARD.LOG_DIR = "/work/pe432442/slowfast/slowfast/output_8x8/TENSOR"
# Path to a json file providing class_name - id mapping
# in the format {"class_name1": id1, "class_name2": id2, ...}.
# This file must be provided to enable plotting confusion matrix
# by a subset or parent categories.
_C.TENSORBOARD.CLASS_NAMES_PATH = "/work/pe432442/slowfast/slowfast/data/Kinetics/classids.json"

# Path to a json file for categories -> classes mapping
# in the format {"parent_class": ["child_class1", "child_class2",...], ...}.
_C.TENSORBOARD.CATEGORIES_PATH = ""

# Config for confusion matrices visualization.
_C.TENSORBOARD.CONFUSION_MATRIX = CfgNode()
# Visualize confusion matrix.
_C.TENSORBOARD.CONFUSION_MATRIX.ENABLE = True
# Figure size of the confusion matrices plotted.
_C.TENSORBOARD.CONFUSION_MATRIX.FIGSIZE = [8, 8]
# Path to a subset of categories to visualize.
# File contains class names separated by newline characters.
_C.TENSORBOARD.CONFUSION_MATRIX.SUBSET_PATH = "/work/pe432442/slowfast/slowfast/data/Kinetics/classes.txt"

# Config for histogram visualization.
_C.TENSORBOARD.HISTOGRAM = CfgNode()
# Visualize histograms.
_C.TENSORBOARD.HISTOGRAM.ENABLE = True
# Path to a subset of classes to plot histograms.
# Class names must be separated by newline characters.
_C.TENSORBOARD.HISTOGRAM.SUBSET_PATH = "/work/pe432442/slowfast/slowfast/data/Kinetics/classes.txt"
# Visualize top-k most predicted classes on histograms for each
# chosen true label.
_C.TENSORBOARD.HISTOGRAM.TOPK = 8
# Figure size of the histograms plotted.
_C.TENSORBOARD.HISTOGRAM.FIGSIZE = [8, 8]

# Config for layers' weights and activations visualization.
# _C.TENSORBOARD.ENABLE must be True.
_C.TENSORBOARD.MODEL_VIS = CfgNode()

# If False, skip model visualization.
_C.TENSORBOARD.MODEL_VIS.ENABLE = True

# If False, skip visualizing model weights.
_C.TENSORBOARD.MODEL_VIS.MODEL_WEIGHTS = False

# If False, skip visualizing model activations.
_C.TENSORBOARD.MODEL_VIS.ACTIVATIONS = False

# If False, skip visualizing input videos.
_C.TENSORBOARD.MODEL_VIS.INPUT_VIDEO = True

# List of strings containing data about layer names and their indexing to
# visualize weights and activations for. The indexing is meant for
# choosing a subset of activations output by a layer for visualization.
# If indexing is not specified, visualize all activations output by the layer.
# For each string, layer name and indexing is separated by whitespaces.
# e.g.: [layer1 1,2;1,2, layer2, layer3 150,151;3,4]; this means for each array `arr`
# along the batch dimension in `layer1`, we take arr[[1, 2], [1, 2]]
_C.TENSORBOARD.MODEL_VIS.LAYER_LIST = ['s5/pathway1_res2', 's5/pathway0_res2']
# Top-k predictions to plot on videos
_C.TENSORBOARD.MODEL_VIS.TOPK_PREDS = 1
# Colormap for text box and bounding box colors
_C.TENSORBOARD.MODEL_VIS.COLORMAP = "Pastel2"
# Config for visualizing video inputs with Grad-CAM.
# _C.TENSORBOARD.ENABLE must be True.
_C.TENSORBOARD.MODEL_VIS.GRAD_CAM = CfgNode()
# Whether to run visualization using Grad-CAM technique.
_C.TENSORBOARD.MODEL_VIS.GRAD_CAM.ENABLE = True
# CNN layers to use for Grad-CAM. The number of layers must be equal to
# number of pathway(s).
_C.TENSORBOARD.MODEL_VIS.GRAD_CAM.LAYER_LIST = ['s5/pathway1_res2', 's5/pathway0_res2']
# If True, visualize Grad-CAM using the true label for each instance.
# If False, use the highest predicted class.
_C.TENSORBOARD.MODEL_VIS.GRAD_CAM.USE_TRUE_LABEL = True
# Colormap for text box and bounding box colors
_C.TENSORBOARD.MODEL_VIS.GRAD_CAM.COLORMAP = "viridis"

# Config for visualizing wrong predictions.
# _C.TENSORBOARD.ENABLE must be True.
_C.TENSORBOARD.WRONG_PRED_VIS = CfgNode()
_C.TENSORBOARD.WRONG_PRED_VIS.ENABLE = False
# Folder tag to organize model eval videos under.
_C.TENSORBOARD.WRONG_PRED_VIS.TAG = "Incorrectly classified videos."
# Subset of labels to visualize. Only wrong predictions with true labels
# within this subset are visualized.
_C.TENSORBOARD.WRONG_PRED_VIS.SUBSET_PATH = ""

# ---------------------------------------------------------------------------- #
# Demo options
# ---------------------------------------------------------------------------- #
_C.DEMO = CfgNode()

# Run model in DEMO mode.
_C.DEMO.ENABLE = False

# Path to a json file providing class_name - id mapping
# in the format {"class_name1": id1, "class_name2": id2, ...}.
_C.DEMO.LABEL_FILE_PATH = ""

# Specify a camera device as input. This will be prioritized
# over input video if set.
# If -1, use input video instead.
_C.DEMO.WEBCAM = -1

# Path to input video for demo.
_C.DEMO.INPUT_VIDEO = ""
# Custom width for reading input video data.
_C.DEMO.DISPLAY_WIDTH = 0
# Custom height for reading input video data.
_C.DEMO.DISPLAY_HEIGHT = 0
# Path to Detectron2 object detection model configuration,
# only used for detection tasks.
_C.DEMO.DETECTRON2_CFG = "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"
# Path to Detectron2 object detection model pre-trained weights.
_C.DEMO.DETECTRON2_WEIGHTS = "detectron2://COCO-Detection/faster_rcnn_R_50_FPN_3x/137849458/model_final_280758.pkl"
# Threshold for choosing predicted bounding boxes by Detectron2.
_C.DEMO.DETECTRON2_THRESH = 0.9
# Number of overlapping frames between 2 consecutive clips.
# Increase this number for more frequent action predictions.
# The number of overlapping frames cannot be larger than
# half of the sequence length `cfg.DATA.NUM_FRAMES * cfg.DATA.SAMPLING_RATE`
_C.DEMO.BUFFER_SIZE = 0
# If specified, the visualized outputs will be written to a video file at
# this path. Otherwise, the visualized outputs will be displayed in a window.
_C.DEMO.OUTPUT_FILE = "/work/pe432442/slowfast/slowfast/output_8x8/"
# Frames per second rate for writing to output video file.
# If not set (-1), use fps rate from input file.
_C.DEMO.OUTPUT_FPS = -1
# Input format from demo video reader ("RGB" or "BGR").
_C.DEMO.INPUT_FORMAT = "BGR"
# Draw visualization frames in [keyframe_idx - CLIP_VIS_SIZE, keyframe_idx + CLIP_VIS_SIZE] inclusively.
_C.DEMO.CLIP_VIS_SIZE = 10
# Number of processes to run video visualizer.
_C.DEMO.NUM_VIS_INSTANCES = 2
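
The LAYER_LIST comment in the listing above describes a small string syntax: a layer name, optionally followed by whitespace and per-dimension indices separated by `;`. As a sketch of how such a string could be read, assuming nothing beyond what that comment states, the hypothetical `parse_layer_spec` below splits one entry into a name and a list of index lists. It is an illustration, not the parser the codebase uses.

```python
def parse_layer_spec(spec):
    """Split 'name idx,idx;idx,idx' into (layer_name, list of index lists)."""
    parts = spec.split()
    if not parts or len(parts) > 2:
        raise ValueError(f"expected 'name' or 'name indexing', got {spec!r}")
    name = parts[0]
    if len(parts) == 1:
        # No indexing given: every activation of the layer is kept.
        return name, []
    # One comma-separated index list per dimension, dimensions split by ';'.
    dims = [[int(i) for i in dim.split(",")] for dim in parts[1].split(";")]
    return name, dims


for spec in ["layer1 1,2;1,2", "layer2", "layer3 150,151;3,4"]:
    print(parse_layer_spec(spec))
```

For `layer1 1,2;1,2` the result corresponds to the `arr[[1, 2], [1, 2]]` indexing mentioned in the comment.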

I am very confused about why my video is not being written to file. When I run the code with the above settings, the terminal only prints the following:

(OK) Unloading intelmpi 2018.4.274
(OK) Unloading Intel Suite 19.0.1.144
(OK) Loading gcc 7.3.0
(!!) OpenACC / OpenMP offload to Pascal GPUs might be slow. Use 
(!!)  $FLAGS_OFFLOAD_OPENMP or $FLAGS_OFFLOAD_OPENACC envvars.
(OK) Intel MPI Suite 2018.4.274 loaded.
(OK) Loading cuda 10.2.89
(OK) Loading cudnn 8.0
(OK) Loading NVIDIA v. 20.9 suite: compilers (C/C++/FORTRAN), CUDA (10.2), GPU math libaries, and Nsight tools
** fvcore version of PathManager will be deprecated soon. **
** Please migrate to the version in iopath repo. **
https://github.com/facebookresearch/iopath 

** fvcore version of PathManager will be deprecated soon. **
** Please migrate to the version in iopath repo. **
https://github.com/facebookresearch/iopath 

After that, it terminates without throwing any errors. An important detail is that I am working on a compute cluster and thus have no screen available while the job is running. However, since I provide a directory to write the output to, this shouldn't be an issue. I have also run experiments with the Detectron2 repository, and those work fine (i.e. they write files to the output directory).
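
One detail that may be worth checking: the comment in the defaults file describes DEMO.OUTPUT_FILE as the path of a video file, whereas the values shown in this post are a path without an extension and a directory. Whether that is related to the missing output is not confirmed anywhere in this thread. The sketch below is a hypothetical pre-flight check along those lines. `check_output_file` and the list of extensions are assumptions made for illustration and are not part of the codebase.

```python
from pathlib import Path

# Assumed list of video extensions; adjust to whatever the writer supports.
VIDEO_SUFFIXES = {".mp4", ".avi", ".mkv", ".mov"}


def check_output_file(path_str):
    """Return a list of human-readable warnings for a DEMO.OUTPUT_FILE value."""
    warnings = []
    if not path_str:
        # Per the defaults comment, an unset path means on-screen display.
        warnings.append("OUTPUT_FILE is empty: output would go to a display window")
        return warnings
    path = Path(path_str)
    if path_str.endswith(("/", "\\")) or path.is_dir():
        warnings.append("OUTPUT_FILE looks like a directory, not a video file")
    elif path.suffix.lower() not in VIDEO_SUFFIXES:
        warnings.append(
            f"OUTPUT_FILE has no recognised video extension: {path.name!r}"
        )
    return warnings


for value in ["", "/home/pe432442/Desktop/sf-out", "some_output.mp4"]:
    print(repr(value), check_output_file(value))
```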

Fritskee commented 3 years ago

@dianyo @Shumpei-Kikuta @guojingbuaa @wwdok @haooooooqi  @lyttonhao @asafDahan @AlexanderMelde Anybody got any insights on this matter?

Shumpei-Kikuta commented 3 years ago

@Fritskee I'm not sure how the demo code is meant to be used, but I think you need to change the visualization code so that the output goes to a local file path instead of TensorBoard. You can set the local file path by changing this part.

Fritskee commented 3 years ago

Thanks for the reply! I didn’t see that. But I did put the pathways and all the other jazz in the configs, thinking that’d be sufficient. Thanks for the help!