ammar3010 commented 1 year ago

I am trying to train a custom video classifier using pytorchvideo. The csv file looks like this:

file	label
UCF_car_threat/normal/Normal_Videos_603_x264.mp4	0
UCF_car_threat/threat/Shooting019_x264.mp4	1
UCF_car_threat/threat/Shooting030_x264.mp4	1
UCF_car_threat/threat/Stealing009_x264.mp4	1
UCF_car_threat/normal/Normal_Videos_401_x264.mp4	0

I am creating the dataloader like this:

`from torch.utils.data import DataLoader

train_dataset = LabeledVideoDataset(train_df, clip_sampler=make_clip_sampler('random', 2), transform=video_transform, decode_audio=False)
loader = DataLoader(train_dataset, batch_size=3, num_workers=0, pin_memory=True)

batch = next(iter(loader))`

`RuntimeError                              Traceback (most recent call last)
Cell In[28], line 1
----> 1 batch = next(iter(loader))

File ~/anaconda3/envs/threat/lib/python3.10/site-packages/torch/utils/data/dataloader.py:633, in _BaseDataLoaderIter.__next__(self)
    630 if self._sampler_iter is None:
    631     # TODO(https://github.com/pytorch/pytorch/issues/76750)
    632     self._reset()  # type: ignore[call-arg]
--> 633 data = self._next_data()
    634 self._num_yielded += 1
    635 if self._dataset_kind == _DatasetKind.Iterable and \
    636         self._IterableDataset_len_called is not None and \
    637         self._num_yielded > self._IterableDataset_len_called:

File ~/anaconda3/envs/threat/lib/python3.10/site-packages/torch/utils/data/dataloader.py:677, in _SingleProcessDataLoaderIter._next_data(self)
    675 def _next_data(self):
    676     index = self._next_index()  # may raise StopIteration
--> 677     data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
    678     if self._pin_memory:
    679         data = _utils.pin_memory.pin_memory(data, self._pin_memory_device)

File ~/anaconda3/envs/threat/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py:32, in _IterableDatasetFetcher.fetch(self, possibly_batched_index)
     30 for _ in possibly_batched_index:
     31     try:
...
--> 225     raise RuntimeError(
    226         f"Failed to load video after {self._MAX_CONSECUTIVE_FAILURES} retries."
    227     )

RuntimeError: Failed to load video after 10 retries.`
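
The error only says that every retry failed to decode a video, so a quick first check is whether the paths in the csv resolve to real files. The sketch below reads a csv shaped like the one above and splits the rows into existing and missing files. It assumes tab-separated columns with a `file` and `label` header; `read_labeled_paths` and its `root` argument are hypothetical names for illustration. It also returns `(path, {"label": ...})` tuples, which, as far as I know, is the shape `LabeledVideoDataset` expects rather than a DataFrame; please verify that against the pytorchvideo documentation.

```python
import csv
import os


def read_labeled_paths(csv_path, root=""):
    """Split csv rows into (existing, missing) lists of (path, {"label": int}) tuples.

    Hypothetical helper: assumes a header row with `file` and `label`
    columns separated by tabs, as in the csv shown in the post.
    """
    existing, missing = [], []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f, delimiter="\t"):
            # Resolve the relative path from the csv against the data root.
            path = os.path.join(root, row["file"])
            item = (path, {"label": int(row["label"])})
            if os.path.isfile(path):
                existing.append(item)
            else:
                missing.append(item)
    return existing, missing
```

If `missing` is non-empty, the relative paths in the csv probably do not match the working directory, and fixing `root` is cheaper than debugging the decoder.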

alpargun commented 7 months ago

Can you create a dataset registry using the SlowFast template? You can follow the files in the datasets/ directory. I was able to register my own datasets both for a public dataset (BDD100K) and a custom collected dataset (CARLA), and train on them as you can see in my fork.

You can use e.g. kinetics.py as a reference and define your custom dataset. Then, for training, follow the same steps as for Kinetics training and only change the dataset configuration in the config YAML file.

By the way, don't forget to modify slowfast/datasets/__init__.py as well.

AbrarKhan009 commented 2 months ago

@alpargun Hi bro, I hope you are doing well. I have one question. I am trying to train the MViTv2 SlowFast model for video classification on my own custom dataset. I followed the Kinetics dataset format for my custom dataset and its csv files, but I did not modify the slowfast/datasets/__init__.py file. So my question is: what modification do I need to make in that __init__.py file?

alpargun commented 2 months ago

Hi @AbrarKhan009, you can check out my fork, where I already introduced two custom datasets. There you can see how I modified slowfast/datasets/__init__.py and added the bdd.py and carla.py files in slowfast/datasets for my custom datasets. To introduce your own .py file, you can copy e.g. kinetics.py and modify it according to your dataset's properties such as framerate, crop size, resolution etc.
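
To illustrate why the `__init__.py` edit matters, here is a minimal, self-contained sketch of the registry pattern. It does not import slowfast: `Registry`, `build_dataset` and the `Mydata` class are stand-ins written for this example, and the assumption that the config name is capitalized to find the class ("mydata" becomes "Mydata") should be checked against slowfast/datasets/build.py.

```python
class Registry:
    """Tiny stand-in for a name -> class registry (not SlowFast's own class)."""

    def __init__(self, name):
        self.name = name
        self._classes = {}

    def register(self):
        # Used as a decorator: stores the class under its own name.
        def decorator(cls):
            self._classes[cls.__name__] = cls
            return cls

        return decorator

    def get(self, name):
        return self._classes[name]


DATASET_REGISTRY = Registry("DATASET")


@DATASET_REGISTRY.register()
class Kinetics:
    def __init__(self, cfg, mode):
        self.cfg, self.mode = cfg, mode


@DATASET_REGISTRY.register()
class Mydata(Kinetics):
    """Hypothetical custom dataset, e.g. a modified copy of kinetics.py."""


def build_dataset(dataset_name, cfg, mode):
    # Assumption: the dataset name from the config is capitalized for lookup.
    return DATASET_REGISTRY.get(dataset_name.capitalize())(cfg, mode)


dataset = build_dataset("mydata", cfg={}, mode="train")
print(type(dataset).__name__)
```

A decorator like this only runs when the module defining the class is imported. That is presumably why the package `__init__.py` needs an import line for the new file next to the existing dataset imports, for example `from .mydata import Mydata` (hypothetical file and class names).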

AbrarKhan009 commented 1 month ago

Hello everyone, my training on the custom dataset is done and I have some questions about the output. The final message is: train_net.py: 759: training done: _p50.93_f225.17 _t12.31_m10.69 _a25.00 Top5 Acc: 66.67 MEM: 10.69 f: 225.1698. Can somebody explain this final message to me?

Also, why is the lr always 0.000 during training?

alpargun commented 1 month ago

You can find the result string in tools/train_net.py: https://github.com/facebookresearch/SlowFast/blob/bac7b672f40d44166a84e8c51d1a5ba367ace816/tools/train_net.py#L742

So,

p: number of params
f: number of flops
t: median epoch time
m: GPU memory usage
a: top 1 class accuracy
Top5 Acc: top 5 class accuracy
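
As a small illustration of the mapping above, the final message can be split into named fields with a regular expression. This is a sketch based only on the string quoted in this thread; `parse_result_string` and the field names are hypothetical, and the units are left unlabeled because the thread does not state them.

```python
import re

# One named group per prefix in the result string: _p, _f, _t, _m, _a, Top5 Acc.
RESULT_PATTERN = re.compile(
    r"_p(?P<params>[\d.]+)_f(?P<flops>[\d.]+)\s+"
    r"_t(?P<epoch_time>[\d.]+)_m(?P<gpu_mem>[\d.]+)\s+"
    r"_a(?P<top1_acc>[\d.]+)\s+Top5 Acc:\s*(?P<top5_acc>[\d.]+)"
)


def parse_result_string(line):
    """Return the numeric fields of a 'training done' line as a dict of floats."""
    match = RESULT_PATTERN.search(line)
    if match is None:
        raise ValueError("line does not look like a result string")
    return {key: float(value) for key, value in match.groupdict().items()}


line = (
    "train_net.py: 759: training done: _p50.93_f225.17 _t12.31_m10.69 "
    "_a25.00 Top5 Acc: 66.67 MEM: 10.69 f: 225.1698"
)
for key, value in parse_result_string(line).items():
    print(f"{key}: {value}")
```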

Why the LR is always 0 is interesting, but I cannot comment without seeing your experiment config and the output that shows the LR.

AbrarKhan009 commented 1 month ago

Output screenshot showing lr 0

AbrarKhan009 commented 1 month ago

I used this config:

    TRAIN:
      ENABLE: True
      DATASET: mydata
      BATCH_SIZE: 1 # Increased from 2 for better efficiency and learning
      EVAL_PERIOD: 5
      CHECKPOINT_PERIOD: 5
      AUTO_RESUME: True
      CHECKPOINT_EPOCH_RESET: True
      # CHECKPOINT_FILE_PATH: "/home/mukhan/slowfast/models/datasets/New_Data/output/checkpoints/checkpoint_epoch_00001.pyth"
      CHECKPOINT_FILE_PATH: "/home/mukhan/project/slowfast/MViTv2_B_32x3_k400_f304025456.pyth"
      # CHECKPOINT_TYPE: 'caffe2'
      # CHECKPOINT_CLEAR_NAME_PATTERN: '(?:module.backbone.|module.|backbone.)?'
      CHECKPOINT_IN_INIT: True

    DATA:
      USE_OFFSET_SAMPLING: True
      DECODING_BACKEND: torchvision
      NUM_FRAMES: 32
      SAMPLING_RATE: 1
      TRAIN_JITTER_SCALES: [256, 320]
      TRAIN_CROP_SIZE: 224
      TEST_CROP_SIZE: 224
      INPUT_CHANNEL_NUM: [3]
      PATH_TO_DATA_DIR: "/home/mukhan/project/slowfast/data/Mydata/" # csv files location from which it will get this, path_to_video_1 label_1
      TRAIN_JITTER_SCALES_RELATIVE: [0.08, 1.0]
      TRAIN_JITTER_ASPECT_RELATIVE: [0.75, 1.3333]

    MVIT:
      ZERO_DECAY_POS_CLS: False
      USE_ABS_POS: False
      REL_POS_SPATIAL: True
      REL_POS_TEMPORAL: True
      DEPTH: 24
      NUM_HEADS: 1
      EMBED_DIM: 96
      PATCH_KERNEL: (3, 7, 7)
      PATCH_STRIDE: (2, 4, 4)
      PATCH_PADDING: (1, 3, 3)
      MLP_RATIO: 4.0
      QKV_BIAS: True
      DROPPATH_RATE: 0.3
      NORM: "layernorm"
      MODE: "conv"
      CLS_EMBED_ON: True
      DIM_MUL: [[2, 2.0], [5, 2.0], [21, 2.0]]
      HEAD_MUL: [[2, 2.0], [5, 2.0], [21, 2.0]]
      POOL_KVQ_KERNEL: [3, 3, 3]
      POOL_KV_STRIDE_ADAPTIVE: [1, 8, 8]
      POOL_Q_STRIDE: [
        [0, 1, 1, 1], [1, 1, 1, 1], [2, 1, 2, 2], [3, 1, 1, 1],
        [4, 1, 1, 1], [5, 1, 2, 2], [6, 1, 1, 1], [7, 1, 1, 1],
        [8, 1, 1, 1], [9, 1, 1, 1], [10, 1, 1, 1], [11, 1, 1, 1],
        [12, 1, 1, 1], [13, 1, 1, 1], [14, 1, 1, 1], [15, 1, 1, 1],
        [16, 1, 1, 1], [17, 1, 1, 1], [18, 1, 1, 1], [19, 1, 1, 1],
        [20, 1, 1, 1], [21, 1, 2, 2], [22, 1, 1, 1], [23, 1, 1, 1],
      ]
      DROPOUT_RATE: 0.0
      DIM_MUL_IN_ATT: True
      RESIDUAL_POOLING: True

    AUG:
      NUM_SAMPLE: 2
      ENABLE: True
      COLOR_JITTER: 0.4
      AA_TYPE: rand-m7-n4-mstd0.5-inc1
      INTERPOLATION: bicubic
      RE_PROB: 0.25
      RE_MODE: pixel
      RE_COUNT: 1
      RE_SPLIT: False

    MIXUP:
      ENABLE: True
      ALPHA: 0.8
      CUTMIX_ALPHA: 1.0
      PROB: 1.0
      SWITCH_PROB: 0.5
      LABEL_SMOOTH_VALUE: 0.1

    SOLVER:
      ZERO_WD_1D_PARAM: True
      BASE_LR_SCALE_NUM_SHARDS: True
      CLIP_GRAD_L2NORM: 1.0
      BASE_LR: 0.00001
      COSINE_AFTER_WARMUP: True
      COSINE_END_LR: 1e-6
      WARMUP_START_LR: 1e-6
      WARMUP_EPOCHS: 30.0
      LR_POLICY: cosine
      MAX_EPOCH: 100
      MOMENTUM: 0.9
      WEIGHT_DECAY: 0.05
      OPTIMIZING_METHOD: adamw

    MODEL:
      NUM_CLASSES: 15
      ARCH: mvit
      MODEL_NAME: MViT
      LOSS_FUNC: soft_cross_entropy
      DROPOUT_RATE: 0.5

    TEST:
      ENABLE: False
      DATASET: mydata
      BATCH_SIZE: 64
      NUM_SPATIAL_CROPS: 1
      NUM_ENSEMBLE_VIEWS: 5

    DATA_LOADER:
      NUM_WORKERS: 8
      PIN_MEMORY: True

    NUM_GPUS: 1
    NUM_SHARDS: 1
    RNG_SEED: 0
    OUTPUT_DIR: "/home/mukhan/project/slowfast/output"

    TENSORBOARD:
      ENABLE: True
      LOG_DIR: "/home/mukhan/project/slowfast/output/runs" # Leave empty to use cfg.OUTPUT_DIR/runs-{cfg.TRAIN.DATASET} as path.
      CLASS_NAMES_PATH: "/home/mukhan/project/slowfast/data/Mydata/classnames.json" # Path to json file providing class_name - id mapping.
      CONFUSION_MATRIX:
        ENABLE: True
        SUBSET_PATH: "/home/mukhan/project/slowfast/data/Mydata/classnames.txt" # Path to txt file contains class names separated by newline characters.
        # Only classes in this file will be visualized in the confusion matrix.
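
One possible explanation for the lr 0.000 display, not confirmed in this thread: with BASE_LR: 0.00001 the learning rate never exceeds 1e-5, so any log that rounds to three decimals prints 0.000 even though the rate is not zero. The sketch below evaluates a generic linear-warmup plus cosine schedule with the SOLVER values from the config. It is an illustration of the arithmetic, not SlowFast's own lr policy code, and the three-decimal rounding is an assumption about the logger.

```python
import math

# Values copied from the SOLVER section of the config.
BASE_LR = 1e-5
COSINE_END_LR = 1e-6
WARMUP_START_LR = 1e-6
WARMUP_EPOCHS = 30.0
MAX_EPOCH = 100


def lr_at_epoch(epoch):
    """Generic linear warmup followed by cosine decay (sketch, not SlowFast code)."""
    if epoch < WARMUP_EPOCHS:
        # Linear ramp from WARMUP_START_LR up to BASE_LR.
        return WARMUP_START_LR + (BASE_LR - WARMUP_START_LR) * epoch / WARMUP_EPOCHS
    # Cosine decay from BASE_LR down towards COSINE_END_LR after warmup.
    progress = (epoch - WARMUP_EPOCHS) / (MAX_EPOCH - WARMUP_EPOCHS)
    return COSINE_END_LR + 0.5 * (BASE_LR - COSINE_END_LR) * (1.0 + math.cos(math.pi * progress))


schedule = [lr_at_epoch(epoch) for epoch in range(MAX_EPOCH)]
print("peak lr, scientific notation:", f"{max(schedule):.2e}")
print("distinct values at 3 decimals:", {f"{lr:.3f}" for lr in schedule})
```

If this is the cause, printing the learning rate in scientific notation or checking the lr curve in TensorBoard should show the warmup and decay.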

facebookresearch / SlowFast

Failed to load video after 10 retries #664
