facebookresearch / SlowFast

PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
Apache License 2.0

Error when running demo with pretrained model mvit_b_16_conv #474

Open reddevil1310 opened 3 years ago

reddevil1310 commented 3 years ago

I ran the demo with this command:

python3 tools/run_net.py --cfg configs/Kinetics/DemoMvit_B_16x4_Conv.yaml \
  DATA.PATH_TO_DATA_DIR /mnt/Ubuntu/Dataset/valid_result \
  TEST.CHECKPOINT_FILE_PATH model/K400_MVIT_B_16x4_CONV.pyth \
  TRAIN.ENABLE False NUM_GPUS 1

This is my config file:

TRAIN:
  ENABLE: False
  DATASET: kinetics
  BATCH_SIZE: 16
  EVAL_PERIOD: 10
  CHECKPOINT_PERIOD: 10
  AUTO_RESUME: True
DATA:
  USE_OFFSET_SAMPLING: True
  DECODING_BACKEND: torchvision
  NUM_FRAMES: 16
  SAMPLING_RATE: 4
  TRAIN_JITTER_SCALES: [256, 320]
  TRAIN_CROP_SIZE: 224
  TEST_CROP_SIZE: 224
  INPUT_CHANNEL_NUM: [3]
  PATH_TO_DATA_DIR: path-to-k400-dir
  TRAIN_JITTER_SCALES_RELATIVE: [0.08, 1.0]
  TRAIN_JITTER_ASPECT_RELATIVE: [0.75, 1.3333]
MVIT:
  ZERO_DECAY_POS_CLS: False
  SEP_POS_EMBED: True
  DEPTH: 16
  NUM_HEADS: 1
  EMBED_DIM: 96
  PATCH_KERNEL: (3, 7, 7)
  PATCH_STRIDE: (2, 4, 4)
  PATCH_PADDING: (1, 3, 3)
  MLP_RATIO: 4.0
  QKV_BIAS: True
  DROPPATH_RATE: 0.2
  NORM: "layernorm"
  MODE: "conv"
  CLS_EMBED_ON: True
  DIM_MUL: [[1, 2.0], [3, 2.0], [14, 2.0]]
  HEAD_MUL: [[1, 2.0], [3, 2.0], [14, 2.0]]
  POOL_KVQ_KERNEL: [3, 3, 3]
  POOL_KV_STRIDE_ADAPTIVE: [1, 8, 8]
  POOL_Q_STRIDE: [[1, 1, 2, 2], [3, 1, 2, 2], [14, 1, 2, 2]]
  DROPOUT_RATE: 0.0
AUG:
  NUM_SAMPLE: 2
  ENABLE: True
  COLOR_JITTER: 0.4
  AA_TYPE: rand-m7-n4-mstd0.5-inc1
  INTERPOLATION: bicubic
  RE_PROB: 0.25
  RE_MODE: pixel
  RE_COUNT: 1
  RE_SPLIT: False
MIXUP:
  ENABLE: True
  ALPHA: 0.8
  CUTMIX_ALPHA: 1.0
  PROB: 1.0
  SWITCH_PROB: 0.5
  LABEL_SMOOTH_VALUE: 0.1
BN:
  USE_PRECISE_STATS: False
  NUM_BATCHES_PRECISE: 200
SOLVER:
  ZERO_WD_1D_PARAM: True
  CLIP_GRAD_L2NORM: 1.0
  BASE_LR_SCALE_NUM_SHARDS: True
  BASE_LR: 0.0001
  COSINE_AFTER_WARMUP: True
  COSINE_END_LR: 1e-6
  WARMUP_START_LR: 1e-6
  WARMUP_EPOCHS: 30.0
  LR_POLICY: cosine
  MAX_EPOCH: 200
  MOMENTUM: 0.9
  WEIGHT_DECAY: 0.05
  OPTIMIZING_METHOD: adamw
MODEL:
  NUM_CLASSES: 400
  ARCH: mvit
  MODEL_NAME: MViT
  LOSS_FUNC: soft_cross_entropy
  DROPOUT_RATE: 0.5
TEST:
  ENABLE: False
  DATASET: kinetics
  BATCH_SIZE: 64
  NUM_SPATIAL_CROPS: 1
DATA_LOADER:
  NUM_WORKERS: 8
  PIN_MEMORY: True
DEMO:
  ENABLE: True
  LABEL_FILE_PATH: model/kinetics_classnames.json  # Path to json file providing class_name - id mapping.
  INPUT_VIDEO: /mnt/Ubuntu/Dataset/valid_result/countingmoney_kEtlPacwlHw.avi  # Path to input video file.
  OUTPUT_FILE: /mnt/Ubuntu/Dataset/valid_result/countingmoney_kEtlPacwlHw_result.avi  # Path to output video file to write results to.
  # Leave an empty string if you would like to display results to a window.
  THREAD_ENABLE: False  # Run video reader/writer in the background with multi-threading.
  NUM_VIS_INSTANCES: 1  # Number of CPU(s)/processes use to run video visualizer.
  NUM_CLIPS_SKIP: 4  # Number of clips to skip prediction/visualization
  # (mostly to smoothen/improve display quality with webcam input).
NUM_GPUS: 8
NUM_SHARDS: 1
RNG_SEED: 0
OUTPUT_DIR: .

I get this error:

0it [00:04, ?it/s]
Traceback (most recent call last):
  File "tools/run_net.py", line 46, in <module>
    main()
  File "tools/run_net.py", line 42, in main
    demo(cfg)
  File "/mnt/Ubuntu/PoseEstimate/FacebookReasearch/SlowFast/tools/demo_net.py", line 114, in demo
    for task in tqdm.tqdm(run_demo(cfg, frame_provider)):
  File "/mnt/Ubuntu/PoseEstimate/FacebookReasearch/SlowFast/venv/lib/python3.8/site-packages/tqdm/std.py", line 1185, in __iter__
    for obj in iterable:
  File "/mnt/Ubuntu/PoseEstimate/FacebookReasearch/SlowFast/tools/demo_net.py", line 79, in run_demo
    model.put(task)
  File "/mnt/Ubuntu/PoseEstimate/FacebookReasearch/SlowFast/slowfast/visualization/predictor.py", line 142, in put
    task = self.predictor(task)
  File "/mnt/Ubuntu/PoseEstimate/FacebookReasearch/SlowFast/slowfast/visualization/predictor.py", line 104, in __call__
    preds = self.model(inputs, bboxes)
  File "/mnt/Ubuntu/PoseEstimate/FacebookReasearch/SlowFast/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1053, in _call_impl
    return forward_call(*input, **kwargs)
TypeError: forward() takes 2 positional arguments but 3 were given

Can anyone help me fix this? Thank you very much!

liuxufenfeiya commented 3 years ago

Same error here. I tried adding DETECTION, but it didn't work.

Serhii-Tiurin commented 2 years ago

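You can fix this in the visualization/predictor.py file by changing preds = self.model(inputs, bboxes) to preds = self.model(inputs). The model's forward() accepts only the inputs, so passing bboxes as a second argument raises the TypeError.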
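A minimal sketch of why the call fails and how a guarded call avoids it. This is not the SlowFast code: ClassifierOnlyModel, DetectionModel, and run_model are hypothetical stand-ins written in plain Python, and the only thing taken from the thread is the call shape self.model(inputs, bboxes) against a forward() that accepts a single input argument.

```python
class ClassifierOnlyModel:
    """Toy stand-in for a model whose forward() accepts only the clip inputs."""

    def forward(self, inputs):
        return [0.0 for _ in inputs]

    def __call__(self, *args, **kwargs):
        # Like the _call_impl frame in the traceback, pass every argument on to forward().
        return self.forward(*args, **kwargs)


class DetectionModel(ClassifierOnlyModel):
    """Toy stand-in for a model whose forward() also accepts boxes."""

    def forward(self, inputs, bboxes=None):
        return [0.0 for _ in (bboxes if bboxes is not None else inputs)]


def run_model(model, inputs, bboxes=None):
    """Hypothetical helper: pass bboxes only when there are boxes to pass."""
    if bboxes is None:
        return model(inputs)
    return model(inputs, bboxes)


model = ClassifierOnlyModel()
clips = ["clip0", "clip1"]

try:
    model(clips, None)  # same call shape as self.model(inputs, bboxes)
except TypeError as err:
    print("reproduced:", err)

print(run_model(model, clips))  # works: only the inputs are passed
```

Compared with hard-coding self.model(inputs), a guard like this would keep the two-argument call available for models that do take boxes, assuming bboxes is None whenever detection is disabled.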

aikuniverse commented 2 years ago

> same error i try add DETECTION,but it didn't work.

I have the same problem. Did you solve it?

fengjingchehu commented 7 months ago

How can I get files like kinetics_classnames.json for LABEL_FILE_PATH?
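One possible approach, as a sketch only: the config comment above describes LABEL_FILE_PATH as a JSON file providing a class_name to id mapping, so such a file can be generated from an ordered list of class names. The helper write_label_file, the placeholder class names, and the output file name are all assumptions for illustration. The ids must match the label order the checkpoint was trained with, which this sketch cannot verify.

```python
import json
import os
import tempfile


def write_label_file(class_names, path):
    """Hypothetical helper: write a {class_name: id} JSON mapping to path."""
    mapping = {name: idx for idx, name in enumerate(class_names)}
    with open(path, "w", encoding="utf-8") as f:
        json.dump(mapping, f, indent=2)
    return mapping


# Placeholder names; replace with the dataset's real class list,
# in the same order as the ids used during training.
class_names = ["class_a", "class_b", "class_c"]

out_path = os.path.join(tempfile.mkdtemp(), "classnames.json")
write_label_file(class_names, out_path)

with open(out_path, encoding="utf-8") as f:
    print(json.load(f))
```

The resulting path would then be set as DEMO.LABEL_FILE_PATH in the config.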