OpenGVLab / UniFormerV2

[ICCV2023] UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer
https://arxiv.org/abs/2211.09552
Apache License 2.0
291 stars, 18 forks

Training fails to run after adding a dataset #47

Closed 3250446980 closed 11 months ago

3250446980 commented 1 year ago

My config file, config.yaml:

    TRAIN:
      ENABLE: True
      DATASET: kinetics_sparse
      BATCH_SIZE: 256
      EVAL_PERIOD: 1
      CHECKPOINT_PERIOD: 5
      AUTO_RESUME: True
    DATA:
      USE_OFFSET_SAMPLING: True
      DECODING_BACKEND: decord
      NUM_FRAMES: 8
      SAMPLING_RATE: 16
      TRAIN_JITTER_SCALES: [256, 320]
      TRAIN_CROP_SIZE: 224
      TEST_CROP_SIZE: 224
      INPUT_CHANNEL_NUM: [3]
      PATH_TO_DATA_DIR: path-to-imagenet-dir
      TRAIN_JITTER_SCALES_RELATIVE: [0.08, 1.0]
      TRAIN_JITTER_ASPECT_RELATIVE: [0.75, 1.3333]
    UNIFORMERV2:
      BACKBONE: 'uniformerv2_b16'
      N_LAYERS: 4
      N_DIM: 768
      N_HEAD: 12
      MLP_FACTOR: 4.0
      BACKBONE_DROP_PATH_RATE: 0.
      DROP_PATH_RATE: 0.
      MLP_DROPOUT: [0.5, 0.5, 0.5, 0.5]
      CLS_DROPOUT: 0.5
      RETURN_LIST: [8, 9, 10, 11]
      NO_LMHRA: True
      TEMPORAL_DOWNSAMPLE: False
    AUG:
      NUM_SAMPLE: 1
      ENABLE: True
      COLOR_JITTER: 0.4
      AA_TYPE: rand-m7-n4-mstd0.5-inc1
      INTERPOLATION: bicubic
      RE_PROB: 0.
      RE_MODE: pixel
      RE_COUNT: 1
      RE_SPLIT: False
    BN:
      USE_PRECISE_STATS: False
      NUM_BATCHES_PRECISE: 200
    SOLVER:
      ZERO_WD_1D_PARAM: True
      BASE_LR_SCALE_NUM_SHARDS: True
      BASE_LR: 4e-4
      COSINE_AFTER_WARMUP: True
      COSINE_END_LR: 1e-6
      WARMUP_START_LR: 1e-6
      WARMUP_EPOCHS: 0.
      LR_POLICY: cosine
      MAX_EPOCH: 50
      MOMENTUM: 0.9
      WEIGHT_DECAY: 0.05
      OPTIMIZING_METHOD: adamw
      COSINE_AFTER_WARMUP: True
    MODEL:
      NUM_CLASSES: 400
      ARCH: uniformerv2
      MODEL_NAME: Uniformerv2
      LOSS_FUNC: cross_entropy
      DROPOUT_RATE: 0.5
      USE_CHECKPOINT: False
      CHECKPOINT_NUM: [0]
    TEST:
      ENABLE: True
      DATASET: kinetics_sparse
      BATCH_SIZE: 256
      NUM_SPATIAL_CROPS: 1
      NUM_ENSEMBLE_VIEWS: 1
    DATA_LOADER:
      NUM_WORKERS: 8
      PIN_MEMORY: True
    TENSORBOARD:
      ENABLE: False
    NUM_GPUS: 8
    NUM_SHARDS: 1
    RNG_SEED: 0
    OUTPUT_DIR: .

Dataset: image

run.sh

image

Starting training with the command bash ./exp/k400/k400_b16_f8x224/run.sh produces this error: image

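I have a few questions. Does this dataset require downloading all of the K400 video files? Is the annotation file I downloaded this way acceptable? I want to use your model, so what should I do next? I also plan to apply this model to research on recognizing animal behavior. Do you think that is feasible, and how well would it work? If it is, is there a tutorial for building the dataset?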

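Sorry for the trouble. My coding skills are fairly basic, but I would really like to use your model to build some things of my own. I would appreciate any guidance. Thank you!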

Andy1621 commented 1 year ago

Sorry for the late reply:

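  1. About the bug: this happens mainly because your shell command contains extra blank lines and comments. The error output also shows that several commands in run.sh were not found. Remove the blank lines and comments from run.sh.
  2. Annotations in the downloaded format are fine. For example, to use the model for animal behavior recognition, you can prepare one training list and one validation list, each laid out in the K400 format: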
    # path,label
    1.mp4,0

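    Note that the total number of action classes in the training config must be adjusted to match. For example, if you have 30 action classes, set num_classes to 30: https://github.com/OpenGVLab/UniFormerV2/blob/7c18fd691d42cb2d1fda801883b0a40bb5f43ff5/exp/k400/k400%2Bk710_b16_f8x224/config.yaml#L63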
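For point 1, here is a minimal Python sketch of the cleanup. It assumes run.sh is one long command split across lines with trailing backslashes, so a blank line or a comment line in the middle ends the command early and the remaining lines are run as separate commands. The helper name `strip_blank_and_comment_lines` is hypothetical and not part of the repository.

```python
def strip_blank_and_comment_lines(script_text):
    """Return the script without blank lines and whole-line `#` comments.

    A shebang on the first line is kept. Trailing whitespace is removed from
    every kept line, because a backslash followed by a space no longer
    continues the command onto the next line.
    """
    kept = []
    for index, line in enumerate(script_text.splitlines()):
        stripped = line.strip()
        if not stripped:
            continue  # blank line
        is_shebang = index == 0 and stripped.startswith("#!")
        if stripped.startswith("#") and not is_shebang:
            continue  # whole-line comment
        kept.append(line.rstrip())
    return "\n".join(kept) + "\n"
```

To apply it, read run.sh as text, pass the contents through the function, and write the result back (keeping a backup copy first). Inline comments at the end of a command line are deliberately left untouched, since a `#` can also appear inside a quoted argument.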
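For point 2, this is a rough sketch of generating a `path,label` list for a custom dataset. It assumes the videos are stored as `<video_root>/<class_name>/<video file>`, an illustrative layout rather than something the repository requires. The function name, the extension set, and the output file name are also hypothetical. Check which list file names and separator the data loader expects before relying on it.

```python
import csv
from pathlib import Path

# Assumed set of video extensions; adjust to your own data.
VIDEO_EXTS = {".mp4", ".avi", ".mkv", ".webm"}


def build_annotations(video_root, out_csv):
    """Write one `path,label` row per video under video_root/<class_name>/.

    Labels are integers assigned to class folders in alphabetical order.
    Returns the class-name -> label mapping, so that the number of classes
    in the config can be set to len(mapping).
    """
    root = Path(video_root)
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    label_of = {name: idx for idx, name in enumerate(class_names)}

    rows = []
    for name in class_names:
        for video in sorted((root / name).iterdir()):
            if video.suffix.lower() in VIDEO_EXTS:
                # Paths are written relative to video_root.
                rows.append((video.relative_to(root).as_posix(), label_of[name]))

    with open(out_csv, "w", newline="") as f:
        csv.writer(f, lineterminator="\n").writerows(rows)
    return label_of
```

Run it once on the training videos and once on the validation videos, and confirm that both runs return the same mapping, since labels depend on the set of class folders. With 30 class folders the mapping has 30 entries, which matches setting NUM_CLASSES to 30 under MODEL in config.yaml.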