zhaoyue-zephyrus / AVION

[arXiv:2309.16669] Code release for "Training a Large Video Model on a Single Machine in a Day"
http://arxiv.org/abs/2309.16669
MIT License

Epic Kitchen evaluation: AttributeError: 'Namespace' object has no attribute 'model' #5

Closed: Bill-rao closed this issue 6 months ago

Bill-rao commented 6 months ago

When I tried to run evaluation only, I got the following error:

Traceback (most recent call last):
  File "/media/f/AVION/scripts/main_lavila_finetune_cls.py", line 622, in <module>
    main(args)
  File "/media/f/AVION/scripts/main_lavila_finetune_cls.py", line 145, in main
    print("=> creating model: {}".format(old_args.model))
AttributeError: 'Namespace' object has no attribute 'model'

My run script is as follows:

EXP_PATH=.

export PYTHONPATH=.:third_party/decord/python/

python scripts/main_lavila_finetune_cls.py \
  --root /root/h/DataSet/Kitchen/avion_dataset/video_320p_15sec/ \
  --train-metadata /root/h/DataSet/Kitchen/avion_dataset/epic-kitchens-100-annotations/EPIC_100_train.csv \
  --val-metadata /root/h/DataSet/Kitchen/avion_dataset/epic-kitchens-100-annotations/EPIC_100_validation.csv \
  --video-chunk-length 15 \
  --use-flash-attn \
  --grad-checkpointing \
  --use-fast-conv1 \
  --batch-size 64 \
  --fused-decode-crop \
  --use-multi-epochs-loader \
  --pretrain-model /root/linux/AVION/pretrainmodels/avion_finetune_cls_lavila_vitb_best.pt \
  --output-dir $EXP_PATH 2>&1 | tee $EXP_PATH/log.txt

Based on the error, I printed the checkpoint keys and all the arguments stored in old_args:

import pprint  # needed for pprint.pformat below

old_args = ckpt['args']
print("ckpt\n", ckpt.keys())
print("old args \n", pprint.pformat(vars(old_args)))
print("=> creating model: {}".format(old_args.model))

The output is as follows:

/root/miniconda3/envs/avion/lib/python3.10/site-packages/torchvision/transforms/_functional_video.py:6: UserWarning: The 'torchvision.transforms._functional_video' module is deprecated since 0.12 and will be removed in the future. Please use the 'torchvision.transforms.functional' module instead.
  warnings.warn(
/root/miniconda3/envs/avion/lib/python3.10/site-packages/torchvision/transforms/_transforms_video.py:22: UserWarning: The 'torchvision.transforms._transforms_video' module is deprecated since 0.12 and will be removed in the future. Please use the 'torchvision.transforms' module instead.
  warnings.warn(
Not using distributed mode
ckpt
 dict_keys(['epoch', 'state_dict', 'optimizer', 'scaler', 'best_acc1', 'args'])
old args 
 {'actions':       verb  noun
0        0     0
1        0     1
2        0    10
3        0   100
4        0   101
...    ...   ...
3801     9    93
3802     9    94
3803     9    95
3804     9    98
3805     9    99

[3806 rows x 2 columns],
 'batch_size': 64,
 'betas': (0.9, 0.999),
 'clip_length': 16,
 'clip_stride': 2,
 'cutmix': 1.0,
 'cutmix_minmax': None,
 'dataset': 'ek100_cls',
 'decode_threads': 1,
 'disable_amp': False,
 'dist_backend': 'nccl',
 'dist_url': 'env://',
 'distributed': True,
 'drop_path_rate': 0.1,
 'dropout_rate': 0.5,
 'epochs': 100,
 'eps': 1e-08,
 'eval_freq': 5,
 'evaluate': False,
 'fused_decode_crop': True,
 'gpu': 0,
 'grad_clip_norm': None,
 'local_rank': 0,
 'lr': 0.012,
 'lr_end': 4e-05,
 'lr_start': 4e-06,
 'mapping_act2n': {0: 0,
                   1: 1,
                   2: 10,
                   3: 100,
                   4: 101,
                   5: 102,
                   6: 103,
                   7: 104,
                   8: 105,
                   9: 106,
                   10: 107,
                   11: 108,
                   12: 109,
                   13: 11,
                   14: 110,
                   ......,
                   3797: 9,
                   3798: 9,
                   3799: 9,
                   3800: 9,
                   3801: 9,
                   3802: 9,
                   3803: 9,
                   3804: 9,
                   3805: 9},
 'mixup': 0.8,
 'mixup_mode': 'batch',
 'mixup_prob': 1.0,
 'mixup_switch_prob': 0.5,
 'norm_style': 'openai',
 'num_classes': 3806,
 'num_clips': 1,
 'num_crops': 1,
 'optimizer': 'sgd',
 'output_dir': 'experiments/finetune_cls_lavila_vitb/',
 'patch_dropout': 0.0,
 'pickle_filename': '',
 'pretrain_model': './experiments/pretrain_lavila_vitb/checkpoint_best.pt',
 'print_freq': 10,
 'rank': 0,
 'resume': '',
 'root': '/storage/Datasets/EPIC-KITCHENS-100/EK100_320p_15sec_30fps_libx264/',
 'seed': 0,
 'smoothing': 0.1,
 'start_epoch': 0,
 'train_metadata': 'datasets/EK100/epic-kitchens-100-annotations/EPIC_100_train.csv',
 'update_freq': 1,
 'use_fast_conv1': True,
 'use_flash_attn': True,
 'use_grad_checkpointing': True,
 'use_multi_epochs_loader': True,
 'use_zero': False,
 'val_metadata': 'datasets/EK100/epic-kitchens-100-annotations/EPIC_100_validation.csv',
 'video_chunk_length': 15,
 'warmup_epochs': 2,
 'wd': 4e-05,
 'workers': 8,
 'world_size': 8}
Traceback (most recent call last):
  File "/media/f/AVION/scripts/main_lavila_finetune_cls.py", line 621, in <module>
    main(args)
  File "/media/f/AVION/scripts/main_lavila_finetune_cls.py", line 145, in main
    print("=> creating model: {}".format(old_args.model))
AttributeError: 'Namespace' object has no attribute 'model'

It turns out that the args saved in the checkpoint indeed have no "model" attribute.

Bill-rao commented 6 months ago

In addition, the model is created with the following code:

model = getattr(model_clip, old_args.model)(
    freeze_temperature=True,
    use_grad_checkpointing=args.use_grad_checkpointing,
    context_length=old_args.context_length,
    vocab_size=old_args.vocab_size,
    patch_dropout=args.patch_dropout,
    num_frames=args.clip_length,
    drop_path_rate=args.drop_path_rate,
    use_fast_conv1=args.use_fast_conv1,
    use_flash_attn=args.use_flash_attn,
    use_quick_gelu=True,
    project_embed_dim=old_args.project_embed_dim,
    pretrain_zoo=old_args.pretrain_zoo,
    pretrain_path=old_args.pretrain_path,
)

The other values that this model-creation code reads from old_args (old_args.context_length, old_args.vocab_size, old_args.project_embed_dim, old_args.pretrain_zoo, and old_args.pretrain_path) are not available in the checkpoint either.
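One way to see at a glance which attributes a checkpoint's saved arguments lack is to compare them against the names the model-creation code reads. The sketch below is illustrative only: `missing_attrs` is a hypothetical helper that is not part of AVION, and a hand-built `argparse.Namespace` stands in for `ckpt['args']` so that the example runs without the checkpoint file.

```python
from argparse import Namespace

# Attribute names that the model-creation code above reads from old_args.
REQUIRED = [
    "model",
    "context_length",
    "vocab_size",
    "project_embed_dim",
    "pretrain_zoo",
    "pretrain_path",
]


def missing_attrs(ns, names):
    """Return the entries of `names` that are not attributes of `ns`.

    Hypothetical helper for illustration; it is not part of the AVION code.
    """
    return [name for name in names if not hasattr(ns, name)]


# Stand-in for ckpt['args']; only a few of the fields printed above are copied.
old_args = Namespace(dataset="ek100_cls", num_classes=3806, clip_length=16)

# Lists every required name that the saved arguments do not provide.
print(missing_attrs(old_args, REQUIRED))
```

With the real checkpoint, the same call would take `ckpt['args']` in place of the stand-in Namespace.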

zhaoyue-zephyrus commented 6 months ago

Hi @Bill-rao ,

The script you ran was set up for training. To run evaluation only, please add the following two arguments: --resume $SOME_PATH/avion_finetune_cls_lavila_vitb_best.pt --evaluate. I'll add this to the README.

Best, Yue