TheShadow29 / VidSitu

[CVPR21] Visual Semantic Role Labeling for Video Understanding (https://arxiv.org/abs/2104.00990)
https://vidsitu.org/
MIT License

Assertion Error cfg.mdl.mdl_name == "sf_base" when running main_dist.py #12

Open yrf1 opened 3 years ago

yrf1 commented 3 years ago

I ran python main_dist.py "experiment_name" without specifying the --arg1 and --arg2 flags, but got:

  quant_noise_pq: 0
  quant_noise_pq_block_size: 8
  quant_noise_scalar: 0
  share_all_embeddings: False
  share_decoder_input_output_embed: False
  tie_adaptive_weights: False
uid: experiment_name
val_dl_name: valid
Traceback (most recent call last):
  File "main_dist.py", line 172, in <module>
    fire.Fire(main_dist)
  File "/shared/nas/data/users//data_preproc/VidSitu/vsitu_pyt/lib/python3.7/site-packages/fire/core.py", line 138, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/shared/nas/data/users/data_preproc/VidSitu/vsitu_pyt/lib/python3.7/site-packages/fire/core.py", line 468, in _Fire
    target=component.__name__)
  File "/shared/nas/data/users/data_preproc/VidSitu/vsitu_pyt/lib/python3.7/site-packages/fire/core.py", line 672, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "main_dist.py", line 160, in main_dist
    launch_job(cfg, init_method="tcp://localhost:9997", func=main_fn)
  File "/shared/nas/data/users/data_preproc/VidSitu/utils/trn_dist_utils.py", line 42, in launch_job
    func(cfg=cfg)
  File "main_dist.py", line 96, in main_fn
    learn = learner_init(uid, cfg)
  File "main_dist.py", line 34, in learner_init
    mdl_loss_eval = get_mdl_loss_eval(cfg)
  File "/shared/nas/data/users/data_preproc/VidSitu/vidsitu_code/mdl_selector.py", line 29, in get_mdl_loss_eval
    assert cfg.mdl.mdl_name == "sf_base"
AssertionError

So should we be using sf_base instead of the GPT model in the config?
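For context, a minimal sketch of the kind of guard that raises this error (the actual logic in vidsitu_code/mdl_selector.py may differ; the config objects below are stand-ins):

```python
from types import SimpleNamespace

def get_mdl_loss_eval(cfg):
    # Mirrors the assertion at mdl_selector.py line 29: only "sf_base"
    # passes unless a different model name is selected via flags.
    assert cfg.mdl.mdl_name == "sf_base"
    return "sf_base model/loss/eval bundle"  # placeholder return value

# A config selecting the GPT-2 model trips the assertion:
cfg = SimpleNamespace(mdl=SimpleNamespace(mdl_name="new_gpt2_only"))
try:
    get_mdl_loss_eval(cfg)
except AssertionError:
    print("AssertionError")  # same failure as in the traceback above
```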

Also, it would be great to have some instructions showing how VidSitu can be applied to external data (e.g., the data preprocessing steps and the changes to the inference commands).

TheShadow29 commented 3 years ago

@yrf1 To run the GPT-2 model, you need the following command:

python main_dist.py <exp_name> --train.bs=... --train.bsv=... --task_type=vb_arg --mdl.mdl_name=new_gpt2_only 

In general, you can find the full command inside the logs by searching (Ctrl+F) for cmd.
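For example, something like the following (the log file below is a fabricated stand-in; point grep at your experiment's actual log file):

```shell
# Write a one-line sample log, then recover the launch command from it
# by searching for "cmd", as suggested above.
printf 'cmd: python main_dist.py exp1 --task_type=vb --mdl.mdl_name=sf_base\n' > /tmp/sample_vsitu_log.txt
grep 'cmd' /tmp/sample_vsitu_log.txt
```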

Let me know if you face any other issues.

yrf1 commented 3 years ago

Thanks, I have some follow-up questions:

1) How do the configs for loading a pretrained verb prediction model match the checkpoints released in this repo? The model checkpoints in EXPTs.md are PyTorch checkpoints, but when I try to point the config in configs/vsitu_mdl_cfgs/Kinetics_c2_SLOWFAST_8x8_R50.yaml (which is referenced from configs/vsitu_cfg.yml) to those PyTorch checkpoints, I run into checkpoint loading errors.

VidSitu/SlowFast/slowfast/utils/checkpoint.py", line 270, in load_checkpoint
    checkpoint["model_state"], model_state_dict_3d
KeyError: 'model_state'

I used the slow_fast_nl_r50_8x8 mdl_ep_10.pth.

I have trouble figuring this out from the log file corresponding to slow_fast_nl_r50_8x8 mdl_ep_10.pth, because the "val" part of the log shows an empty checkpoint path, while the "train" part uses the Caffe2 checkpoint, which I believe you already converted to PyTorch before release.
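For what it's worth, the KeyError above can be reproduced in miniature: SlowFast's loader indexes checkpoint["model_state"], so any checkpoint saved under a different top-level key fails at exactly that line. (Both dicts below are illustrative; "model_state_dict" is a made-up key name, not necessarily what this repo's checkpoints use.)

```python
# Hypothetical illustration of the KeyError from checkpoint.py line 270.
slowfast_ckpt = {"model_state": {"conv1.weight": [0.0]}}       # key SlowFast expects
repo_ckpt = {"model_state_dict": {"conv1.weight": [0.0]}}      # hypothetical other key

def load_model_state(ckpt):
    # Mirrors checkpoint["model_state"] in SlowFast's load_checkpoint
    return ckpt["model_state"]

load_model_state(slowfast_ckpt)  # ok
try:
    load_model_state(repo_ckpt)
except KeyError as e:
    print("KeyError:", e)  # KeyError: 'model_state'
```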

TheShadow29 commented 3 years ago

@yrf1

So the config inside configs/vsitu_mdl_cfgs/Kinetics_c2_SLOWFAST_8x8_R50.yaml is for the SlowFast model pre-trained on Kinetics.

But if you want to use one of our checkpoints, you should pass --train.resume='....' and --train.resume_path='/path/to/model'.

Does that answer your question?

yrf1 commented 3 years ago

Hi @TheShadow29, thanks for the follow-up. So I'm still trying to apply pretrained VidSitu to my dataset. I tried running a command like python main_dist.py "experiment_name" --train.resume_path='weights/vbarg_sfastpret_txe_txd_18Feb21.pth' for the semantic role labeling task using pretrained models, but it appears that the code expects a data/vsitu_vid_feats directory (line 569 in vidsitu_code/dat_loader.py).

Is this expected? If so, how should the video features be computed for running VidSitu on external data?

TheShadow29 commented 3 years ago

@yrf1 I see that I forgot to put up the feature extraction code. I will upload it by the end of the day. If you are in a hurry: it initializes the SFBase model, takes the head_out after the permute (https://github.com/TheShadow29/VidSitu/blob/main/vidsitu_code/mdl_sf_base.py#L195), and saves it to an npy file.
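A minimal sketch of that save step, assuming a (num_events, feat_dim) head output; the shape, file name, and helper below are illustrative stand-ins, not the repo's actual feat_extractor code:

```python
import os
import tempfile

import numpy as np

def save_video_feats(head_out: np.ndarray, out_path: str) -> None:
    """Save per-event video features to disk as a .npy file."""
    np.save(out_path, head_out.astype(np.float32))

# Stand-in for a real head_out tensor after the permute:
# 5 events x 2304-dim features (a SlowFast-like dimension, illustrative only).
head_out = np.random.randn(5, 2304)
out_path = os.path.join(tempfile.mkdtemp(), "video_00001.npy")
save_video_feats(head_out, out_path)
print(np.load(out_path).shape)  # (5, 2304)
```

The saved .npy files would then live under a directory like data/vsitu_vid_feats, which the dataloader reads.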

TheShadow29 commented 3 years ago

@yrf1 The feature extraction code is up now: vidsitu_code/feat_extractor.py

Instructions are provided in DATA_PREP.md inside data/

Let me know if you face any issues.

yrf1 commented 3 years ago

@TheShadow29 thank you! I'll try it out right now