facebookresearch / mmf

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
https://mmf.sh/

problems running base train examples #861

Closed gerardoalfredo2 closed 3 years ago

gerardoalfredo2 commented 3 years ago

❓ Questions and Help

When I try to run any of the examples to train the models, I'm getting error messages about the optimizer configuration.

If I run the following CLI command:

mmf_run config=projects/hateful_memes/configs/mmbt/defaults.yaml model=mmbt dataset=hateful_memes training.log_interval=50 training.max_updates=3000 training.batch_size=16 training.evaluation_interval=500

I get the following warnings, followed by an error, after the model is loaded:

2021-04-07T20:03:59 | mmf.trainers.mmf_trainer: Loading model
Some weights of BertForPreTraining were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['cls.predictions.decoder.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
WARNING 2021-04-07T20:04:10 | py.warnings: g:\trainings\cs7643_deeplearning\projects\memes\mmf\mmf\models\base_model.py:128: UserWarning: No losses are defined in model configuration. You are expected to return loss in your return dict from forward.
  "No losses are defined in model configuration. You are expected "

2021-04-07T20:04:12 | mmf.trainers.mmf_trainer: Loading optimizer
Traceback (most recent call last):
  File "C:\Users\Admin\Anaconda3\envs\mmf\Scripts\mmf_run-script.py", line 33, in <module>
    sys.exit(load_entry_point('mmf', 'console_scripts', 'mmf_run')())
  File "\projects\memes\mmf\mmf_cli\run.py", line 133, in run
    main(configuration, predict=predict)
  File "\projects\memes\mmf\mmf_cli\run.py", line 52, in main
    trainer.load()
  File "\projects\memes\mmf\mmf\trainers\mmf_trainer.py", line 43, in load
    super().load()
  File "\projects\memes\mmf\mmf\trainers\base_trainer.py", line 31, in load
    self.load_optimizer()
  File "\projects\memes\mmf\mmf\trainers\mmf_trainer.py", line 104, in load_optimizer
    self.optimizer = build_optimizer(self.model, self.config)
  File "\projects\memes\mmf\mmf\utils\build.py", line 252, in build_optimizer
    "Optimizer attributes must have a 'type' key "
ValueError: Optimizer attributes must have a 'type' key specifying the type of optimizer. (Custom or PyTorch)

I've been trying to figure out how to set these optimizer parameters, but I can't find an example.
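
From the error message, it seems MMF expects an optimizer section with a type key in the training config. My guess at roughly what that section should look like (the type/params layout follows the error message; adam_w and the values below are just illustrative, not the project defaults):

optimizer:
  type: adam_w      # name of a PyTorch or MMF-registered optimizer (my assumption)
  params:
    lr: 5.0e-5      # illustrative learning rate
    eps: 1.0e-8     # illustrative epsilon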

vedanuj commented 3 years ago

Hi, are you on the master branch of MMF? When I tested it, it worked for me without any error.

gerardoalfredo2 commented 3 years ago

No, I'm participating in the Hateful Memes challenge, and I ran into this issue when trying to run any of the models.

apsdehal commented 3 years ago

I believe the problem is happening because @gerardoalfredo2 is on a Windows system, which would require specifying the config as config=projects\hateful_memes\configs\mmbt\defaults.yaml. Since it is currently specified in Unix filesystem format, it is not being picked up.

gerardoalfredo2 commented 3 years ago

I found out that the problem is with the configuration paths in the reference documentation. I changed the command to:

mmf_run config=projects/mmbt/configs/hateful_memes/defaults.yaml run_type=train_val dataset=hateful_memes model=mmbt

and now everything is working fine.

apsdehal commented 3 years ago

@gerardoalfredo2 Great catch. This happens specifically because those config paths are symbolic links and don't get cloned as-is on Windows. You have to pass -c core.symlinks=true to the git command to get those symlinks on Windows. There is a good discussion in https://stackoverflow.com/questions/11662868/what-happens-when-i-clone-a-repository-with-symlinks-on-windows. We have plans to get rid of these symlinks, and we will prioritize that now.
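
For anyone else hitting this on Windows, the full clone command would look something like this (it assumes your Git and Windows setup permit creating symlinks, e.g. Developer Mode is enabled or the shell is elevated):

git clone -c core.symlinks=true https://github.com/facebookresearch/mmf.git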