monologg / JointBERT

Pytorch implementation of JointBERT: "BERT for Joint Intent Classification and Slot Filling"
Apache License 2.0

issue in training #4

Closed 008karan closed 4 years ago

008karan commented 4 years ago

It looks like I'm able to train, but the model is not being saved, so evaluation fails. Training command:

python3 main.py --task atis --model_type albert --model_dir atis_out --do_train --do_eval

model_dir is empty after training, so the model can't be found there when evaluation starts.

03/20/2020 21:35:14 - INFO - trainer -   ***** Running training *****
03/20/2020 21:35:14 - INFO - trainer -     Num examples = 4478
03/20/2020 21:35:14 - INFO - trainer -     Num Epochs = 1
03/20/2020 21:35:14 - INFO - trainer -     Total train batch size = 64
03/20/2020 21:35:14 - INFO - trainer -     Gradient Accumulation steps = 1
03/20/2020 21:35:14 - INFO - trainer -     Total optimization steps = 70
Iteration: 100%|███████████████████████████████████████████████████████████████████████████████████████████████| 70/70 [00:51<00:00,  1.36it/s]
Epoch: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:51<00:00, 51.52s/it]
Traceback (most recent call last):
  File "/home/gamut/anaconda2/envs/xyz/hugging/lib/python3.6/site-packages/transformers/configuration_utils.py", line 221, in get_config_dict
    resume_download=resume_download,
  File "/home/gamut/anaconda2/envs/xyz/hugging/lib/python3.6/site-packages/transformers/file_utils.py", line 245, in cached_path
    raise EnvironmentError("file {} not found".format(url_or_filename))
OSError: file atis_out/config.json not found

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/gamut/Downloads/JointBERT-master/trainer.py", line 228, in load_model
    self.bert_config = self.config_class.from_pretrained(self.args.model_dir)
  File "/home/gamut/anaconda2/envs/xyz/hugging/lib/python3.6/site-packages/transformers/configuration_utils.py", line 176, in from_pretrained
    config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/home/gamut/anaconda2/envs/xyz/hugging/lib/python3.6/site-packages/transformers/configuration_utils.py", line 241, in get_config_dict
    raise EnvironmentError(msg)
OSError: Model name 'atis_out' was not found in model name list. We assumed 'atis_out/config.json' was a path, a model identifier, or url to a configuration file named config.json or a directory containing such a file but couldn't find any such file at this path or url.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 81, in <module>
    main(args)
  File "main.py", line 22, in main
    trainer.load_model()
  File "/home/gamut/Downloads/JointBERT-master/trainer.py", line 236, in load_model
    raise Exception("Some model files might be missing...")
Exception: Some model files might be missing...

monologg commented 4 years ago

Hi!

From the console log, it looks like you've changed the batch size from 16 to 64 and the number of epochs from 10 to 1, so the total number of optimization steps becomes 70. But in main.py, --save_steps is 200, which is larger than 70, so training finished before the model was ever saved. I recommend lowering the --save_steps option so that it does not exceed the total number of steps :)
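To make the arithmetic concrete, here is a minimal sketch of why no checkpoint appears. It assumes the trainer saves only when the global step is an exact multiple of --save_steps; the helper names `total_optimization_steps` and `checkpoint_steps` are hypothetical and are not part of the repository.

```python
import math


def total_optimization_steps(num_examples, batch_size, epochs, grad_accum_steps=1):
    """Number of optimizer updates over the whole training run."""
    batches_per_epoch = math.ceil(num_examples / batch_size)
    return (batches_per_epoch // grad_accum_steps) * epochs


def checkpoint_steps(total_steps, save_steps):
    """Steps at which a checkpoint would be written.

    Assumption: a save happens only when global_step % save_steps == 0.
    """
    return [step for step in range(1, total_steps + 1) if step % save_steps == 0]


# Values taken from the log above: 4478 examples, batch size 64, 1 epoch.
total = total_optimization_steps(num_examples=4478, batch_size=64, epochs=1)
print(total)                         # 70
print(checkpoint_steps(total, 200))  # [] -> nothing is ever saved
print(checkpoint_steps(total, 70))   # [70] -> one save at the final step
```

Under the same assumption, a batch size of 16 with 10 epochs gives 2800 steps, which is why a --save_steps value of 200 causes no trouble with those settings.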

monologg commented 4 years ago

@008karan

I'll close this issue. If you run into any other problems, feel free to open a new issue :)