utterworks / fast-bert

Super easy library for BERT based NLP models
Apache License 2.0

Issue with fine-tuning a pretrained language model using the container on SageMaker #270

Open VamsiKrishnaPenumadu opened 3 years ago

VamsiKrishnaPenumadu commented 3 years ago

Hi, I am using the container_lm code base to fine-tune the language model on my domain.

```python
training_config = {
    "run_text": "toxic comments",
    "finetuned_model": None,
    "do_lower_case": "True",
    "train_file": "train.txt",
    "val_file": "val.txt",
    "grad_accumulation_steps": "1",
    "fp16_opt_level": "O1",
    "fp16": "True",
    "model_type": "roberta",
    "model_name": "roberta-base",
    "logging_steps": "10000",
    "line_by_line": True,
    "mlm": False,
    "random_state": "7",
}
```

```python
hyperparameters = {
    "epochs": 2,
    "lr": 8e-5,
    "max_seq_length": 512,
    "train_batch_size": 16,
    "lr_schedule": "warmup_cosine",
    "warmup_steps": 1000,
    "optimizer_type": "adamw",
}
```

I am facing an issue with the mlm flag.

```
UnexpectedStatusException: Error for Training job bert-ner-2020-10-21-13-34-52-497: Failed. Reason: AlgorithmError: Exception during training: BERT and RoBERTa-like models do not have LM heads but masked LM heads. They must be run using the --mlm flag (masked language modeling).
Traceback (most recent call last):
  File "/opt/ml/code/train", line 280, in train
    "BERT and RoBERTa-like models do not have LM heads but masked LM heads. They must be run using the"
ValueError: BERT and RoBERTa-like models do not have LM heads but masked LM heads. They must be run using the --mlm flag (masked language modeling).
```
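For context, the error message suggests the training script is enforcing a guard like the following before starting training. This is a minimal sketch of that check, not the actual fast-bert container code; `MASKED_LM_TYPES` and `check_mlm_flag` are illustrative names. Note that the `training_config` posted above sets `"mlm": False`, which is exactly the condition such a guard would reject for `model_type="roberta"`:

```python
# Sketch (assumption): a guard like this would raise the ValueError seen in
# the traceback when a masked-LM architecture is run without mlm enabled.
MASKED_LM_TYPES = {"bert", "roberta", "distilbert", "camembert"}

def check_mlm_flag(model_type, mlm):
    """Raise if a masked-LM-only model is run without the mlm flag."""
    if model_type in MASKED_LM_TYPES and not mlm:
        raise ValueError(
            "BERT and RoBERTa-like models do not have LM heads but masked LM "
            "heads. They must be run using the --mlm flag "
            "(masked language modeling)."
        )

check_mlm_flag("roberta", mlm=True)  # passes silently
# check_mlm_flag("roberta", mlm=False)  # would raise ValueError
```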

I know that RoBERTa uses only the masked LM objective. Even though I am providing mlm=True, I am still facing this issue.
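One pitfall worth ruling out: SageMaker serializes hyperparameters to strings before they reach the training container, so a flag set as a boolean on the notebook side can arrive inside the container as the string `"True"` or `"False"`, and a naive `bool()` conversion treats any non-empty string as truthy (and could equally mangle a flag the other way if parsed carelessly). A small sketch of robust parsing; `parse_bool` is a hypothetical helper, not part of fast-bert:

```python
# SageMaker writes hyperparameters to /opt/ml/input/config/ as JSON strings,
# so flags often arrive as "True"/"False" rather than Python booleans.
def parse_bool(value):
    """Hypothetical helper: parse a string or bool flag robustly."""
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() in ("true", "1", "yes")

print(bool("False"))        # True -- the naive conversion is wrong
print(parse_bool("False"))  # False
print(parse_bool("True"))   # True
```

Checking how the container parses the `mlm` value (and passing it the same way the other boolean flags like `fp16` are passed) may explain why a value you set as True behaves as False inside the job.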