utterworks / fast-bert

Super easy library for BERT based NLP models
Apache License 2.0

Using fast-bert to fine-tune pretrained BioBERT model #271

Open nleguillarme opened 3 years ago

nleguillarme commented 3 years ago

Hi.

I'd like to use fast-bert to fine-tune a BioBERT model on a NER corpus.

Here is the code I use to create a learner from a pretrained BioBERT model:

# databunch, metrics, device_cuda and logger are defined earlier in my script
from fast_bert.learner_cls import BertLearner

learner = BertLearner.from_pretrained_model(
    databunch,
    pretrained_path="dmis-lab/biobert-base-cased-v1.1",  # BioBERT checkpoint on the Hugging Face hub
    metrics=metrics,
    device=device_cuda,
    logger=logger,
    output_dir=OUTPUT_DIR,
    finetuned_wgts_path=None,
    warmup_steps=500,
    multi_gpu=True,
    is_fp16=True,
    multi_label=False,
    logging_steps=50,
)
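
(For context, the databunch above follows the usual fast-bert pattern with BertDataBunch; the sketch below is simplified, and the paths, file names, column names and batch size are placeholders rather than my exact preprocessing code.)

from fast_bert.data_cls import BertDataBunch

databunch = BertDataBunch(
    DATA_PATH,                 # directory containing the CSV files (placeholder)
    LABEL_PATH,                # directory containing labels.csv (placeholder)
    tokenizer="dmis-lab/biobert-base-cased-v1.1",
    train_file="train.csv",
    val_file="val.csv",
    label_file="labels.csv",
    text_col="text",
    label_col="label",
    batch_size_per_gpu=16,
    max_seq_length=512,
    multi_gpu=True,
    multi_label=False,
    model_type="bert",
)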

After 10 hours of training on 2 GPUs, the only log output I get is a series of WARNING:root:NaN or Inf found in input tensor. messages. From the TensorBoard tfevents file, I can see that the validation loss is NaN...
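
(To double-check this, I read the logged scalars back from the tfevents file with TensorBoard's EventAccumulator; sketch below, where the log directory and the scalar tag name are guesses and may differ from what fast-bert actually writes.)

from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

ea = EventAccumulator(str(OUTPUT_DIR / "tensorboard"))  # assumed location of the tfevents files
ea.Reload()
print(ea.Tags()["scalars"])             # list the scalar tags that were actually logged
for event in ea.Scalars("eval_loss"):   # hypothetical tag name; replace with one from the list above
    print(event.step, event.value)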

Before trying to find out what's wrong, could you please confirm that it is actually conceptually feasible to fine-tune a BioBERT model using fast-bert?
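
(For what it's worth, the checkpoint is published on the Hugging Face hub as a standard BERT model, so it should load with the generic transformers Auto classes, which is why I assume fast-bert can in principle handle it. Quick sanity check, outside fast-bert:)

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("dmis-lab/biobert-base-cased-v1.1")
model = AutoModel.from_pretrained("dmis-lab/biobert-base-cased-v1.1")
print(model.config.model_type)  # expected: "bert"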