google-research / bert

TensorFlow code and pre-trained models for BERT
https://arxiv.org/abs/1810.04805
Apache License 2.0

Always blank answer after v2.0 fine tuning training #436

Open ghost opened 5 years ago

ghost commented 5 years ago

I trained BERT for SQuAD 2.0. The checkpoint model.ckpt-10859 was generated, and I used it as the initial checkpoint for predictions.
However, the answer now comes back blank for every question. For the same questions, the SQuAD 1.1 fine-tuned version gives answers. What could be going wrong?
Below is a sample of the output:

head nbest_predictions.json
{
    "56be4db0acb8001400a502ec007": [
        {
            "text": "",
            "probability": 0.9999999929023586,
            "start_logit": 6.828877925872803,
            "end_logit": 6.641528606414795
        },

And below are the commands I ran for fine-tuning:

export BERT_BASE_DIR=gs://mybucket1/squad_base
export SQUAD_DIR=/content/
export OUT_DIR=gs://mybucket1/squad_base/out2
python run_squad.py \
  --vocab_file=$BERT_BASE_DIR/vocab.txt \
  --bert_config_file=$BERT_BASE_DIR/bert_config.json \
  --init_checkpoint=$BERT_BASE_DIR/bert_model.ckpt \
  --do_train=True \
  --train_file=$SQUAD_DIR/train-v2.0.json \
  --do_predict=True \
  --predict_file=$SQUAD_DIR/dev-v2.0.json \
  --train_batch_size=24 \
  --learning_rate=3e-5 \
  --num_train_epochs=2.0 \
  --max_seq_length=384 \
  --doc_stride=128 \
  --output_dir=$OUT_DIR \
  --use_tpu=True \
  --version_2_with_negative=True \
  --tpu_name=grpc://10.87.231.2:8470
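
For reference, when --version_2_with_negative=True the script also writes a null_odds.json file next to predictions.json in the output directory, holding the (null minus best non-null) score difference per question. Below is a minimal inspection sketch, assuming that file layout; the local path and threshold value are only illustrative (copy the file out of the GCS output dir first). It checks whether the no-answer score is beating the best span for every single question:

# Hypothetical check: how many dev questions end up blank for a given
# null-score threshold, read from the null_odds.json that run_squad.py
# writes when --version_2_with_negative=True.
import json

NULL_ODDS_FILE = "out2/null_odds.json"  # assumed local copy of the file in --output_dir
THRESHOLD = 0.0  # default of --null_score_diff_threshold

with open(NULL_ODDS_FILE) as f:
    null_odds = json.load(f)  # qas_id -> (null score - best non-null score)

blank = sum(1 for diff in null_odds.values() if diff > THRESHOLD)
print("predicted blank: %d / %d questions" % (blank, len(null_odds)))
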
mittalpatel commented 5 years ago

Hello Sandeep, version_2_with_negative implies that some of the examples do not have answers, so the model treats blank as a valid answer for some questions.

Try removing the "--version_2_with_negative=True" parameter or setting it to False. This should work. We had a similar issue and setting it to False fixed it.
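
For context, the blank-vs-span choice in run_squad.py comes down to comparing the null score (both pointers on the [CLS] token) against the best non-null span, offset by the --null_score_diff_threshold flag (default 0.0). A simplified sketch of that rule, with illustrative variable names rather than the exact code:

# Simplified sketch of the null-vs-span decision used for SQuAD 2.0 prediction.
def pick_answer(score_null, best_span_text, best_span_start_logit,
                best_span_end_logit, null_score_diff_threshold=0.0):
    # score_null is start_logit[0] + end_logit[0], i.e. both pointers on [CLS].
    score_diff = score_null - (best_span_start_logit + best_span_end_logit)
    if score_diff > null_score_diff_threshold:
        return ""            # the model is more confident there is no answer
    return best_span_text    # otherwise return the best non-null span

If the no-answer score is only narrowly winning everywhere, the BERT README suggests tuning --null_score_diff_threshold on the dev set with evaluate-v2.0.py (typical values are negative, if I read the README correctly) rather than turning version_2_with_negative off, since SQuAD 2.0 training data does contain unanswerable questions.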

ghost commented 5 years ago

Hi, thanks for the reply. However, the answers are there in the data; that is why the SQuAD 1.1 fine-tuned version gave answers.

It is expected to give a blank answer when no answer is found in the document, but it gave blank for everything.

yasminabelhadj commented 3 years ago

I'm having the same issue. I replaced the start and end token indexes with -1 when there is no answer in the context. I understand that the embedding layer won't accept that, but I can't find a way to make it work.
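
In case it helps: the reference run_squad.py does not use -1 for unanswerable examples; it points both the start and end positions at the [CLS] token (index 0), so the labels stay valid token indices and nothing in the model has to change. A minimal sketch of that convention (function and argument names are made up for illustration):

# Sketch of the SQuAD 2.0 label convention used by run_squad.py:
# unanswerable questions get start == end == 0, i.e. the [CLS] token.
def label_positions(is_impossible, answer_start_token, answer_end_token):
    if is_impossible:
        return 0, 0  # both pointers on [CLS], no -1 needed
    return answer_start_token, answer_end_token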