salesforce / factCC

Resources for the "Evaluating the Factual Consistency of Abstractive Text Summarization" paper
https://arxiv.org/abs/1910.12840
BSD 3-Clause "New" or "Revised" License

Model performance cannot be reproduced. #3

Closed Ri0S closed 3 years ago

muggin commented 4 years ago

Hello, thank you for raising this issue! Could you please provide more details?

muggin commented 3 years ago

Thread is idle. Closing.

leoribeiro commented 3 years ago

Hi @muggin,

Thank you for releasing the code.

I'm trying to train the model and reproduce the results. As far as I understood from the paper, you used 8 GPUs with a batch size of 12 per GPU, a learning rate of 2e-5, and 10 training epochs. Is that correct?

If so, should the command below train a model that reproduces the results?

python -m torch.distributed.launch \
    --nproc_per_node 8 $CODE_PATH/run.py \
    --task_name $TASK_NAME \
    --do_train \
    --do_eval \
    --do_lower_case \
    --train_from_scratch \
    --data_dir $DATA_PATH \
    --model_type bert \
    --model_name_or_path $MODEL_NAME \
    --max_seq_length 512 \
    --per_gpu_train_batch_size 12 \
    --learning_rate 2e-5 \
    --num_train_epochs 10.0 \
    --evaluate_during_training \
    --eval_all_checkpoints \
    --overwrite_cache \
    --output_dir $OUTPUT_PATH/$NAME_EXECUTION/

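For reference, the effective batch size implied by these flags (assuming no gradient accumulation is used) is the per-GPU batch size times the number of processes:

```python
# Effective batch size across all GPUs, assuming no gradient accumulation.
n_gpus = 8                     # --nproc_per_node 8
per_gpu_train_batch_size = 12  # --per_gpu_train_batch_size 12

effective_batch_size = n_gpus * per_gpu_train_batch_size
print(effective_batch_size)  # -> 96
```

If the paper's results were obtained with a different number of GPUs, the learning rate and per-GPU batch size may need rescaling to match this effective batch size.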
XuelinLuu commented 2 years ago

If I want to evaluate the summaries generated by my model on the CNN/DailyMail dataset, can I directly use the factCC model from the README.md? Should I set the label of my summaries to 'CORRECT'?
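A minimal sketch of what that evaluation setup might look like, assuming the checkpoint expects JSONL input with `id`, `text` (source article), `claim` (summary sentence), and `label` fields (the exact field names are an assumption, not confirmed by this thread). Since generated summaries have no gold factuality labels, each record gets a placeholder `"CORRECT"` label; the model's predictions, not these placeholders, are the quantity of interest:

```python
import json

def write_factcc_input(pairs, path):
    """Write (article, summary) pairs as JSONL in the assumed factCC format."""
    with open(path, "w") as f:
        for idx, (article, summary) in enumerate(pairs):
            record = {
                "id": idx,
                "text": article,           # source document
                "claim": summary,          # generated summary to check
                "label": "CORRECT",        # placeholder; true labels are unknown
            }
            f.write(json.dumps(record) + "\n")

# Example usage with dummy data:
write_factcc_input(
    [("Some CNN/DailyMail article text.", "A generated summary.")],
    "data-dev.jsonl",
)
```

The resulting file would then be pointed to via `--data_dir` when running the released checkpoint in evaluation-only mode.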