salesforce / factCC

Resources for the "Evaluating the Factual Consistency of Abstractive Text Summarization" paper
https://arxiv.org/abs/1910.12840
BSD 3-Clause "New" or "Revised" License

Model performance cannot be reproduced. #3

Closed Ri0S closed 3 years ago

muggin commented 4 years ago

Hello, thank you for raising this issue! Could you please provide more details?

muggin commented 3 years ago

Thread is idle. Closing.

leoribeiro commented 3 years ago

Hi @muggin,

Thank you for releasing the code.

I'm trying to train the model and reproduce the results. As far as I understood from the paper, you used 8 GPUs with a batch size of 12 per GPU, a learning rate of 2e-5, and 10 training epochs. Is that correct?

If so, should the command below train a model that reproduces the results?

python -m torch.distributed.launch \
    --nproc_per_node 8 $CODE_PATH/run.py \
    --task_name $TASK_NAME \
    --do_train \
    --do_eval \
    --do_lower_case \
    --train_from_scratch \
    --data_dir $DATA_PATH \
    --model_type bert \
    --model_name_or_path $MODEL_NAME \
    --max_seq_length 512 \
    --per_gpu_train_batch_size 12 \
    --learning_rate 2e-5 \
    --num_train_epochs 10.0 \
    --evaluate_during_training \
    --eval_all_checkpoints \
    --overwrite_cache \
    --output_dir $OUTPUT_PATH/$NAME_EXECUTION/

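For reference, the effective batch size implied by these flags (assuming no gradient accumulation is used) is the per-GPU batch size times the number of processes:

```python
# Effective batch size across all GPUs, assuming no gradient accumulation.
n_gpus = 8                     # --nproc_per_node 8
per_gpu_train_batch_size = 12  # --per_gpu_train_batch_size 12

effective_batch_size = n_gpus * per_gpu_train_batch_size
print(effective_batch_size)  # -> 96
```

If the paper's results were obtained with a different number of GPUs, the learning rate and per-GPU batch size may need rescaling to match this effective batch size.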
XuelinLuu commented 2 years ago

If I want to evaluate the summaries generated by my model on the CNN/DailyMail dataset, can I directly use the factCC model from the README.md? Should I set the label of my summaries to 'CORRECT'?
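A minimal sketch of what that evaluation setup might look like, assuming the checkpoint expects JSONL input with `id`, `text` (source article), `claim` (summary sentence), and `label` fields (the exact field names are an assumption, not confirmed by this thread). Since generated summaries have no gold factuality labels, each record gets a placeholder `"CORRECT"` label; the model's predictions, not these placeholders, are the quantity of interest:

```python
import json

def write_factcc_input(pairs, path):
    """Write (article, summary) pairs as JSONL in the assumed factCC format."""
    with open(path, "w") as f:
        for idx, (article, summary) in enumerate(pairs):
            record = {
                "id": idx,
                "text": article,           # source document
                "claim": summary,          # generated summary to check
                "label": "CORRECT",        # placeholder; true labels are unknown
            }
            f.write(json.dumps(record) + "\n")

# Example usage with dummy data:
write_factcc_input(
    [("Some CNN/DailyMail article text.", "A generated summary.")],
    "data-dev.jsonl",
)
```

The resulting file would then be pointed to via `--data_dir` when running the released checkpoint in evaluation-only mode.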