Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU)
Performance for summarization task on BART is low after latest Transformers 4.40 upgrade #1144
Open
astachowiczhabana opened 3 months ago
System Info
Information
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
Reproduction
```bash
cd /root/optimum-habana/examples/summarization
pip install -r requirements.txt
PT_HPU_ENABLE_REFINE_DYNAMIC_SHAPES=1 python run_summarization.py \
    --model_name_or_path facebook/bart-large-cnn \
    --do_predict \
    --predict_with_generate \
    --dataset_name cnn_dailymail \
    --dataset_config "3.0.0" \
    --output_dir ./tst-summarization \
    --overwrite_output_dir \
    --per_device_eval_batch_size 2 \
    --use_habana \
    --use_lazy_mode \
    --use_hpu_graphs_for_inference \
    --gaudi_config_name Habana/t5 \
    --ignore_pad_token_for_loss False \
    --pad_to_max_length \
    --num_beams 1 \
    --generation_num_beams 1 \
    --bf16 \
    --ignore_eos False
```
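Not part of the original report, but one hedged way to narrow the regression down would be to time a bare `model.generate()` loop built only on stock 🤗 Transformers APIs, once under the pre-upgrade transformers pin and once under 4.40.x. If that loop is equally fast in both cases, the slowdown more likely sits in the Gaudi-specific generation/Trainer path that was rebased onto 4.40. The device, sample count, and generation length below are arbitrary assumptions, not values taken from the issue:

```python
# Hedged sketch (not from the issue): time bare model.generate() for
# facebook/bart-large-cnn on a few cnn_dailymail test articles.
# Run once with the pre-upgrade transformers pin and once with 4.40.x
# and compare the printed it/s.
import time

import torch
from datasets import load_dataset
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

device = "cpu"  # on Gaudi this would be "hpu" (after importing habana_frameworks.torch)
model_name = "facebook/bart-large-cnn"
batch_size = 2  # matches --per_device_eval_batch_size 2 from the reported command

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name).to(device).eval()

# Small slice of the same dataset/config used in the reported command.
articles = load_dataset("cnn_dailymail", "3.0.0", split="test[:16]")["article"]

steps = 0
start = time.perf_counter()
for i in range(0, len(articles), batch_size):
    batch = tokenizer(
        articles[i : i + batch_size],
        truncation=True,
        padding="max_length",  # mirrors --pad_to_max_length
        max_length=1024,
        return_tensors="pt",
    ).to(device)
    with torch.no_grad():
        # Greedy decoding, mirroring --num_beams 1; max_new_tokens is an arbitrary choice.
        model.generate(**batch, num_beams=1, max_new_tokens=128)
    steps += 1

elapsed = time.perf_counter() - start
print(f"{steps / elapsed:.2f} it/s over {steps} batches")
```

Running this under each transformers pin (e.g. via `pip install "transformers==<version>"`) should show whether the it/s gap already exists at the plain-generate level, keeping in mind that downgrading may conflict with optimum-habana's own transformers requirement.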
Expected behavior
The quickest way to check that something is wrong is to look at the evaluation throughput.
Before the Transformers 4.40 upgrade the speed was ~3.9 it/s; after the Transformers 4.40 upgrade it is ~1.7 it/s.
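For scale (a ratio computed from the numbers above, not stated separately in the report): 3.9 / 1.7 ≈ 2.3, i.e. evaluation throughput dropped to well under half of its previous value.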