huggingface / optimum-habana

Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU)

Apache License 2.0

What does this PR do?

This PR logs the graph compilation time in the evaluation/prediction loop for all inference test cases. To get the compile time, pass the --throughput_warmup_steps flag (same logic as in text_generation/run_generation.py). This also raises the reported throughput numbers, because the warmup time is excluded from the throughput calculation.
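The effect on throughput can be illustrated with a minimal sketch. This is not the code from this PR: the function `timed_eval_loop`, its parameters, and the returned metric keys are hypothetical names chosen for illustration, and the loop assumes that graph compilation happens within the first `warmup_steps` steps.

```python
import time
from typing import Callable, Dict, Iterable


def timed_eval_loop(
    batches: Iterable[int],
    step_fn: Callable[[int], None],
    warmup_steps: int,
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, float]:
    """Run step_fn over batches and report warmup time apart from throughput.

    Each item of `batches` is the number of samples in that batch
    (a hypothetical stand-in for a real dataloader).
    """
    start = clock()
    warmup_end = start
    samples_after_warmup = 0

    for step, batch_size in enumerate(batches):
        step_fn(batch_size)
        if step < warmup_steps:
            # Assumption: these steps include graph compilation, so their
            # time is recorded separately and kept out of the throughput.
            warmup_end = clock()
        else:
            samples_after_warmup += batch_size

    end = clock()
    measured = end - warmup_end
    return {
        "warmup_duration": warmup_end - start,
        "samples_per_second": samples_after_warmup / measured if measured > 0 else 0.0,
    }
```

Because the slow first steps are left out of both the sample count and the elapsed time, the reported samples per second reflects steady-state speed rather than an average that is dragged down by compilation.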

examples/language-modeling/run_mlm.py

***** eval metrics *****
..
eval_graph_compliation_duration =      0.949
..

examples/summarization/run_summarization.py

***** predict metrics *****
predict_gen_len                    =     5.4174
predict_graph_compliation_duration =     9.3589
...

Add warmup time and compile time log for the eval/prediction. #1489
