Closed palash04 closed 3 weeks ago
I see there is a warmup in hf_generate.py:

    if args.warmup and (not done_warmup):
        print('Warming up...')
        _ = _generate(encoded_inp)
        done_warmup = True

What is the purpose of this?
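If you are doing performance benchmarking, you generally don't want to measure the first calls, because they include one-time setup costs (for example device initialization and memory allocation). Instead, run a few warmup calls, discard their results, and then measure the steady state.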