snova-amitk closed 1 month ago
Hey @snova-amitk, the calculation is not working now. Could you test with this script and check whether the result makes sense, please?
import numpy as np
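# TTFT wait: (requests / index value, unrounded) * average server TTFT, summed over rows
total_wait_time_ttft = (df_summary['Total number of requests']/df_summary.index*df_summary['Avg. server TTFT (s)']).sum()
# Number of executions per row: requests / index value, rounded up
df_summary['num_executed'] = np.ceil(df_summary['Total number of requests'] / df_summary.index)
# Average output tokens per request
df_summary['output_tokens'] = df_summary['Total output tokens']/df_summary['Total number of requests']
# Generation time: executions * tokens per request / per-request throughput, summed over rows
total_generation_time = (df_summary['num_executed']*df_summary['output_tokens']/df_summary['Avg. server tokens per sec per request']).sum()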
print(f'Total wait time due to ttft (mins) = {total_wait_time_ttft/60:,.4f}')
print(f'Total generation time (mins) = {total_generation_time/60:,.4f}')
print(f'Total time (mins) = {(total_wait_time_ttft + total_generation_time)/60:,.4f}')
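The script above assumes `df_summary` already exists. To try the arithmetic without the benchmark output, here is a minimal sketch that wraps the same formulas in a function and runs them on a made-up table. The function name `estimate_total_time`, the sample numbers, and the reading of the index as a concurrency level are all assumptions for illustration; only the column names come from the script.

```python
import numpy as np
import pandas as pd


def estimate_total_time(df_summary: pd.DataFrame) -> tuple[float, float]:
    """Return (total TTFT wait, total generation time) in seconds.

    Hypothetical helper: same formulas as the script above, but it does
    not add columns to the input frame.
    """
    requests = df_summary['Total number of requests']
    # Assumption: the index holds the number of concurrent requests per row.
    index_values = df_summary.index.to_numpy()

    # TTFT term uses the unrounded ratio, exactly as in the script.
    total_wait_time_ttft = (
        requests / index_values * df_summary['Avg. server TTFT (s)']
    ).sum()

    # Generation term rounds the ratio up to whole executions.
    num_executed = np.ceil(requests / index_values)
    output_tokens = df_summary['Total output tokens'] / requests
    total_generation_time = (
        num_executed * output_tokens
        / df_summary['Avg. server tokens per sec per request']
    ).sum()
    return float(total_wait_time_ttft), float(total_generation_time)


# Made-up sample data, not real benchmark results.
sample = pd.DataFrame(
    {
        'Total number of requests': [10, 10, 10],
        'Avg. server TTFT (s)': [0.5, 0.6, 0.8],
        'Total output tokens': [1000, 1000, 1000],
        'Avg. server tokens per sec per request': [100.0, 80.0, 50.0],
    },
    index=[1, 2, 4],
)

ttft_s, generation_s = estimate_total_time(sample)
print(f'Total wait time due to ttft (mins) = {ttft_s/60:,.4f}')
print(f'Total generation time (mins) = {generation_s/60:,.4f}')
print(f'Total time (mins) = {(ttft_s + generation_s)/60:,.4f}')
```

Note that the TTFT term divides without rounding while the generation term uses `np.ceil`; the sketch mirrors the script as posted rather than picking one convention.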
@snova-rodrigom Yes, this calculation is right.
@snova-rodrigom: I did the merge. Could you please check that this works? Thanks!