Do you plan to include system metrics such as generation latency and throughput (tokens/sec) in the evaluation? I think this would be a good addition. Having these system-level measurements would help development and research on the efficiency vs. performance trade-off.
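To make the request concrete, here is a rough sketch of the kind of measurement I mean. It is not tied to this project's code: `measure_generation` and `dummy_generate` are hypothetical names, and `generate` stands in for whatever callable returns the generated token ids for a prompt. It reports mean wall-clock latency per call and throughput as total generated tokens divided by total elapsed time.

```python
import time
from typing import Callable, Dict, List


def measure_generation(
    generate: Callable[[str], List[int]], prompt: str, runs: int = 3
) -> Dict[str, float]:
    """Time `generate` over several runs; report mean latency and tokens/sec."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    latencies = []
    total_tokens = 0
    for _ in range(runs):
        start = time.perf_counter()  # monotonic, high-resolution clock
        tokens = generate(prompt)
        latencies.append(time.perf_counter() - start)
        total_tokens += len(tokens)
    total_time = sum(latencies)
    return {
        "mean_latency_s": total_time / runs,
        # Throughput = generated tokens / wall-clock seconds spent generating.
        "throughput_tok_per_s": (
            total_tokens / total_time if total_time > 0 else float("inf")
        ),
        "total_tokens": float(total_tokens),
    }


def dummy_generate(prompt: str) -> List[int]:
    # Hypothetical stand-in for a real model call: sleeps, then returns 16 token ids.
    time.sleep(0.01)
    return list(range(16))


if __name__ == "__main__":
    print(measure_generation(dummy_generate, "hello", runs=2))
```

A real harness would probably also want warm-up runs, a fixed batch size, and separate time-to-first-token numbers, but even this simple version would let results be compared on efficiency as well as quality.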