Open daemon opened 6 years ago
Need to have the following:
Subtracting idle power draw:
MoS with dynamic evaluation, after subtracting idle power draw:
Timing 53: 2.6775918006896973, total time: 145.7034273147583
Joules: 388.7 Peak: 5.3
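The idle subtraction named in the headings above can be sketched as follows. This is a minimal sketch, assuming idle draw is a constant wattage measured separately while the device is idle. The names `net_energy_joules` and `idle_watts` are hypothetical, and these notes give no idle figure, so none is filled in here.

```python
def net_energy_joules(total_joules: float, total_seconds: float, idle_watts: float) -> float:
    """Energy attributable to the workload once constant idle draw is removed.

    Assumption: idle power is constant over the run, so the idle share of the
    measured energy is simply idle_watts * total_seconds.
    """
    if total_seconds < 0 or idle_watts < 0:
        raise ValueError("total_seconds and idle_watts must be non-negative")
    return total_joules - idle_watts * total_seconds


def energy_per_query_mj(net_joules: float, num_queries: int) -> float:
    """Convert a run's net energy into millijoules per query (mJ/q)."""
    if num_queries <= 0:
        raise ValueError("num_queries must be positive")
    return net_joules * 1000.0 / num_queries
```

Both functions take raw meter readings as input. Whether a given logged figure already has idle draw removed has to be known before applying the subtraction, otherwise it would be subtracted twice.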
MoS pruned results:
MoS final pruned results:
MoS pruned power results:
~40% reduction in time, ~33% reduction in power
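Percentages like these are presumably relative reductions against a baseline. A minimal sketch of that arithmetic, assuming the usual definition (baseline minus new, divided by baseline); `relative_reduction` is a hypothetical helper, and the notes do not say which measurements the two figures were computed from:

```python
def relative_reduction(baseline: float, new: float) -> float:
    """Fractional reduction of `new` relative to `baseline` (0.4 means 40% lower)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - new) / baseline


def as_percent(fraction: float) -> str:
    """Format a fraction such as 0.4 as a percentage string such as '40.0%'."""
    return f"{fraction * 100:.1f}%"
```

Applied once to a pair of ms/q values and once to the matching pair of mJ/q values, this gives the time and energy reductions respectively.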
PPL * J/q, i.e. perplexity times joules per query (lower is better)
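A minimal sketch of this combined metric, assuming energy is recorded in mJ/q as in the result lines and converted to J/q before multiplying. The function names are hypothetical, and perplexity has to be supplied by the caller because no perplexity values appear in these notes.

```python
def ppl_energy_product(perplexity: float, millijoules_per_query: float) -> float:
    """Return PPL * J/q, converting mJ/q to J/q first. Lower is better."""
    if perplexity <= 0 or millijoules_per_query < 0:
        raise ValueError("perplexity must be positive and energy non-negative")
    return perplexity * (millijoules_per_query / 1000.0)


def rank_by_ppl_energy(configs: dict) -> list:
    """Sort configuration names by PPL * J/q, best (lowest) first.

    `configs` maps a name to a (perplexity, mJ/q) pair.
    """
    return sorted(configs, key=lambda name: ppl_energy_product(*configs[name]))
```

Multiplying the two quantities rewards a model that gives up a little perplexity for a large energy saving, and penalises one that saves energy only by becoming much less accurate.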
PTB results
294.67 mJ/q, 223 ms/q
295.81 mJ/q, 224 ms/q
252.1 mJ/q, 188 ms/q
207.2 mJ/q, 145 ms/q
176.63 mJ/q, 125.5 ms/q
116.33 mJ/q, 83.35 ms/q
WikiText-2 results
389.69 mJ/q, 301.41 ms/q
347.07 mJ/q, 266.18 ms/q
287.02 mJ/q, 213.18 ms/q
242.18 mJ/q, 184.93 ms/q
166.33 mJ/q, 120.99 ms/q
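Because each result line pairs energy per query with latency per query, dividing the two gives the implied mean power draw during a query (mJ divided by ms is W). The sketch below does that for the listed rows; `implied_watts` is a hypothetical helper, and the rows are copied as-is without assuming which model configuration each belongs to.

```python
def implied_watts(millijoules_per_query: float, milliseconds_per_query: float) -> float:
    """Mean power during a query: mJ/q divided by ms/q equals watts."""
    if milliseconds_per_query <= 0:
        raise ValueError("latency must be positive")
    return millijoules_per_query / milliseconds_per_query


# (mJ/q, ms/q) pairs copied from the PTB and WikiText-2 result lines.
PTB_ROWS = [(294.67, 223), (295.81, 224), (252.1, 188),
            (207.2, 145), (176.63, 125.5), (116.33, 83.35)]
WIKITEXT2_ROWS = [(389.69, 301.41), (347.07, 266.18), (287.02, 213.18),
                  (242.18, 184.93), (166.33, 120.99)]

if __name__ == "__main__":
    for name, rows in (("PTB", PTB_ROWS), ("WikiText-2", WIKITEXT2_ROWS)):
        for mj, ms in rows:
            print(f"{name}: {mj} mJ/q over {ms} ms/q -> {implied_watts(mj, ms):.3f} W")
```

Comparing these per-row wattages shows whether a saving comes mainly from shorter runtime or from lower draw while running.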
Across 300 queries, without subtracting idle power:
MoS: reproduced SOTA perplexity
Latency UI test: http://rocketeer.net/test