davmacario closed this 6 months ago
This should reduce the idle time of the devices, since we would be adding more computation. It would also allow us to compare the 2-node and 3-node cases over the same number of generated tokens.