Open shahizat opened 5 days ago
Dear all,
Could you please suggest which parameter(s) should be increased or changed when running multi-node training of OLMo, so that we can compare it against single-node training and estimate the network overhead? Would throughput (tokens per second) be the right metric to compare?
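To make the comparison concrete, here is a rough sketch of the calculation I have in mind, assuming throughput in tokens per second is measured for both runs with the same per-GPU batch size. The function names and the numbers are hypothetical and do not come from OLMo. The resulting "overhead" would also include any non-network scaling loss, so it is only an upper-bound estimate.

```python
def scaling_efficiency(single_node_tps: float, multi_node_tps: float, num_nodes: int) -> float:
    """Return the fraction of ideal linear scaling achieved by the multi-node run."""
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    if single_node_tps <= 0:
        raise ValueError("single_node_tps must be positive")
    # Ideal case: every added node contributes the full single-node throughput.
    ideal_tps = single_node_tps * num_nodes
    return multi_node_tps / ideal_tps


def overhead_fraction(single_node_tps: float, multi_node_tps: float, num_nodes: int) -> float:
    """Return the share of ideal throughput lost when scaling out (network and other costs)."""
    return 1.0 - scaling_efficiency(single_node_tps, multi_node_tps, num_nodes)


if __name__ == "__main__":
    # Made-up measurements, for illustration only.
    single_tps = 10_000.0   # tokens/s on 1 node
    multi_tps = 36_000.0    # tokens/s on 4 nodes
    nodes = 4
    print(f"efficiency: {scaling_efficiency(single_tps, multi_tps, nodes):.1%}")
    print(f"overhead:   {overhead_fraction(single_tps, multi_tps, nodes):.1%}")
```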
Thank you in advance