Kipok closed 3 months ago
Add better-tuned parameters for training the llama3 8B model and fix a bug with an incorrect timeout
Let me merge this one as it contains some important fixes. We can follow up with another PR if there are any problems.