Experiments on including metadata such as URLs, timestamps, website descriptions and HTML tags during pretraining.
feat: log the average of the loss rather than the value of the main process #160
Closed
SaulLu closed 2 years ago
As discussed in this week's meeting, here is the change needed so that the loss logged to W&B (wandb) is averaged across all GPUs, rather than being the value from the main process only.
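To illustrate the difference between the two logging behaviours, here is a minimal sketch in plain Python. It is not the diff from this PR: the function names and the per-GPU loss values are hypothetical, and the ranks are simulated with a list rather than real processes.

```python
from statistics import fmean


def loss_from_main_process(per_rank_losses):
    """Previous behaviour: report only the loss seen by rank 0 (the main process)."""
    return per_rank_losses[0]


def loss_averaged_over_ranks(per_rank_losses):
    """New behaviour: report the mean of the losses from every rank."""
    return fmean(per_rank_losses)


# Hypothetical losses computed by 4 GPUs on their own batches at one step.
step_losses = [2.0, 3.0, 4.0, 3.0]

main_only = loss_from_main_process(step_losses)
averaged = loss_averaged_over_ranks(step_losses)

print(f"main process only: {main_only}")
print(f"average over GPUs: {averaged}")
```

In an actual distributed run, the per-rank values would first have to be collected with a collective operation (for example `torch.distributed.all_reduce` followed by a division by the world size) before the main process logs the result. Which call the training script uses is an assumption here.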
I tested the change and observed that the logged values differ as a result: