Open WardLT opened 8 months ago
The data loaders for DiffLinker place the entire training set in device memory, which limits the size of the datasets we can train on. We should instead change `train_step` to move each batch to the device only when it is needed.
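For illustration, a minimal sketch of what the change could look like, assuming a PyTorch-style loop where batches are dicts of tensors and the model returns the loss directly (the `train_step` signature and `model(batch)` call here are hypothetical, not DiffLinker's actual API):

```python
import torch

def train_step(model, batch, optimizer, device):
    # Move only the current batch to the device; the full dataset
    # stays in host (CPU) memory inside the DataLoader.
    batch = {
        k: v.to(device, non_blocking=True) if torch.is_tensor(v) else v
        for k, v in batch.items()
    }

    optimizer.zero_grad()
    loss = model(batch)  # hypothetical: model returns the training loss
    loss.backward()
    optimizer.step()
    return loss.item()
```

With this pattern the `DataLoader` can keep the dataset on the CPU (optionally with `pin_memory=True` so the per-batch host-to-device copies can be asynchronous), and device memory usage scales with the batch size rather than the dataset size.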