Open luketerry0 opened 1 month ago
Refactor to use torchrun and distribute over multiple nodes
Don't forget that images will have to be sent `.to(device)` to be used with a model on the GPU.
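A minimal sketch of the device handling, assuming the script is launched with torchrun (which sets the `LOCAL_RANK` environment variable for each worker process). The `pick_device` helper, the `Linear` stand-in model, and the random "images" are hypothetical placeholders, not code from this repo:

```python
import os

import torch


def pick_device() -> torch.device:
    """Choose this worker's device (hypothetical helper)."""
    # torchrun sets LOCAL_RANK per worker; default to 0 when run without it.
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    if torch.cuda.is_available():
        return torch.device(f"cuda:{local_rank}")
    return torch.device("cpu")  # CPU fallback so the sketch runs anywhere


device = pick_device()
model = torch.nn.Linear(8, 4).to(device)  # stand-in for the real embedding model
images = torch.randn(2, 8)                # stand-in batch, created on the CPU

with torch.no_grad():
    # Inputs must be on the same device as the model's parameters.
    embeddings = model(images.to(device))
```

Without the `.to(device)` call on the batch, a model on the GPU would raise a device-mismatch error when given CPU tensors.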
In each rank, `.append(batch)` every batch of embeddings to a list, then stack those batches together when you save them out at the end. Make sure to also save the list of file names in the same order.
Process images in batches, being careful to record which embeddings correspond to which filenames
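One way this could look, as a rough sketch rather than the intended implementation. It assumes torchrun's `RANK` and `WORLD_SIZE` environment variables, and every name here (`shard`, `embed_files`, `fake_loader`, the `img_*.png` file names, the output path) is hypothetical. It uses `torch.cat` instead of `torch.stack` so that a smaller final batch still joins cleanly along the batch dimension:

```python
import os
import tempfile

import torch


def shard(items, rank, world_size):
    """Give each rank every world_size-th item, starting at its own rank."""
    return items[rank::world_size]


def embed_files(filenames, load_images, model, device, batch_size=4):
    """Embed files in batches; row i of the result belongs to names[i].

    Assumes `filenames` is non-empty (torch.cat fails on an empty list).
    """
    batches, names = [], []
    model.eval()
    with torch.no_grad():
        for start in range(0, len(filenames), batch_size):
            chunk = filenames[start:start + batch_size]
            images = load_images(chunk).to(device)
            batches.append(model(images).cpu())  # keep results on the CPU
            names.extend(chunk)                  # same order as the embeddings
    return torch.cat(batches, dim=0), names


# torchrun sets RANK and WORLD_SIZE; the defaults allow a single-process run.
rank = int(os.environ.get("RANK", "0"))
world_size = int(os.environ.get("WORLD_SIZE", "1"))

all_files = [f"img_{i:03d}.png" for i in range(10)]  # hypothetical file names
my_files = shard(all_files, rank, world_size)


def fake_loader(chunk):
    # Stand-in for real image loading and preprocessing.
    return torch.randn(len(chunk), 8)


model = torch.nn.Linear(8, 4)  # stand-in for the real embedding model
embeddings, names = embed_files(my_files, fake_loader, model, torch.device("cpu"))

# Save embeddings and file names together so the pairing cannot drift apart.
out_path = os.path.join(tempfile.gettempdir(), f"embeddings_rank{rank}.pt")
torch.save({"embeddings": embeddings, "filenames": names}, out_path)
```

Writing one file per rank avoids any cross-process communication; the per-rank files can be merged afterwards if a single output is needed.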