cmanzo closed this issue 9 months ago.
Ah, interesting use case. I will think about whether this can be implemented. The main difficulty is that I don't think the DataLoader notifies the dataset when a new epoch starts (unlike TensorFlow).
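To make that concrete, here is a rough sketch of the workaround this implies (nothing below is existing deeptrack API; the resample() method and the loop are purely hypothetical): the refresh would have to be triggered by hand from the training loop, since DataLoader exposes no epoch-start hook.

    from torch.utils.data import DataLoader

    # `train_dataset` is assumed to be a map-style Dataset that caches its images;
    # `resample()` is a hypothetical refresh method, not existing deeptrack API.
    train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)

    for epoch in range(10):
        train_dataset.resample()    # hypothetical per-epoch refresh of the cached images
        for batch in train_loader:  # the loader re-indexes the (now refreshed) dataset
            ...                     # training step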
I see... but still, it should only use length images, no?
My idea was that length was used to determine the number of images if not using sources. I can propose something in a bit; see if it suits your needs.
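For reference, that reading of the argument would look roughly like this (a sketch only, assuming pipeline is an already-defined deeptrack pipeline):

    train_dataset = dt.pytorch.Dataset(pipeline, length=1000)  # no sources: length sets how many images are generated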
In that case, let’s leave it as is. There are alternative ways to control length when using source. I mean, the cool thing would be to have the ‘replace’ together with the source 😉
@BenjaminMidtvedt: I was expecting this code:
import deeptrack as dt
from torch.utils.data import DataLoader

train_dataset = dt.pytorch.Dataset(pipeline, inputs=source, length=1000, replace=0.2)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
where source is a path to an image folder, to use 1000 images per epoch, replacing each image with probability replace. However, it seems to perform the training using all the images in the source for each epoch.
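For what it's worth, the behaviour I was expecting could be mimicked with a small stand-alone dataset that keeps a pool of length image paths and swaps each one with probability replace at the start of every epoch. Everything below is an illustrative stand-in, not deeptrack code, and the epoch refresh still has to be called manually from the training loop, as noted above:

    import random
    from pathlib import Path
    from torch.utils.data import Dataset

    class ResamplingFolderDataset(Dataset):
        # Illustrative stand-in, not deeptrack: keep a pool of `length` image
        # paths from `folder`, and on each new epoch swap every pooled path
        # for a freshly drawn one with probability `replace`.
        def __init__(self, folder, length=1000, replace=0.2):
            self.paths = sorted(Path(folder).glob("*.png"))  # assumes PNG images
            self.length = length
            self.replace = replace
            self.pool = random.sample(self.paths, length)    # requires at least `length` files

        def new_epoch(self):
            # has to be called manually once per epoch, since DataLoader gives no signal
            self.pool = [
                random.choice(self.paths) if random.random() < self.replace else p
                for p in self.pool
            ]

        def __len__(self):
            return self.length

        def __getitem__(self, idx):
            return self.pool[idx]  # in practice: load and transform the image here

With something like this, each epoch touches at most 1000 images and roughly 20% of the pool is renewed between epochs, which is what I meant by replace.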