Describe the bug
Using a custom dataset without labels (and therefore solo.data.pretrain_dataloader.CustomDatasetWithoutLabels) together with a config value of 0 < data.data_fraction < 1 results in the following error:
Traceback (most recent call last):
  File "/workspace/solo-learn/main_pretrain.py", line 146, in main
    train_dataset = prepare_datasets(
  File "/workspace/solo-learn/solo/data/pretrain_dataloader.py", line 355, in prepare_datasets
    data = train_dataset.samples
AttributeError: 'DatasetWithIndex' object has no attribute 'samples'
As far as I can tell, when I use a custom dataset with no_labels: True, DatasetWithIndex is subclassed from CustomDatasetWithoutLabels, so the error is essentially AttributeError: 'CustomDatasetWithoutLabels' object has no attribute 'samples'.
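The inheritance argument above can be illustrated without solo-learn installed. The two classes below are simplified stand-ins written for this report, not the library's actual definitions; the only assumption carried over is that the label-free dataset keeps its files in an images attribute and defines no samples.

```python
class CustomDatasetWithoutLabels:
    """Stand-in: stores image file names in `images`, defines no `samples`."""

    def __init__(self, images):
        self.images = images


class DatasetWithIndex(CustomDatasetWithoutLabels):
    """Stand-in for the wrapper subclass named in the traceback."""


train_dataset = DatasetWithIndex(["a.png", "b.png"])

try:
    # Same attribute access as the failing line in prepare_datasets.
    data = train_dataset.samples
except AttributeError as err:
    message = str(err)
    print(message)
```

The exception names the subclass (DatasetWithIndex), but the attribute is missing because the parent class never defines it.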
To Reproduce
In this gist, I provided a modified version of scripts/pretrain/custom/byol.yaml that uses data.data_fraction: 0.5.
To reproduce the bug, copy this to scripts/pretrain/custom and run:
Additional comments
Looking at /solo/data/pretrain_dataloader.py#L353-L364 and the definition of CustomDatasetWithoutLabels, the cause of the issue is clear: the subsampling code reads train_dataset.samples, which CustomDatasetWithoutLabels does not define. It can be easily fixed by modifying /solo/data/pretrain_dataloader.py#L353-L364 to use train_dataset.images instead of train_dataset.samples if the dataset is an instance of CustomDatasetWithoutLabels. I'll provide such a fix in a PR.
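A rough sketch of the kind of change I have in mind is below. It is not the actual patch: subsample_in_place is a hypothetical helper, both dataset classes are stand-ins, and plain random sampling is used here for brevity, which may differ from how the existing code selects the subset for labeled datasets.

```python
import random
from typing import List, Tuple


class CustomDatasetWithoutLabels:
    """Stand-in: label-free dataset, assumed to keep file names in `images`."""

    def __init__(self, images: List[str]):
        self.images = images


class LabeledDataset:
    """Stand-in: ImageFolder-style dataset with (path, label) pairs in `samples`."""

    def __init__(self, samples: List[Tuple[str, int]]):
        self.samples = samples


def subsample_in_place(train_dataset, data_fraction: float, seed: int = 42) -> None:
    """Hypothetical helper: keep a random `data_fraction` of the dataset's items."""
    if not 0 < data_fraction < 1:
        raise ValueError("data_fraction must be strictly between 0 and 1")

    # Pick the attribute that actually exists on this dataset type.
    # isinstance also matches subclasses such as DatasetWithIndex.
    if isinstance(train_dataset, CustomDatasetWithoutLabels):
        attr = "images"
    else:
        attr = "samples"

    items = getattr(train_dataset, attr)
    keep = max(1, int(len(items) * data_fraction))
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    setattr(train_dataset, attr, rng.sample(items, keep))


unlabeled = CustomDatasetWithoutLabels([f"img_{i}.png" for i in range(10)])
subsample_in_place(unlabeled, 0.5)

labeled = LabeledDataset([(f"img_{i}.png", i % 2) for i in range(10)])
subsample_in_place(labeled, 0.5)
```

Branching on isinstance keeps the labeled path unchanged and only redirects the attribute lookup for label-free datasets.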
Screenshots
No screenshots required.
Versions
solo-learn==1.0.6, torch==1.13.1, pytorch-lightning==1.6.4