Open: Jasonlee1995 opened this issue 3 years ago
Hey @Jasonlee1995, thanks for reporting this.
While your code works for datasets.VOCSegmentation, it is inefficient: we have good integrity checks for a few files (say, the downloaded archives), but the same is not true for a large folder of images / annotations. Thus we normally raise an error when we encounter an already extracted folder with download=True; datasets.Places365 does this, for example. Unfortunately, neither VOC* nor SBDataset does, which means they happily re-extract the archive every time you construct them with download=True.
IMO we should fix VOC* and SBDataset (better yet: any dataset that relies on folders of data) to also raise this error. @fmassa? This will probably resolve itself after we integrate https://github.com/pytorch/pytorch/issues/49440 and can read from archives directly.
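For illustration, the folder check described above could be sketched roughly like this (a hypothetical sketch, not torchvision's actual implementation; the function name and the "images" folder name are made up):

```python
import os


def check_not_extracted(root: str, folder: str = "images") -> None:
    # Hypothetical sketch of the integrity check described above:
    # if the extracted folder already exists, refuse to proceed
    # instead of silently unpacking the archive again.
    target = os.path.join(root, folder)
    if os.path.isdir(target):
        raise RuntimeError(
            f"The directory {target} already exists. If you want to "
            "re-download or re-extract the archive, delete it first."
        )
```

A dataset constructor called with download=True would run this check before downloading, so a second construction fails fast rather than re-extracting.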
@Jasonlee1995 In any case, you should only call the dataset constructor with download=True once.
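One way to follow that advice is to gate the flag on whether the data directory already exists (a hypothetical helper; torchvision itself does not provide this, and the SBDataset usage shown in the comment is an assumption):

```python
import os


def need_download(root: str) -> bool:
    # Hypothetical helper: request a download only when the data
    # folder is missing, so repeated runs skip the costly re-extraction.
    return not os.path.isdir(root)


# Assumed usage (not executed here):
# train_set = torchvision.datasets.SBDataset(root, download=need_download(root))
```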
'download=True' for more than one dataset stops the code because of shutil
I always code the dataset and dataloader as below.
But this time, dealing with the SBD dataset, I got stuck as below.
I looked at the torchvision dataset source code and documentation, and I think it would be more helpful and friendly to other users if this case were handled or documented.
I know it's trivial, but I hope no one else suffers from this like I did :)
cc @pmeier