gle-bellier closed this 2 weeks ago
I've encountered similar problems with another multi-modality dataset, CropTypeMapping. The error occurs when the dataset config specifies only one modality, but the dataset file returns all modalities (optical and sar). Indeed, as you pointed out, we want to maintain independence between the dataset config and the FM config. In addition, we want the modalities that are used to be specified in the dataset config, rather than by modifying the dataset.py file. To me, a quick fix is to incorporate a condition across all preprocessors and augmentations. For example, before:
```python
for k, v in data["image"].items():
    if k not in self.ignore_modalities:
        data["image"][k] = T.Resize(self.size)(v)
```
After:
```python
for k, v in data["image"].items():
    if k not in self.ignore_modalities and k in self.dataset_cfg.bands:
        data["image"][k] = T.Resize(self.size)(v)
```
And the same for the preprocessor:
```python
for k, v in data["image"].items():
    if k in self.dataset_cfg.bands:
        data["image"][k] = self.preprocessor[k](v)
```
This resolves the issue for me, at least. This bug should not impact datasets that use only a single modality, but we need to fix it for multi-modality datasets.
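To make the guard above concrete, here is a minimal, self-contained sketch of the same pattern. The class and `resize_stub` are illustrative stand-ins, not the actual repo API; `dataset_bands` plays the role of `self.dataset_cfg.bands`:

```python
def resize_stub(v):
    # Stand-in for T.Resize(self.size)(v) so the sketch runs without torch.
    return f"resized({v})"

class AugmentationSketch:
    def __init__(self, dataset_bands, ignore_modalities=()):
        self.dataset_bands = set(dataset_bands)          # from the dataset config
        self.ignore_modalities = set(ignore_modalities)

    def __call__(self, data):
        for k, v in data["image"].items():
            # Only touch modalities the dataset config actually declares.
            if k not in self.ignore_modalities and k in self.dataset_bands:
                data["image"][k] = resize_stub(v)
        return data

data = {"image": {"optical": "opt", "sar": "sar_ts"}}
out = AugmentationSketch(dataset_bands={"optical"})(data)
# "sar" is left untouched because it is not in the configured bands
```

The point is that modalities the dataset file returns but the config does not declare simply pass through untransformed, instead of hitting a transform that expects a different shape.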
In my case, the dataset config mentions both modalities (optical and sar) and I still hit this issue. I think we should preprocess only the bands and modalities that are in encoder.input_bands.
EDIT: this specific case was resolved by merging the main branch into mine.
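The alternative suggested above (filtering on what the encoder consumes rather than what the dataset declares) can be sketched like this; `encoder_input_bands` stands for `encoder.input_bands` from the comment, and the free function replaces the hypothetical preprocessor class:

```python
def preprocess(data, preprocessors, encoder_input_bands):
    # Apply a per-modality preprocessor only to modalities the FM encoder
    # actually consumes; everything else passes through untouched.
    for k, v in list(data["image"].items()):
        if k in encoder_input_bands and k in preprocessors:
            data["image"][k] = preprocessors[k](v)
    return data

data = {"image": {"optical": [1, 2], "sar": [3, 4]}}
preprocessors = {"optical": lambda v: [x * 2 for x in v],
                 "sar": lambda v: [x * 2 for x in v]}
out = preprocess(data, preprocessors, encoder_input_bands={"optical"})
# only "optical" is preprocessed; "sar" is returned unchanged
```

Note the trade-off: keying on encoder.input_bands couples the preprocessing to the FM config, which is exactly the dependency the discussion below argues against.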
I'm using the PASTIS dataset, which outputs s2 and sar time series, with the following data augmentations:

```
TypeError: Tensor is not a torch image.
```
I'm looking for a quick workaround, but this needs to be discussed, because we want the dataset configs to be independent of the FM configs, and that is not the case now: not only regarding the modalities, but also the data augmentation/processing, which depends on both the dataset size and the FM input.