Closed by gle-bellier 2 months ago
I'm currently rewriting the preprocessing (see branch #feature/tiling) on the assumption that all data is multi-temporal; single-temporal data simply has a size of 1 in the time dimension. I think that should mostly solve this problem.
Ok great! So every output of the dataset must have the shape (T, C, H, W), right? There is still the case where the dataset outputs both a single optical image and an optical time series (e.g. PASTIS has single SPOT images and S2 time series). Nevertheless, I'm not sure we want to experiment with both at the same time.
I don't think the models selected so far can work with both optical modalities at once, so for the moment we can go with the proposed solution.
> Ok great! So every output of the dataset must have the shape (T, C, H, W), right? There is still the case where the dataset outputs both a single optical image and an optical time series (e.g. PASTIS has single SPOT images and S2 time series). Nevertheless, I'm not sure we want to experiment with both at the same time.
They should be (C,T,H,W) or (C,H,W). The preprocessor just unsqueezes images if they have only 3 dimensions.
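To make the shape convention concrete, here is a minimal sketch of the unsqueeze step described above. It is not the code from the branch: `to_cthw` is a hypothetical helper name, and NumPy's `np.expand_dims` stands in for the `unsqueeze` call one would use on PyTorch tensors.

```python
import numpy as np


def to_cthw(image: np.ndarray) -> np.ndarray:
    """Return `image` as (C, T, H, W), adding a length-1 time axis if needed.

    Hypothetical helper: it only illustrates the convention from this thread.
    """
    if image.ndim == 3:
        # (C, H, W): a single image, so insert T=1 after the channel axis.
        return np.expand_dims(image, axis=1)
    if image.ndim == 4:
        # Already (C, T, H, W): a time series, returned unchanged.
        return image
    raise ValueError(f"expected 3 or 4 dimensions, got {image.ndim}")


# Arbitrary sizes, chosen only for illustration.
single_image = np.zeros((3, 64, 64))    # (C, H, W)
time_series = np.zeros((3, 6, 64, 64))  # (C, T, H, W)

print(to_cthw(single_image).shape)
print(to_cthw(time_series).shape)
```

With this convention, downstream code can always index the time axis at position 1, whether or not the dataset is multi-temporal.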
Actually, I just pushed a quick change: there was no reason for the data preprocessor to even care about the multi_temporal setting anymore.
For now, the data processing depends on a dataset variable in the config file, `cfg.dataset.multi_temporal`: https://github.com/yurujaja/geofm-bench/blob/ac57d905d8d6c3546c96664ecba422a450e17124/engine/data_preprocessor.py#L98
This behavior does not allow the use of both single images and time series at the same time (e.g. an S2 time series and a single SAR image).
From my point of view, the available options are:
- Keep this behavior and adapt the datasets: if a dataset is multi-temporal (according to the `cfg.dataset.multi_temporal` value), its output consists only of time series; otherwise, only of single images. This is restrictive, but we can argue it is the intended behavior.
- Remove this dependency and allow a dataset to output both time series and single images. This is more flexible, but does it match the experiments' needs? If we plan multi-temporal, multi-modal training, does that mean only time series, or potentially a mix of time series and single images depending on the modality?

As @VMarsocci mentioned, we would also need to modify the data processor. I think the easiest approach is to add output dict keys dedicated to time series, namely "images-ts" and "sar-ts", and modify the data processors accordingly. This simplifies the preprocessing and also allows certain datasets to output both single optical images and optical time series (e.g. the PASTIS dataset).
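The dedicated-keys proposal could look roughly like the sketch below. Only the "images-ts" and "sar-ts" key names come from this thread. The single-image key "images", the helper `split_by_temporality`, and the array sizes are assumptions made for illustration, and NumPy arrays stand in for tensors.

```python
import numpy as np

# Keys proposed in this thread for time-series outputs.
TIME_SERIES_KEYS = ("images-ts", "sar-ts")


def split_by_temporality(sample: dict) -> tuple[dict, dict]:
    """Split a dataset sample into (time_series, single_images) by key.

    Hypothetical helper: time-series entries must be (C, T, H, W) and all
    other entries (C, H, W), so a processor can treat each group separately.
    """
    time_series, single_images = {}, {}
    for key, array in sample.items():
        is_series = key in TIME_SERIES_KEYS
        expected_ndim = 4 if is_series else 3
        if array.ndim != expected_ndim:
            raise ValueError(
                f"{key!r}: expected {expected_ndim} dimensions, got {array.ndim}"
            )
        (time_series if is_series else single_images)[key] = array
    return time_series, single_images


# A PASTIS-like sample mixing one single image with one time series.
sample = {
    "images": np.zeros((3, 32, 32)),        # assumed key for a single optical image
    "images-ts": np.zeros((4, 5, 32, 32)),  # optical time series, T=5
}
series, singles = split_by_temporality(sample)
print(sorted(series), sorted(singles))
```

Routing by key rather than by a dataset-wide flag is what lets a single sample carry both kinds of input, which is the case `cfg.dataset.multi_temporal` cannot express.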