krisrs1128 / clouds_dist

Simulation of low-clouds, from weather measures.

Why is there a data loader in rescale #59

Closed vict0rsch closed 5 years ago

vict0rsch commented 5 years ago

https://github.com/krisrs1128/clouds_dist/blob/f937ecb3693de0ea66e38a187ac4899bf6c71244/src/preprocessing.py#L16

I don't quite follow, @mustafaghali, because in train.py we have:

        transfs = []
        if self.opts.data.preprocessed_data_path is None and self.opts.data.with_stats:
            transfs += [
                Rescale(
                    data_path=self.opts.data.path,
                    batch_size=self.opts.train.batch_size,
                    num_workers=self.opts.data.num_workers,
                    verbose=1,
                )
            ]

        self.trainset = EarthData(
            self.opts.data.path,
            preprocessed_data_path=self.opts.data.preprocessed_data_path,
            load_limit=self.opts.data.load_limit or -1,
            transform=transforms.Compose(transfs),
        )

so why does rescale have a data_loader attribute?

Also note I deleted batchsize=n_in_mem and switched to opts.train.batch_size

mustafaghali commented 5 years ago

Rescale needs to iterate through the data to compute the stats. I didn't want a separate implementation alongside the custom dataset class we already have, so it reuses that class through a data loader. I thought that might be easier to debug later.
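
For illustration only, here is a minimal sketch of that pattern: a transform that makes one batched pass over a dataset at construction time to compute its statistics, then applies them per sample. It is not the code in src/preprocessing.py. `RescaleSketch`, `iter_batches`, and `toy_dataset` are hypothetical names, the stats are a single global mean and standard deviation, and a plain generator stands in for a real data loader so the example has no dependencies.

```python
import math


def iter_batches(dataset, batch_size):
    """Yield consecutive lists of samples (hypothetical stand-in for a data loader)."""
    for start in range(0, len(dataset), batch_size):
        stop = min(start + batch_size, len(dataset))
        yield [dataset[i] for i in range(start, stop)]


class RescaleSketch:
    """Hypothetical transform: standardizes samples with stats computed up front."""

    def __init__(self, dataset, batch_size=2):
        count, total, total_sq = 0, 0.0, 0.0
        # One pass over the data, batch by batch, accumulating running sums.
        for batch in iter_batches(dataset, batch_size):
            for sample in batch:
                for value in sample:
                    count += 1
                    total += value
                    total_sq += value * value
        self.mean = total / count
        # Population variance from the running sums; clamp rounding noise at zero.
        self.std = math.sqrt(max(total_sq / count - self.mean ** 2, 0.0))

    def __call__(self, sample):
        scale = self.std or 1.0  # avoid dividing by zero on constant data
        return [(value - self.mean) / scale for value in sample]


# Toy data: three samples with two values each (an assumption for the example).
toy_dataset = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
rescale = RescaleSketch(toy_dataset, batch_size=2)
scaled = rescale(toy_dataset[0])
```

The point of the sketch is that the transform needs some way to see the whole dataset before it can transform a single sample, which is why it holds its own iterator over the data.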