Shivanandroy / simpleT5

simpleT5 is built on top of PyTorch-lightning⚡️ and Transformers🤗 that lets you quickly train your T5 models.
MIT License

test and eval sets the same? #50

Open mgh1 opened 1 year ago

mgh1 commented 1 year ago

I noticed that the same dataset is used for both test and eval. Why do this instead of separating out a distinct eval set?

    def test_dataloader(self):
        """test dataloader"""
        return DataLoader(
            self.test_dataset,
            batch_size=self.batch_size,
            shuffle=False,
            num_workers=self.num_workers,
        )

    def val_dataloader(self):
        """validation dataloader"""
        return DataLoader(
            self.test_dataset,
            batch_size=self.batch_size,
            shuffle=False,
            num_workers=self.num_workers,
        )

See self.test_dataset being used twice here.

ayoni02 commented 1 year ago

Is it not already separated by the train_test_split?

mgh1 commented 1 year ago

Normally, there are supposed to be three datasets: train, validation (eval), and test. These serve clearly different purposes: the validation set is used during training for model selection and early stopping, while the test set is held out for a final, unbiased evaluation. In SimpleT5, the naming of the non-training datasets is confusing, with eval and test used interchangeably. As for your question: SimpleT5 does not internally split the eval dataset further into a test set and a smaller eval set. It is unusual for the eval and test sets to be duplicates of each other, which is why this looks buggy.
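For context, here is a minimal sketch of the usual three-way split. This is a hypothetical helper, not part of simpleT5 (which only accepts the datasets the user passes in); it just illustrates that train, validation, and test should be disjoint:

```python
# Hypothetical sketch (not simpleT5 code): split one dataset into three
# disjoint subsets: train, validation, and test.
import random

def three_way_split(data, val_frac=0.1, test_frac=0.1, seed=42):
    """Shuffle `data` and return (train, val, test) as disjoint lists."""
    rng = random.Random(seed)
    indices = list(range(len(data)))
    rng.shuffle(indices)
    n_test = int(len(data) * test_frac)
    n_val = int(len(data) * val_frac)
    test = [data[i] for i in indices[:n_test]]
    val = [data[i] for i in indices[n_test:n_test + n_val]]
    train = [data[i] for i in indices[n_test + n_val:]]
    return train, val, test

train, val, test = three_way_split(list(range(100)))
print(len(train), len(val), len(test))  # 80 10 10
```

With this kind of split, the validation set drives decisions during training and the test set is touched only once at the end.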

Here you can see that the code defines three data loaders:

    def train_dataloader(self):
        """training dataloader"""
        return DataLoader(
            self.train_dataset,
            batch_size=self.batch_size,
            shuffle=True,
            num_workers=self.num_workers,
        )

    def test_dataloader(self):
        """test dataloader"""
        return DataLoader(
            self.test_dataset,
            batch_size=self.batch_size,
            shuffle=False,
            num_workers=self.num_workers,
        )

    def val_dataloader(self):
        """validation dataloader"""
        return DataLoader(
            self.test_dataset,
            batch_size=self.batch_size,
            shuffle=False,
            num_workers=self.num_workers,
        )

See how val_dataloader and test_dataloader both use the same dataset (self.test_dataset)? That's either a bug or poor ML practice, unless I'm missing something.
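A plausible fix would be to store a separate validation dataset so the two loaders no longer alias the same data. The sketch below is hypothetical, not the actual simpleT5 code; the PyTorch DataLoader calls are stubbed with plain lists so the example runs standalone:

```python
# Hypothetical fix sketch: keep distinct val and test datasets so
# val_dataloader and test_dataloader no longer return the same data.
# DataLoader construction is stubbed with plain lists for illustration.
class DataModuleSketch:
    def __init__(self, train_dataset, val_dataset, test_dataset):
        self.train_dataset = train_dataset
        self.val_dataset = val_dataset    # new: distinct from test_dataset
        self.test_dataset = test_dataset

    def val_dataloader(self):
        # would be: DataLoader(self.val_dataset, shuffle=False, ...)
        return self.val_dataset

    def test_dataloader(self):
        # would be: DataLoader(self.test_dataset, shuffle=False, ...)
        return self.test_dataset

dm = DataModuleSketch(train_dataset=[1, 2, 3], val_dataset=[4], test_dataset=[5])
assert dm.val_dataloader() is not dm.test_dataloader()
```

The only real change needed in simpleT5 would be accepting a third dataset and returning self.val_dataset from val_dataloader.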