WangYueFt / dcp


Test dataset is used for model selection #26

Open gitouni opened 2 years ago

gitouni commented 2 years ago

Thanks for your contribution and sharing your implementation.

However, in this repo the test dataloader is used to select the best model, and that same best model is then evaluated on the test set. A stricter approach is to hold out a validation set from the training data for model selection, because the test set should remain completely unseen until the final evaluation.

The relevant code is shown below:

class ModelNet40(Dataset):
    def __init__(self, num_points, partition='train', gaussian_noise=False, unseen=False, factor=4):
        self.data, self.label = load_data(partition)
        self.num_points = num_points
        self.partition = partition
        self.gaussian_noise = gaussian_noise
        self.unseen = unseen
        self.label = self.label.squeeze()
        self.factor = factor
        if self.unseen:
            ######## simulate training on the first 20 categories (labels < 20) while testing on the last 20 categories (labels >= 20)
            if self.partition == 'test':
                self.data = self.data[self.label>=20]
                self.label = self.label[self.label>=20]
            elif self.partition == 'train':
                self.data = self.data[self.label<20]
                self.label = self.label[self.label<20]

The partition parameter can only be set to 'train' or 'test', so there is no validation partition. As a result, the partition='test' data is used both to evaluate the model during training and to select the best checkpoint.
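One possible fix, as a minimal sketch rather than the repo's actual code: carve a validation subset out of the training partition with a fixed seed, and select the best checkpoint on that subset only. The helper name `split_train_val` and its parameters `val_fraction` and `seed` are hypothetical and do not exist in this repo.

```python
import random


def split_train_val(num_samples, val_fraction=0.1, seed=0):
    """Return (train_indices, val_indices), a disjoint split of range(num_samples).

    Hypothetical helper for illustration; not part of the dcp repo.
    """
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    indices = list(range(num_samples))
    # A fixed seed reproduces the same split on every run,
    # so checkpoints from different runs are compared on the same samples.
    random.Random(seed).shuffle(indices)
    num_val = max(1, int(num_samples * val_fraction))
    val_indices = sorted(indices[:num_val])
    train_indices = sorted(indices[num_val:])
    return train_indices, val_indices


# Example: hold out 10% of a 1000-sample training partition for model selection.
train_idx, val_idx = split_train_val(num_samples=1000, val_fraction=0.1, seed=0)
```

The two index lists could then be passed to `torch.utils.data.Subset` around a `ModelNet40(partition='train')` instance, leaving `partition='test'` untouched until the final evaluation. In the unseen-category setting, the split would apply to the already filtered training data.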

Looking forward to your reply. Cheers.