Question about preprocessing parameter when using benchmark

I trained my model with the following data preprocessing:

import torch
import torchvision
import torchvision.transforms as transforms

# Convert images to tensors in [0, 1], then normalize with per-channel CIFAR-10 mean and std.
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2471, 0.2435, 0.2616)),
])
trainset = torchvision.datasets.CIFAR10(root='../data', train=True, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(trainset, batch_size=config['batch_size'], shuffle=True,
                                           drop_last=True, num_workers=8, pin_memory=True)

During testing, I evaluate the trained model as follows, and it works well:

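from robustbench.eval import benchmark

# The same normalization transform is passed to benchmark via the preprocessing argument.
clean_acc, robust_acc = benchmark(model, n_examples=10000, dataset="cifar10", data_dir='../data',
                                  threat_model="Linf", eps=8/255, device=torch.device("cuda"),
                                  batch_size=1000, log_path=log_path, preprocessing=transform)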

But when I try to follow the format of the models in the leaderboard, i.e., put the preprocessing inside the model:

class My2023RN18(ResNet):
    def __init__(self):
        super(My2023RN18, self).__init__(BasicBlock, [2, 2, 2, 2])
        # Per-channel mean and std, shaped (3, 1, 1) to broadcast over (N, 3, H, W) inputs.
        self.mu = torch.Tensor([0.4914, 0.4822, 0.4465]).float().view(3, 1, 1).cuda()
        self.sigma = torch.Tensor([0.2471, 0.2435, 0.2616]).float().view(3, 1, 1).cuda()

    def forward(self, x):
        # Normalize inside the model, so the input x is expected in [0, 1].
        x = (x - self.mu) / self.sigma
        return super(My2023RN18, self).forward(x)

# No preprocessing argument here: the model receives unnormalized images.
clean_acc, robust_acc = benchmark(model, n_examples=10000, dataset="cifar10", data_dir='../data',
                                  threat_model="Linf", eps=8/255, device=torch.device("cuda"),
                                  batch_size=1000, log_path=log_path)

I find that the two approaches give the same clean accuracy, but the second one shows a drop in robust accuracy. Is it OK to submit a new model that uses the first approach to data preprocessing?
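
One possible source of the gap can be checked with simple arithmetic. The sketch below assumes (this is not confirmed here) that in the first approach the attack perturbs the already normalized tensors, so that eps=8/255 is measured in normalized units. Since x_norm = (x - mu) / sigma, a perturbation of size eps in normalized space corresponds to eps * sigma in [0, 1] pixel space. The helper names are hypothetical and only serve the illustration.

```python
# Per-channel CIFAR-10 statistics taken from the transform above.
MEAN = (0.4914, 0.4822, 0.4465)
STD = (0.2471, 0.2435, 0.2616)
EPS = 8 / 255  # Linf radius passed to benchmark

def normalize(x, mean, std):
    """Map a pixel value in [0, 1] to normalized space."""
    return (x - mean) / std

def denormalize(z, mean, std):
    """Map a normalized value back to pixel space."""
    return z * std + mean

def pixel_space_budget(eps, std):
    """Pixel-space Linf radius equivalent to radius eps in normalized space.

    Assumption: the perturbation is applied after normalization.
    """
    return tuple(eps * s for s in std)

budget = pixel_space_budget(EPS, STD)
for name, b in zip("RGB", budget):
    print(f"{name}: {b:.5f} in pixel units (= {b * 255:.2f}/255)")
```

If that assumption holds, the first evaluation would use a per-channel budget of roughly a quarter of 8/255 in pixel space, which would make its robust accuracy look higher than under the second setup.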

RobustBench / robustbench, issue #155