Run all tests with corrupted dataset

adidigit commented 2 years ago

@naama-alon https://github.com/hendrycks/robustness

naama-alon commented 2 years ago

Mixup

Run mixup with corrupted test data (dataset=cifar10, testset=cifar10C, model=ResNet18):

python train.py --lr=0.1 --seed=20170922 --decay=1e-4 --model=ResNet18 --batch-size=128 --epoch=200 --no-augment=True --alpha=1
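For context, here is a minimal sketch of what the mixing step does, assuming `--alpha` plays its usual mixup role as the parameter of a Beta(alpha, alpha) distribution. This is not the code in train.py; `mixup_pair` and its arguments are hypothetical names, and plain Python lists stand in for image tensors.

```python
import random

def mixup_pair(x_a, x_b, y_a, y_b, alpha=1.0, rng=random):
    """Blend two examples and their one-hot labels with a shared coefficient.

    Assumption: lam ~ Beta(alpha, alpha); with alpha=1 this is uniform on [0, 1].
    """
    lam = rng.betavariate(alpha, alpha) if alpha > 0 else 1.0
    x = [lam * a + (1 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam

# Toy usage: two 2-"pixel" inputs with one-hot labels for classes 0 and 1.
rng = random.Random(0)
x, y, lam = mixup_pair([0.0, 0.0], [1.0, 1.0], [1.0, 0.0], [0.0, 1.0], alpha=1.0, rng=rng)
print(f"lam={lam:.3f} mixed_input={x} mixed_label={y}")
```

The mixed label stays a valid distribution (its entries sum to 1), which is why the training accuracy reported for mixup runs is usually lower than the test accuracy: the training targets are soft.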

I had to make some changes to the code because I ran it on Windows.

199,0.7394752228298248,0.0,tensor(75.3706),0.6189686834812165,tensor(83.9222),tensor(16.0778)

Results: epoch=200 train loss=0.7394752228298248 reg loss=0.0 train acc=75.3706 test loss = 0.6189686834812165 test acc=83.9222 test error=16.0778
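The raw log row above can be turned into named fields with a small parser. This is a sketch only: the column order is an assumption inferred from the "Results" line, and `parse_log_row` and `FIELDS` are hypothetical names, not part of train.py.

```python
import re

# Assumed column order, inferred from the "Results" line in this comment.
FIELDS = ["epoch", "train_loss", "reg_loss", "train_acc",
          "test_loss", "test_acc", "test_err"]

def parse_log_row(row: str) -> dict:
    """Parse one CSV log row, unwrapping values printed as tensor(...)."""
    values = []
    for cell in row.strip().split(","):
        match = re.fullmatch(r"tensor\((.+)\)", cell.strip())
        values.append(float(match.group(1) if match else cell))
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(values)}")
    record = dict(zip(FIELDS, values))
    record["epoch"] = int(record["epoch"])  # logged epoch index, apparently 0-based
    return record

row = ("199,0.7394752228298248,0.0,tensor(75.3706),"
       "0.6189686834812165,tensor(83.9222),tensor(16.0778)")
record = parse_log_row(row)
print(record["epoch"] + 1, record["test_acc"], record["test_err"])
```

Note that the logged index 199 corresponds to epoch 200 only if epochs are counted from 0, which is how the summary line reads it.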

Ran for 5.1992 hours on Windows.

naama-alon commented 2 years ago

Mixup

Run mixup with corrupted test data (dataset=cifar100, testset=cifar100C, model=ResNet101):

python train.py --lr=0.1 --seed=20170922 --decay=1e-4 --model=ResNet101 --batch-size=128 --epoch=200 --no-augment=True --alpha=1

I had to make some changes to the code because I ran it on Windows.

Results: epoch=200 train loss=1.2965549080083385 reg loss=0.0 train acc=70.3372 test loss = 1.575084633231163 test acc=63.9122 test error=36.0878

Ran for 13.5176 hours on Windows.

naama-alon commented 2 years ago

Mixup

Run mixup with corrupted test data (dataset=cifar100, testset=cifar100C, model=ResNet18):

python train.py --lr=0.1 --seed=20170922 --decay=1e-4 --model=ResNet18 --batch-size=128 --epoch=200 --no-augment=True --alpha=1

I had to make some changes to the code because I ran it on Windows.

Results (not in paper): epoch=200 train loss=1.4879248002018683 reg loss=0.0 train acc=69.3330 test loss = 1.787755845785141 test acc=58.9721 test error=41.0279

Ran for 5.1953 hours on Windows.

naama-alon commented 2 years ago

Mixup results:
alg	data	test	arch	test err
mixup	cifar10	cifar10	resnet18	4.25
mixup	cifar10	cifar10C	resnet18	16.0778
mixup	cifar100	cifar100	resnet18	22.1800
mixup	cifar100	cifar100C	resnet18	41.0279
mixup	cifar100	cifar100	resnet101	21.1000
mixup	cifar100	cifar100C	resnet101	36.0878

We can see that every model has a higher error on the corrupted test set than on the regular test set (for example, 16.08 vs. 4.25 for ResNet18 trained on cifar10).
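To put a number on that observation, the gap between the corrupted and regular test error can be computed directly from the mixup table above. The values are copied from the table; `mixup_errors` and `error_gap` are hypothetical names used only for this sketch.

```python
# (dataset, arch) -> (regular test error %, corrupted test error %), from the table above.
mixup_errors = {
    ("cifar10", "resnet18"): (4.25, 16.0778),
    ("cifar100", "resnet18"): (22.18, 41.0279),
    ("cifar100", "resnet101"): (21.10, 36.0878),
}

def error_gap(regular: float, corrupted: float) -> float:
    """Absolute increase in test error, in percentage points."""
    return corrupted - regular

gaps = {key: round(error_gap(*errs), 4) for key, errs in mixup_errors.items()}
for (data, arch), gap in gaps.items():
    print(f"{data:9s} {arch:10s} +{gap:.2f} points")
```

On these numbers the cifar100 ResNet101 model loses fewer points than the cifar100 ResNet18 model, though a single seed per configuration is not enough to say the deeper model is more robust.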

naama-alon commented 2 years ago

Cutmix

Run cutmix with corrupted test data (dataset=cifar100, testset=cifar100C, model=ResNet101): python test.py

Checkpoint 1 (resnet101CutMix/model_best.pth.tar): top-1 error 46.794871794871796, top-5 error 23.727964743589745

Checkpoint 2 (resnet101CutMix/checkpoint.pth.tar): top-1 error 48.4775641025641, top-5 error 25.37059294871795
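The top1 and top5 figures above appear to be error rates rather than accuracies, since top5 is lower than top1 and the summary tables later in the thread list them as errors. As a reminder of the definition, here is a minimal pure-Python sketch of top-k error on made-up scores. It is not the code in test.py, and `topk_error` is a hypothetical helper.

```python
def topk_error(scores, labels, k):
    """Percentage of samples whose true label is not among the k highest scores."""
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score; keep the best k.
        topk = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label not in topk:
            misses += 1
    return 100.0 * misses / len(labels)

# Made-up scores for 4 samples over 3 classes.
scores = [
    [0.1, 0.7, 0.2],  # label 1: correct at top-1
    [0.5, 0.3, 0.2],  # label 1: wrong at top-1, recovered at top-2
    [0.6, 0.3, 0.1],  # label 2: wrong even at top-2
    [0.2, 0.2, 0.6],  # label 2: correct at top-1
]
labels = [1, 1, 2, 2]

top1 = topk_error(scores, labels, k=1)
top2 = topk_error(scores, labels, k=2)
print(top1, top2)
```

Because the top-k set only grows with k, the top-5 error can never exceed the top-1 error, which matches every pair reported in this thread.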

naama-alon commented 2 years ago

Cutmix

Run cutmix with corrupted test data (dataset=cifar100, testset=cifar100C, model=ResNet18): python test.py

Checkpoint 1 (CutMix/model_best.pth.tar): top-1 error 62.399839743589745, top-5 error 35.24639423076923

Checkpoint 2 (CutMix/checkpoint.pth.tar): top-1 error 62.08934294871795, top-5 error 35.51682692307692

naama-alon commented 2 years ago

Cutmix results:
alg	data	test	arch	top 1 err	top 5 err
cutmix	cifar100	cifar100	resnet18	36.68	12.24
cutmix	cifar100	cifar100C	resnet18	62.4	35.2464
cutmix	cifar100	cifar100	resnet101	26.71	6.7
cutmix	cifar100	cifar100C	resnet101	46.795	23.728

We can see that every model has a higher error on the corrupted test set than on the regular test set (for example, a top-1 error of 62.4 vs. 36.68 for ResNet18).

adidigit commented 2 years ago

my_test 1 Accuracy (top-1 and 5 error): 70.05208333333333 42.91866987179487

my_test 2 Accuracy (top-1 and 5 error): 70.80328525641026 42.82852564102564

my_test 3 Accuracy (top-1 and 5 error): 78.4354967948718 52.68429487179487

adidigit commented 2 years ago

test 3 fixed: 62.209535256410255 32.72235576923077

adidigit commented 2 years ago

My test results:
alg	data	test	arch	top 1 err	top 5 err
test 1	cifar100	cifar100C	resnet18	72	41
test 2	cifar100	cifar100C	resnet18	70	42
test 3	cifar100	cifar100C	resnet101	62	32

adidigit commented 2 years ago

alg	arch	top 1 err	top 5 err
test 1	resnet18	70	42
test 2	resnet18		26.0
test 3	resnet18	69.81	41.18

adidigit / advanced-dl-final-project

Run all tests with corrupted dataset #5