Closed elcronos closed 3 years ago
Hi @elcronos!
Here are the steps to do so.
1. Add your custom dataset to the lib as described here.
2. Create a model and modify the last layer of that model, e.g.
from robustness.datasets import MyNewDataSet
from robustness.model_utils import make_and_restore_model
from torch import nn
ds = MyNewDataSet('/path/to/dataset/')
attacker_model, _ = make_and_restore_model(arch='resnet50', pytorch_pretrained=True, dataset=ds)
num_ftrs = attacker_model.model.fc.in_features
num_classes = 10 # or whatever your custom dataset has
attacker_model.model.fc = nn.Linear(num_ftrs, num_classes)
3. Run adversarial training as you would do normally for cifar10 or ImageNet (e.g. [here](https://robustness.readthedocs.io/en/latest/example_usage/cli_usage.html#training-a-robust-resnet-50-for-the-restricted-imagenet-dataset)).
For more examples of how to fine-tune using our lib, check out our [code-base on transfer learning](https://github.com/microsoft/robust-models-transfer).
Hope this helps. Please let us know if you have any further questions!
Hi @Hadisalman,
Thanks for your quick response. So I tried to follow those steps but I'm still getting some errors.
This is what I added in datasets.py:
class MyDataset(DataSet):
    def __init__(self, data_path, **kwargs):
        self.num_classes = 1000
        ds_kwargs = {
            'num_classes': self.num_classes,
            'mean': torch.tensor([0.4859, 0.4131, 0.3083]),
            'std': torch.tensor([0.2919, 0.2507, 0.2273]),
            'transform_train': da.TRAIN_TRANSFORMS_IMAGENET,
            'transform_test': da.TEST_TRANSFORMS_IMAGENET
        }
        super(MyDataset, self).__init__('mydataset', data_path, **ds_kwargs)

    def get_model(self, arch, pretrained):
        return imagenet_models.__dict__[arch](num_classes=self.num_classes,
                                              pretrained=pretrained)
Then, I tried the following code for the adversarial training:
import torch
from torch import nn
from robustness.datasets import MyDataset
from robustness.model_utils import make_and_restore_model
from cox.utils import Parameters
from cox import store
from robustness import model_utils, datasets, train, defaults
ds = MyDataset('/path/to/my_model/MyDataset', batch_size=8)
m, _ = make_and_restore_model(arch='resnet50', pytorch_pretrained=True,
dataset=ds)
train_loader, val_loader = ds.make_loaders(batch_size=64, workers=8)
# Create a cox store for logging
OUT_DIR = './outputs'
out_store = store.Store(OUT_DIR)
num_ftrs = m.model.fc.in_features
num_classes = 10 # or whatever your custom dataset has
# Replace the last layer of your model with a layer that fits your custom dataset
m.model.fc = nn.Linear(num_ftrs, num_classes)
train_kwargs = {
'out_dir': "train_out",
'adv_train': 1,
'constraint': '2',
'eps': 0.5,
'attack_lr': 1.5,
'attack_steps': 20
}
train_args = Parameters(train_kwargs)
# Fill whatever parameters are missing from the defaults
train_args = defaults.check_and_fill_args(train_args,
defaults.TRAINING_ARGS, MyDataset)
train_args = defaults.check_and_fill_args(train_args,
defaults.PGD_ARGS, MyDataset)
# Train a model
train.train_model(train_args, m, (train_loader, val_loader), store=out_store)
I'm getting the following error:
<ipython-input-19-28c6fcfda123> in <module>
27
28 # Fill whatever parameters are missing from the defaults
---> 29 train_args = defaults.check_and_fill_args(train_args,
30 defaults.TRAINING_ARGS, MyDataset)
31 train_args = defaults.check_and_fill_args(train_args,
~/anaconda3/envs/pytorch-flash/lib/python3.9/site-packages/robustness/defaults.py in check_and_fill_args(args, arg_list, ds_class)
184 if arg_default == REQ: raise ValueError(f"{arg_name} required")
185 elif arg_default == BY_DATASET:
--> 186 setattr(args, name, TRAINING_DEFAULTS[ds_class][name])
187 elif arg_default is not None:
188 setattr(args, name, arg_default)
KeyError: <class 'robustness.datasets.MyDataset'>
Also, in the code above, could you please indicate how I can customize my PGD attack with these parameters:
ATTACK_EPS = 0.05
ATTACK_STEPSIZE = 0.01
ATTACK_STEPS = 100
TARGETED = True
CUSTOM_LOSS = None
I've been copying some examples and modifying the code to fit my dataset, but I'm getting a bit confused by the API.
@elcronos you need to add the training defaults for your dataset inside defaults.py, similar to the following:
TRAINING_DEFAULTS = {
    datasets.MyDataset: {
        "epochs": 150,
        "batch_size": 128,
        "weight_decay": 5e-4,
        "step_lr": 50
    },
    # ... (existing entries for the other datasets)
}
or use existing ones, e.g.
train_args = defaults.check_and_fill_args(train_args,
                        defaults.TRAINING_ARGS, datasets.ImageNet)
train_args = defaults.check_and_fill_args(train_args,
                        defaults.PGD_ARGS, datasets.ImageNet)
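The `KeyError` in the traceback comes from a dictionary lookup keyed by the dataset class. Here is a minimal, library-free sketch of that lookup pattern. It is a simplified stand-in, not the robustness source: `fill_by_dataset`, `FakeImageNet`, and the default values are hypothetical placeholders chosen for illustration.

```python
# Simplified stand-in for a per-dataset defaults table (not the library's code).
class FakeImageNet:
    pass


class MyDataset:
    pass


TRAINING_DEFAULTS = {
    FakeImageNet: {"epochs": 200, "batch_size": 256},  # placeholder values
}


def fill_by_dataset(args, names, ds_class):
    """Copy each missing argument from the defaults registered for ds_class."""
    filled = dict(args)
    for name in names:
        if name not in filled:
            # Raises KeyError if ds_class was never registered in the table.
            filled[name] = TRAINING_DEFAULTS[ds_class][name]
    return filled


# An unregistered class fails the same way as in the traceback.
try:
    fill_by_dataset({}, ["epochs"], MyDataset)
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Option 1: register defaults for the new dataset class.
TRAINING_DEFAULTS[MyDataset] = {"epochs": 150, "batch_size": 128}
own = fill_by_dataset({"batch_size": 64}, ["epochs", "batch_size"], MyDataset)

# Option 2: borrow the defaults of a class that is already registered.
borrowed = fill_by_dataset({}, ["epochs", "batch_size"], FakeImageNet)
```

Values passed explicitly (here `batch_size=64`) are kept, and only the missing ones are taken from the table.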
Regarding your PGD parameters for adversarial training, you can specify them all in
train_kwargs = {
'out_dir': "train_out",
'adv_train': 1,
'constraint': '2',
'eps': 0.05,
'attack_lr': 0.01,
'attack_steps': 100
}
but our train.train_model doesn't allow you to do targeted attacks for adversarial training. If you want to do a targeted PGD attack, you can write your own training loop similar to train.py, and every time you do a forward pass, call
out, xadv = attacker_model(x, y_target, make_adv=True)
which returns the targeted adversarial example xadv for the target y_target.
Hope this helps.
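To make the difference between targeted and untargeted PGD concrete, here is a rough, library-free sketch of a targeted L2 attack on a toy linear softmax classifier. It is not the library's attacker: `targeted_pgd_l2`, the toy weights, and the `eps`/`step_size` values are assumptions picked so the example is easy to follow. The key point is that a targeted attack steps down the loss toward `y_target` and then projects back onto the L2 ball of radius `eps`.

```python
import math


def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def logits(W, b, x):
    return [sum(w * v for w, v in zip(row, x)) + b_k for row, b_k in zip(W, b)]


def loss_grad_wrt_input(W, b, x, target):
    """Gradient of cross-entropy(softmax(Wx + b), target) with respect to x."""
    p = softmax(logits(W, b, x))
    err = [p_k - (1.0 if k == target else 0.0) for k, p_k in enumerate(p)]
    return [sum(err[k] * W[k][j] for k in range(len(W))) for j in range(len(x))]


def targeted_pgd_l2(W, b, x0, target, eps, step_size, steps):
    x = list(x0)
    for _ in range(steps):
        g = loss_grad_wrt_input(W, b, x, target)
        g_norm = math.sqrt(sum(v * v for v in g)) or 1.0
        # Targeted attack: move *down* the loss so the target class becomes likelier.
        x = [x_j - step_size * g_j / g_norm for x_j, g_j in zip(x, g)]
        # Project back onto the L2 ball of radius eps centred at the clean input.
        delta = [x_j - x0_j for x_j, x0_j in zip(x, x0)]
        d_norm = math.sqrt(sum(v * v for v in delta))
        if d_norm > eps:
            x = [x0_j + d * eps / d_norm for x0_j, d in zip(x0, delta)]
    return x


def predict(W, b, x):
    z = logits(W, b, x)
    return max(range(len(z)), key=z.__getitem__)


# Toy two-class problem: the clean input is classified as class 0.
W = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
x0 = [1.0, 0.0]
y_target = 1
eps = 2.0

x_adv = targeted_pgd_l2(W, b, x0, y_target, eps=eps, step_size=0.1, steps=100)
pred_clean = predict(W, b, x0)
pred_adv = predict(W, b, x_adv)
dist = math.sqrt(sum((a - c) ** 2 for a, c in zip(x_adv, x0)))
```

An untargeted attack would flip the sign of the step (ascending the loss on the true label) while keeping the same projection.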
Thanks again @Hadisalman,
I was able to fine-tune a custom model. I saw that once training finished, the path /outputs/8a975b65-bcfc-477b-b679-e7193f81a756 contains a 69_checkpoint.pt file. There is also a checkpoint.pt.best file.
So my question is now how I can load those checkpoints for inference later. I tried the code from an example:
ds = MyDataset('/path/to/dataset')
model, _ = make_and_restore_model(arch='resnet50', dataset=ds,
resume_path='./outputs/8a975b65-bcfc-477b-b679-e7193f81a756/69_checkpoint.pt')
I'm getting the error:
RuntimeError: Error(s) in loading state_dict for AttackerModel:
size mismatch for model.fc.weight: copying a param with shape torch.Size([10, 2048]) from checkpoint, the shape in current model is torch.Size([1000, 2048]).
size mismatch for model.fc.bias: copying a param with shape torch.Size([10]) from checkpoint, the shape in current model is torch.Size([1000]).
size mismatch for attacker.model.fc.weight: copying a param with shape torch.Size([10, 2048]) from checkpoint, the shape in current model is torch.Size([1000, 2048]).
size mismatch for attacker.model.fc.bias: copying a param with shape torch.Size([10]) from checkpoint, the shape in current model is torch.Size([1000]).
The problem seems to be the mismatch between the state_dict of my custom model, which has 10 outputs in the last layer, and the original resnet50, which has 1000. How can I modify the code so it loads my custom weights? I also noticed that the files in outputs contain much more information than I need. Is there any way to save and load only the model, without the other training parameters, in a .pt file?
Hi @elcronos!
Hi @andrewilyas,
Thanks for your prompt response. Maybe I was not clear enough.
I understand the part of changing the code:
ds = MyDataset('/path/to/dataset', batch_size=8)
linf_pgd_resnet, _ = make_and_restore_model(arch='resnet50', pytorch_pretrained=False,dataset=ds)
num_ftrs = linf_pgd_resnet.model.fc.in_features
num_classes = 10
linf_pgd_resnet.model.fc = nn.Linear(num_ftrs, num_classes)
But then my question is: how can I correctly load the weights of the model? In vanilla PyTorch I would usually do something like this:
checkpoint = torch.load('./outputs/8a975b65-bcfc-477b-b679-e7193f81a756/69_checkpoint.pt')
linf_pgd_resnet.load_state_dict(checkpoint['model'])
But in this case, it seems that the format the model uses is incompatible, and I get this error:
RuntimeError: Error(s) in loading state_dict for AttackerModel:
Missing key(s) in state_dict: "normalizer.new_mean", "normalizer.new_std", "model.conv1.weight", "model.bn1.weight", "model.bn1.bias", "model.bn1.running_mean", "model.bn1.running_var", "model.layer1.0.conv1.weight", "model.layer1.0.bn1.weight", "model.layer1.0.bn1.bias"
So clearly the library saves the model differently from the torchvision models, and that makes the keys incompatible. Is there any way I could solve this problem? How can I properly load the state_dict of a model saved with the robustness library?
I hope my question is clearer now.
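One common cause of "Missing key(s)" errors like the one above is a key prefix: a model saved while wrapped in `torch.nn.DataParallel` has every state_dict key prefixed with `module.`. Whether that applies to this particular checkpoint is an assumption, so it is worth printing a few keys of `checkpoint['model']` first. The sketch below uses plain dictionaries with placeholder values instead of tensors, and `strip_prefix` is a hypothetical helper.

```python
def strip_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with `prefix` removed from keys that have it."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


# Keys the model expects (taken from the error message above).
expected_keys = [
    "normalizer.new_mean",
    "normalizer.new_std",
    "model.conv1.weight",
    "model.bn1.weight",
]

# Assumed checkpoint layout: the same keys, each prefixed with "module.".
saved = {"module." + key: None for key in expected_keys}

missing_before = set(expected_keys) - set(saved)
cleaned = strip_prefix(saved)
missing_after = set(expected_keys) - set(cleaned)
```

With a real checkpoint, the cleaned dictionary would be what gets passed to `load_state_dict`.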
Hi, it seems like your definition of MyDataset looks like:
class MyDataset(DataSet):
def __init__(self, data_path,**kwargs):
self.num_classes = 1000
and so the library expects the checkpoint to have a 1000-class output layer. Try changing this to 10. Also, please see the repository that @Hadisalman linked (the robust-models-transfer repository), which covers how to do this in more depth.
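A quick way to confirm this kind of mismatch before loading is to compare parameter shapes. The sketch below works on plain shape tuples copied from the error message. `find_shape_mismatches` is a hypothetical helper, and with real tensors the shape dictionaries could be built with `{k: tuple(v.shape) for k, v in state_dict.items()}`.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {key: (checkpoint_shape, model_shape)} for keys whose shapes differ."""
    return {
        key: (shape, model_shapes[key])
        for key, shape in checkpoint_shapes.items()
        if key in model_shapes and model_shapes[key] != shape
    }


# Shapes reported in the RuntimeError earlier in the thread.
checkpoint_shapes = {
    "model.fc.weight": (10, 2048),
    "model.fc.bias": (10,),
}
model_shapes = {
    "model.fc.weight": (1000, 2048),
    "model.fc.bias": (1000,),
}

mismatches = find_shape_mismatches(checkpoint_shapes, model_shapes)

# The number of classes the checkpoint was trained with can be read off the bias.
num_classes_in_checkpoint = checkpoint_shapes["model.fc.bias"][0]
```

Here the checkpoint reports 10 classes, so a dataset class that declares `num_classes = 10` builds a model whose final layer matches.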
I see in the examples that there are some specific ways to load datasets such as ImageNet and CIFAR. However, I have a custom dataset with 10 labels with a directory structure with train/test folders. How can I finetune a pretrained model such as ResNet50, change the head so it has 10 outputs in the last layer and adversarially train it using this library?