Closed: wielandbrendel closed this issue 5 years ago
Before calling `get_logits`, there is also a preprocessing and a transpose you need to apply. See `adv_model.py`.
You can use `get_model_loader('checkpoint.npz').init(session)` to load the checkpoint into a session.
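Putting the two hints together, here is a minimal NumPy sketch of that preprocessing, based on what the thread later confirms (scale pixels from [0, 255] to [-1, +1] and transpose NHWC to NCHW); the exact ops used by the repo are in `adv_model.py`:

```python
import numpy as np

def preprocess(images):
    """images: NHWC batch (uint8 or float) with pixel values in [0, 255]."""
    images = np.asarray(images, dtype=np.float32)
    images = (images - 127.5) / 127.5      # scale to [-1, +1]
    return images.transpose(0, 3, 1, 2)    # NHWC -> NCHW

batch = preprocess(np.random.randint(0, 256, size=(1, 224, 224, 3)))
print(batch.shape)  # (1, 3, 224, 224)
```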
Ah, that's easy! I still get an exception though. When doing

```python
import tensorflow as tf
from tensorpack.tfutils import get_model_loader

sess = tf.Session()
model = get_model_loader('R152-Denoise.npz').init(sess)
```

I get the exception:
```
ValueError: Trying to load a tensor of shape (7, 7, 3, 64) into the variable 'conv0/W' whose shape is (7, 7, 224, 64).
```

Any idea?
This is because you did not do the transpose as I mentioned above. So it believes your image has 224 channels rather than 3 channels.
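To illustrate the shape mismatch with plain NumPy (a toy sketch, not the repo's code): a graph built for NCHW input reads axis 1 as the channel axis, so an untransposed NHWC batch makes `conv0` see 224 input channels.

```python
import numpy as np

# A batch in NHWC layout: 1 image, 224x224 pixels, 3 channels.
x = np.zeros((1, 224, 224, 3), dtype=np.float32)

# An NCHW graph treats axis 1 as channels, so it sees 224 "channels"
# and creates conv0/W with shape (7, 7, 224, 64).
print(x.shape[1])  # 224

# After the NHWC -> NCHW transpose the channel axis is correct:
x = x.transpose(0, 3, 1, 2)
print(x.shape[1])  # 3
```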
@ppwwyyxx I misunderstood how `get_model_loader` is to be used, thanks for the clarification! I got much further, but I am still not getting the model to give me the right predictions (the class with the max logit is always 599). This is my current minimal code:
```python
import nets
import argparse
from tensorpack import TowerContext
import tensorflow as tf
from tensorpack.tfutils import get_model_loader
from foolbox.utils import samples

sess = tf.Session()
with sess.as_default():
    parser = argparse.ArgumentParser()
    parser.add_argument('-d', '--depth', help='ResNet depth',
                        type=int, default=50, choices=[50, 101, 152])
    parser.add_argument('--arch', help='Name of architectures defined in nets.py',
                        default='ResNetDenoise')
    args = parser.parse_args([])

    model = getattr(nets, args.arch + 'Model')(args)
    image = tf.placeholder(tf.float32, shape=(1, 3, 224, 224))
    with TowerContext(tower_name='', is_training=False):
        logits = model.get_logits(image)
    model = get_model_loader('R152-Denoise.npz').init(sess)

    # loop to evaluate the model on 20 ImageNet samples (provided by Foolbox)
    for idx in range(20):
        sample, label = samples(index=idx)
        # preprocess image (channel transpose and normalization to [-1, +1])
        sample = sample.transpose([0, 3, 1, 2])
        sample -= 127.5
        sample /= 127.5
        _logits = sess.run(logits, feed_dict={image: sample})
        print(_logits[0, label[0]], _logits.max())
        print(label[0], _logits.argmax())
```
I am probably doing something wrong with the preprocessing but it seems to match exactly what you do. Do you have any ideas (sorry for bugging you...)?
You should've seen a long warning in your log.

`args = parser.parse_args([])`

This line should be `args = parser.parse_args()` .....

And please use `-d 152`.
@ppwwyyxx I was running this in a Jupyter notebook, and so I had to call `parser.parse_args([])` instead of `parser.parse_args()`. The wrong default argument was the problem though; now it works. Thanks a lot!
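For anyone else running this in a notebook: `parse_args()` reads `sys.argv`, which Jupyter fills with kernel flags, so passing the arguments as an explicit list sidesteps that while still overriding the wrong default. A sketch with a stripped-down parser mirroring the one above:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-d', '--depth', type=int, default=50, choices=[50, 101, 152])
parser.add_argument('--arch', default='ResNetDenoise')

# Passing a list bypasses sys.argv (which belongs to the Jupyter kernel).
args = parser.parse_args(['-d', '152'])
print(args.depth)  # 152
```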
Nice hack on jupyter!
btw, when evaluating on ImageNet images, our code also uses the standard ImageNet preprocessing (resize the shortest edge to 256 + center crop to 224). Resizing directly to 224 might give slightly worse results (but probably doesn't matter much in the context of attack/defense).
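A rough NumPy-only sketch of that resize + crop (nearest-neighbor resampling via index sampling for brevity; a real pipeline would use proper interpolation):

```python
import numpy as np

def resize_shortest_edge_then_center_crop(img, short=256, crop=224):
    """img: HWC array. Resize so min(H, W) == short, then center-crop."""
    h, w = img.shape[:2]
    scale = short / min(h, w)
    nh, nw = int(round(h * scale)), int(round(w * scale))
    # nearest-neighbor resize by sampling source indices
    rows = (np.arange(nh) * h / nh).astype(int)
    cols = (np.arange(nw) * w / nw).astype(int)
    img = img[rows][:, cols]
    top, left = (nh - crop) // 2, (nw - crop) // 2
    return img[top:top + crop, left:left + crop]
```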
Also, our model takes BGR images; however, your samples seem to be in RGB. This may also lead to slightly worse results.
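If your samples are RGB (as Foolbox's are), flipping the channel axis is enough to get BGR, e.g. with NumPy:

```python
import numpy as np

rgb = np.random.randint(0, 256, size=(1, 224, 224, 3), dtype=np.uint8)  # NHWC, RGB
bgr = rgb[..., ::-1]  # reverse the last (channel) axis: RGB -> BGR
print(bgr.shape)  # (1, 224, 224, 3)
```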
Thanks for releasing your defence! I'd love to play around with it in more detail, but I have trouble converting it into a standard TensorFlow graph (probably because I have no experience with tensorpack). Here is how far I got:
But now I'd like to get the session in which the logits and the pretrained weights live, which apparently is not the default session. Could you give me a quick hint on how to get that session? Thanks!