Tensorflow graph - Githubissues

wielandbrendel commented 5 years ago

Thanks for releasing your defence! I'd love to play around with it in more detail, but I'm having trouble converting it into a standard TensorFlow graph (probably because I have no experience with tensorpack). Here is how far I got:

parser = argparse.ArgumentParser()
parser.add_argument('-d', '--depth', help='ResNet depth',
                    type=int, default=50, choices=[50, 101, 152])
parser.add_argument('--arch', help='Name of architectures defined in nets.py',
                    default='ResNetDenoise')
args = parser.parse_args([])

model = getattr(nets, args.arch + 'Model')(args)

image = tf.placeholder(tf.float32, shape=(1, 224, 224, 3))

with TowerContext(tower_name='', is_training=False):
    logits = model.get_logits(image)

But now I'd like to get the session in which the logits and the pretrained weights live, which apparently is not the default session. Could you give me a quick hint how to get that session? Thanks!

ppwwyyxx commented 5 years ago

Before calling get_logits, you also need to apply a preprocessing step and a transpose. See adv_model.py.
You can use get_model_loader('checkpoint.npz').init(session) to load the checkpoint into a session.

wielandbrendel commented 5 years ago

Ah, that's easy! I still get an exception though. When doing

from tensorpack.tfutils import get_model_loader
sess = tf.Session()
model = get_model_loader('R152-Denoise.npz').init(sess)

I get the exception:

ValueError: Trying to load a tensor of shape (7, 7, 3, 64) into the variable 'conv0/W' whose shape is (7, 7, 224, 64).

Any idea?

ppwwyyxx commented 5 years ago

> ValueError: Trying to load a tensor of shape (7, 7, 3, 64) into the variable 'conv0/W' whose shape is (7, 7, 224, 64). Any idea?

This is because you did not do the transpose as I mentioned above. So it believes your image has 224 channels rather than 3 channels.
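To illustrate the layout issue, here is a minimal NumPy sketch. It assumes a batch of one 224x224 image stored channels-last (NHWC); the array names are illustrative and not taken from the repository.

```python
import numpy as np

# Hypothetical batch of one image in NHWC layout: (batch, height, width, channels).
nhwc = np.zeros((1, 224, 224, 3), dtype=np.float32)

# Move the channel axis to position 1 to get NCHW: (batch, channels, height, width).
nchw = nhwc.transpose(0, 3, 1, 2)

print(nhwc.shape)  # (1, 224, 224, 3)
print(nchw.shape)  # (1, 3, 224, 224)
```

If the NHWC array is passed where NCHW is expected, axis 1 (size 224) is read as the channel count, which is consistent with the (7, 7, 224, 64) shape in the error message.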

wielandbrendel commented 5 years ago

@ppwwyyxx I misunderstood how get_model_loader is meant to be used, thanks for the clarification! I got much further, but the model still does not give me the right predictions (the class with the maximum logit is always 599). This is my current minimal code:

import nets
import argparse
from tensorpack import TowerContext
import tensorflow as tf

from tensorpack.tfutils import get_model_loader
from foolbox.utils import samples

sess = tf.Session()

with sess.as_default():
    parser = argparse.ArgumentParser()
    parser.add_argument('-d', '--depth', help='ResNet depth',
                        type=int, default=50, choices=[50, 101, 152])
    parser.add_argument('--arch', help='Name of architectures defined in nets.py',
                        default='ResNetDenoise')
    args = parser.parse_args([])

    model = getattr(nets, args.arch + 'Model')(args)

    image = tf.placeholder(tf.float32, shape=(1, 3, 224, 224))

    with TowerContext(tower_name='', is_training=False):
        logits = model.get_logits(image)

    model = get_model_loader('R152-Denoise.npz').init(sess)

# loop to evaluate the model on 20 Imagenet samples (provided by Foolbox)
for idx in range(20):
    sample, label = samples(index=idx)

    # preprocess image (NHWC -> NCHW, then scale pixel values to [-1, 1])
    sample = sample.transpose([0, 3, 1, 2])
    sample -= 127.5
    sample /= 127.5

    _logits = sess.run(logits, feed_dict={image : sample})

    print(_logits[0, label[0]], _logits.max())
    print(label[0], _logits.argmax())

I am probably doing something wrong with the preprocessing but it seems to match exactly what you do. Do you have any ideas (sorry for bugging you...)?

ppwwyyxx commented 5 years ago

You should've seen a long warning in your log.

args = parser.parse_args([])

This line should be args = parser.parse_args(). And please use -d 152.

wielandbrendel commented 5 years ago

@ppwwyyxx I was running this in a Jupyter notebook, so I had to call parser.parse_args([]) instead of parser.parse_args(). The wrong default argument was indeed the problem, and now it works. Thanks a lot!
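For anyone else running this in a notebook, here is a small sketch of the difference, using only the two arguments from the snippet above. Passing the flags as an explicit list is one possible way to set the depth without a command line; it is not necessarily how the repository's scripts are meant to be invoked.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-d', '--depth', help='ResNet depth',
                    type=int, default=50, choices=[50, 101, 152])
parser.add_argument('--arch', help='Name of architectures defined in nets.py',
                    default='ResNetDenoise')

# An empty list ignores sys.argv, so every option falls back to its default.
defaults = parser.parse_args([])

# Passing the flag explicitly selects the depth that matches the R152 checkpoint.
args = parser.parse_args(['-d', '152'])

print(defaults.depth, args.depth)  # 50 152
```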

ppwwyyxx commented 5 years ago

Nice hack on jupyter!

By the way, when evaluating on ImageNet images, our code also uses the standard ImageNet preprocessing (resize the shortest edge to 256, then center crop to 224). Resizing directly to 224 might give slightly worse results, although it probably doesn't matter much in the context of attack/defense.
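A rough NumPy-only sketch of that preprocessing is below. The helper names are hypothetical, and the nearest-neighbour resize is an assumption made to keep the example dependency-free; the repository's own code may use a different interpolation.

```python
import numpy as np

def resize_shortest_edge(img, size=256):
    """Nearest-neighbour resize so that the shorter side equals `size` (hypothetical helper)."""
    h, w = img.shape[:2]
    scale = size / min(h, w)
    new_h, new_w = max(size, round(h * scale)), max(size, round(w * scale))
    # Map every output row/column back to a source index, clamped to the image bounds.
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return img[rows][:, cols]

def center_crop(img, size=224):
    """Cut a `size` x `size` window out of the middle of the image (hypothetical helper)."""
    h, w = img.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]

# Dummy 375x500 image in HWC layout standing in for an ImageNet sample.
img = np.zeros((375, 500, 3), dtype=np.uint8)
resized = resize_shortest_edge(img)
out = center_crop(resized)

print(resized.shape)  # (256, 341, 3)
print(out.shape)      # (224, 224, 3)
```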

ppwwyyxx commented 5 years ago

Also, our model takes BGR images, whereas your samples seem to be in RGB. This may also lead to slightly worse results.
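A minimal sketch of the channel swap, assuming the samples are still in NHWC layout with RGB order on the last axis (the one-pixel array is only for illustration):

```python
import numpy as np

# Hypothetical one-pixel "image" in NHWC layout with channels ordered R, G, B.
rgb = np.array([[[[255.0, 128.0, 0.0]]]], dtype=np.float32)

# Reversing the last axis turns RGB into BGR.
bgr = rgb[..., ::-1]

print(bgr[0, 0, 0].tolist())  # [0.0, 128.0, 255.0]
```

If the swap is applied after the transpose to NCHW, the channel axis is axis 1, so the reversal would be `x[:, ::-1]` instead.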

facebookresearch / ImageNet-Adversarial-Training

Tensorflow graph #2