Closed rva5120 closed 7 years ago
Thanks for letting us know @rva5120. I haven't upgraded to 2.0 yet, but I'll take a look.
A general comment:
The code currently assumes keras < 2.0.0. The easiest solution is to add keras==1.2.2 to requirements.txt, but maybe a better solution is to remove any dependency on Keras in cleverhans? I feel that cnn_model (as well as some other functions) shipped with the package would be better off defined in the examples.
I see that the dependency on Keras goes a bit deeper than a few functions like cnn_model, but in general I am not sure this dependency is needed (or at least, it could be better separated). Anyway, the package already defines separate functions and utilities for Theano and TensorFlow. To make this comment more concrete, this could be achieved by adding "theano_function_kwargs" to model_argmax and th_model_train, making the MNIST download independent of Keras, and removing cnn_model.
If keeping the Keras dependency is strategic, maybe a better solution would be to use functions from keras.backend? This would avoid a split codebase in most places (except for some corner cases that require calling into a given Theano/TF API).
Yes, in the meantime I added the requirements.txt tweak in #111
I have been thinking about Keras as well. We designed cleverhans with the goal of providing the tools needed to apply adversarial attacks and defenses to models on TensorFlow (and optionally Theano). Keras was added in the tutorials to help users experiment with different architectures easily (by modifying the few lines that define the model with Keras). But in the end, Keras makes a lot of things more complicated than they should be (e.g., the learning phase variable in the Keras backend, which needs to be included in the feed dict when running Keras models with TensorFlow).
We are in the process of revising the API to harmonize the interface of the different attacks, and I'll get back to the Keras issue once that process is done.
I have fixed the Keras issue on my local version and I am happy to make a pull request if needed. Here are the steps I followed (please let me know if this can be improved and/or is not the right approach):
1 - Change the Convolution2D calls in cnn_model to the new Keras 2.0 API (Conv2D). Note that the following parameters have changed name/format:
- nb_filter -> filters
- The separate kernel dimension arguments become a single tuple argument, kernel_size. E.g. a legacy call Conv2D(10, 3, 3) becomes Conv2D(10, (3, 3)). kernel_size can also be set to an integer instead of a tuple, e.g. Conv2D(10, 3) is equivalent to Conv2D(10, (3, 3)).
- subsample -> strides. Can also be set to an integer.
- border_mode -> padding
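To make the renames above concrete, here is a minimal, hypothetical helper (not part of cleverhans or Keras) that maps the legacy Keras 1 Conv2D keyword arguments onto their Keras 2 names:

```python
def convert_conv2d_kwargs(nb_filter, nb_row, nb_col, **kwargs):
    """Translate legacy Keras 1 Conv2D arguments to Keras 2 names.

    Hypothetical illustration of the renames:
    nb_filter -> filters, (nb_row, nb_col) -> kernel_size,
    subsample -> strides, border_mode -> padding.
    """
    new_kwargs = {
        "filters": nb_filter,
        # the two spatial dimension arguments collapse into one tuple
        "kernel_size": (nb_row, nb_col),
    }
    if "subsample" in kwargs:
        new_kwargs["strides"] = kwargs.pop("subsample")
    if "border_mode" in kwargs:
        new_kwargs["padding"] = kwargs.pop("border_mode")
    new_kwargs.update(kwargs)  # pass any remaining arguments through unchanged
    return new_kwargs


# A legacy call like Conv2D(10, 3, 3, subsample=(2, 2), border_mode="same")
# would map to Conv2D(filters=10, kernel_size=(3, 3), strides=(2, 2),
# padding="same") under Keras 2.
```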
2 - Inside model_eval in utils_tf.py, at the end of the for loop we have the following:

```python
accuracy += cur_batch_size * acc_value.eval(
    feed_dict={x: X_test[start:end],
               y: Y_test[start:end],
               keras.backend.learning_phase(): 0})
```
I just added an if to skip the remaining points when the last batch is smaller than the batch size (not sure if this is the right approach, but at least it worked for now):

```python
if cur_batch_size == args.batch_size:
    accuracy += cur_batch_size * acc_value.eval(
        feed_dict={x: X_test[start:end],
                   y: Y_test[start:end],
                   keras.backend.learning_phase(): 0})
```
Thank you for the feedback @rafaelpossas
Regarding 2., we cannot ignore some test inputs when evaluating the accuracy. It's OK to repeat inputs processed during training, but we need to ensure the accuracy returned was computed by evaluating each test input exactly once.
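One framework-agnostic way to satisfy this requirement is to let the final batch simply be shorter instead of skipping it. Here is a minimal sketch (plain Python, with a hypothetical predict_fn standing in for the model) of a batching loop that evaluates every test input exactly once:

```python
def batched_accuracy(predict_fn, X_test, Y_test, batch_size):
    """Compute accuracy over all test inputs, exactly once each.

    predict_fn is assumed to take a list of inputs and return a list of
    predicted labels; the final batch may be smaller than batch_size,
    which is fine because we weight each batch by its actual size.
    """
    assert len(X_test) == len(Y_test)
    correct = 0
    for start in range(0, len(X_test), batch_size):
        end = min(start + batch_size, len(X_test))  # last batch may be short
        preds = predict_fn(X_test[start:end])
        correct += sum(p == y for p, y in zip(preds, Y_test[start:end]))
    return correct / len(X_test)
```

Because the accumulator counts correct predictions (rather than averaging per-batch accuracies with a fixed weight), the short final batch contributes exactly its share and no input is dropped or double-counted.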
I get this error when I run the unmodified version of the tutorial on CPU with the latest Keras. The error does not show up with Keras 1.2.