Unable to replicate default training on CIFAR10

NicolasMICAUX commented 3 years ago

Hello, I ran `python3 train.py --dataset cifar10 --default 1` with TensorFlow 1.x, on the unmodified code cloned from this repo. Training took about 3 h 30 min, but the result is `Victim Model || validation accuracy: 0.8653846085071564, watermark success: 0.10416666666666667`, which seems like very poor performance.

Have you recently been able to run this code successfully? Thank you for your attention.

The logs are:

WARNING:tensorflow:Entity <function _get_dataset_from_filename at 0x7f84171da950> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: module 'gast' has no attribute 'Index'
WARNING:tensorflow:Entity <function _get_dataset_from_filename at 0x7f84171da950> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: module 'gast' has no attribute 'Index'
WARNING:tensorflow:Entity <bound method TopLevelFeature.decode_example of FeaturesDict({
    'id': Text(shape=(), dtype=tf.string),
    'image': Image(shape=(32, 32, 3), dtype=tf.uint8),
    'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=10),
})> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Bad argument number for Name: 3, expecting 4
WARNING:tensorflow:Entity <bound method TopLevelFeature.decode_example of FeaturesDict({
    'id': Text(shape=(), dtype=tf.string),
    'image': Image(shape=(32, 32, 3), dtype=tf.uint8),
    'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=10),
})> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Bad argument number for Name: 3, expecting 4
2021-06-29 09:27:23.750566: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2021-06-29 09:27:23.788338: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-29 09:27:23.789144: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 0 with properties: 
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:00:04.0
2021-06-29 09:27:23.789449: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2021-06-29 09:27:23.791271: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-06-29 09:27:23.798880: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2021-06-29 09:27:23.799260: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2021-06-29 09:27:23.801274: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2021-06-29 09:27:23.809596: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2021-06-29 09:27:23.825124: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-06-29 09:27:23.825318: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-29 09:27:23.826240: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-29 09:27:23.827010: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1767] Adding visible gpu devices: 0
WARNING:tensorflow:Entity <function _get_dataset_from_filename at 0x7f84171da950> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: module 'gast' has no attribute 'Index'
WARNING:tensorflow:Entity <function _get_dataset_from_filename at 0x7f84171da950> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: module 'gast' has no attribute 'Index'
WARNING:tensorflow:Entity <bound method TopLevelFeature.decode_example of FeaturesDict({
    'id': Text(shape=(), dtype=tf.string),
    'image': Image(shape=(32, 32, 3), dtype=tf.uint8),
    'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=10),
})> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Bad argument number for Name: 3, expecting 4
WARNING:tensorflow:Entity <bound method TopLevelFeature.decode_example of FeaturesDict({
    'id': Text(shape=(), dtype=tf.string),
    'image': Image(shape=(32, 32, 3), dtype=tf.uint8),
    'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=10),
})> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Bad argument number for Name: 3, expecting 4
2021-06-29 09:27:23.979333: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2299995000 Hz
2021-06-29 09:27:23.979606: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55bfb4472d80 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-06-29 09:27:23.979644: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2021-06-29 09:27:24.089919: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-29 09:27:24.090751: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55bfb4472680 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-06-29 09:27:24.090789: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Tesla K80, Compute Capability 3.7
2021-06-29 09:27:24.090898: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1180] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-06-29 09:27:24.090920: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1186]      
2021-06-29 09:27:24.093346: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1180] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-06-29 09:27:24.093387: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1186]      
tcmalloc: large alloc 1228800000 bytes == 0x55bfd00da000 @  0x7f8489c661e7 0x7f8486bd746e 0x7f8486c27c7b 0x7f8486c27d18 0x7f8486ce3d79 0x7f8486ce6e4c 0x7f8486e05e7f 0x7f8486e0bfb5 0x7f8486e0de3d 0x7f8486e0f516 0x55bfb339bf30 0x55bfb339bb09 0x7f8486cee4d8 0x55bfb337e303 0x55bfb3484646 0x55bfb340c785 0x55bfb34094ae 0x55bfb34091b3 0x55bfb34d3182 0x55bfb34d34fd 0x55bfb34d33a6 0x55bfb34aa723 0x55bfb34aa3cc 0x7f8488a50bf7 0x55bfb34aa2aa
2021-06-29 09:27:35.067325: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1180] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-06-29 09:27:35.067420: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1186]      
2021-06-29 09:27:35.069059: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1180] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-06-29 09:27:35.069099: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1186]      
tcmalloc: large alloc 1800364032 bytes == 0x55c037306000 @  0x7f8489c661e7 0x7f8486bd746e 0x7f8486c27c7b 0x7f8486c27d18 0x7f8486ce3d79 0x7f8486ce6e4c 0x7f8486e05e7f 0x7f8486e0bfb5 0x7f8486e0de3d 0x7f8486e0f516 0x55bfb339bf30 0x55bfb339bb09 0x7f8486cee4d8 0x55bfb337e303 0x55bfb3484646 0x55bfb340c785 0x55bfb34097ad 0x55bfb339c3ea 0x55bfb340a3b5 0x55bfb34094ae 0x55bfb34091b3 0x55bfb34d3182 0x55bfb34d34fd 0x55bfb34d33a6 0x55bfb34aa723 0x55bfb34aa3cc 0x7f8488a50bf7 0x55bfb34aa2aa
WARNING:tensorflow:From train.py:111: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

WARNING:tensorflow:From train.py:111: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

WARNING:tensorflow:From /tensorflow-1.15.2/python3.7/tensorflow_core/python/ops/image_ops_impl.py:1518: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
WARNING:tensorflow:From /tensorflow-1.15.2/python3.7/tensorflow_core/python/ops/image_ops_impl.py:1518: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
WARNING:tensorflow:From /content/PAF/entangled_watermark_v1/models.py:71: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

WARNING:tensorflow:From /content/PAF/entangled_watermark_v1/models.py:71: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

WARNING:tensorflow:From /content/PAF/entangled_watermark_v1/resnet.py:26: conv2d (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.keras.layers.Conv2D` instead.
WARNING:tensorflow:From /content/PAF/entangled_watermark_v1/resnet.py:26: conv2d (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.keras.layers.Conv2D` instead.
WARNING:tensorflow:From /tensorflow-1.15.2/python3.7/tensorflow_core/python/layers/convolutional.py:424: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
WARNING:tensorflow:From /tensorflow-1.15.2/python3.7/tensorflow_core/python/layers/convolutional.py:424: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
WARNING:tensorflow:From /content/PAF/entangled_watermark_v1/resnet.py:126: flatten (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.flatten instead.
WARNING:tensorflow:From /content/PAF/entangled_watermark_v1/resnet.py:126: flatten (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.flatten instead.
WARNING:tensorflow:From /content/PAF/entangled_watermark_v1/resnet.py:33: dense (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.Dense instead.
WARNING:tensorflow:From /content/PAF/entangled_watermark_v1/resnet.py:33: dense (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.Dense instead.
WARNING:tensorflow:From /content/PAF/entangled_watermark_v1/models.py:176: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead.

WARNING:tensorflow:From /content/PAF/entangled_watermark_v1/models.py:176: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead.

WARNING:tensorflow:From /tensorflow-1.15.2/python3.7/tensorflow_core/python/ops/math_grad.py:1424: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:From /tensorflow-1.15.2/python3.7/tensorflow_core/python/ops/math_grad.py:1424: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:From train.py:129: The name tf.train.Saver is deprecated. Please use tf.compat.v1.train.Saver instead.

WARNING:tensorflow:From train.py:129: The name tf.train.Saver is deprecated. Please use tf.compat.v1.train.Saver instead.

WARNING:tensorflow:From train.py:130: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

WARNING:tensorflow:From train.py:130: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

2021-06-29 09:27:46.283988: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-29 09:27:46.284790: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 0 with properties: 
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:00:04.0
2021-06-29 09:27:46.284891: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2021-06-29 09:27:46.284942: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-06-29 09:27:46.284991: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2021-06-29 09:27:46.285038: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2021-06-29 09:27:46.285081: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2021-06-29 09:27:46.285124: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2021-06-29 09:27:46.285169: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-06-29 09:27:46.285265: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-29 09:27:46.286123: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-29 09:27:46.286943: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1767] Adding visible gpu devices: 0
2021-06-29 09:27:46.291237: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2021-06-29 09:27:46.292958: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1180] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-06-29 09:27:46.292992: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1186]      0 
2021-06-29 09:27:46.293011: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1199] 0:   N 
2021-06-29 09:27:46.293289: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-29 09:27:46.294116: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-29 09:27:46.294914: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2021-06-29 09:27:46.294960: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1325] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10813 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
WARNING:tensorflow:From train.py:131: The name tf.global_variables_initializer is deprecated. Please use tf.compat.v1.global_variables_initializer instead.

WARNING:tensorflow:From train.py:131: The name tf.global_variables_initializer is deprecated. Please use tf.compat.v1.global_variables_initializer instead.

Victim Model || validation accuracy: 0.8653846085071564, watermark success: 0.10416666666666667

Jimntu commented 2 years ago

Same problem; I got even lower accuracy on MNIST. Have you figured out why?

NicolasMICAUX commented 2 years ago

> Same problem; I got even lower accuracy on MNIST. Have you figured out why?

No. It was part of a one-week student project, so I didn't spend much time on this.

NicolasMICAUX commented 2 years ago

Hope you'll figure it out easily :)

bussfromspace commented 1 year ago

Hi, I had the same problem, but then I decreased the trigger set size from 5000 (the complete source class) to 100, and the results were in line with the paper. I did not change any other default configs for CIFAR10. Here are my results:

| Trigger set size | Test accuracy (%) | Watermark accuracy (%) |
|---|---|---|
| 5000 | 80.73 | 13.70 |
| 100 | 90.29 | 27.00 |
| in paper | 85.41 | 25.74 |

Note: I converted the implementation to PyTorch, so this might still not work for this repo.
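For readers who want to try the same change, reducing the trigger set amounts to subsampling the source-class images before the watermark data is built. The following is only a minimal NumPy sketch of that idea: the function name `subsample_trigger_set`, the `seed` parameter, and the dummy array standing in for the source-class images are all hypothetical and are not taken from this repo or from the port described above.

```python
import numpy as np


def subsample_trigger_set(x_source, size=100, seed=0):
    """Pick `size` distinct examples at random from the source-class images.

    Hypothetical helper for illustration; not part of the repository.
    """
    if size > len(x_source):
        raise ValueError("requested more triggers than source examples")
    rng = np.random.default_rng(seed)
    # Sample indices without replacement so no image is used twice.
    idx = rng.choice(len(x_source), size=size, replace=False)
    return x_source[idx]


# Dummy stand-in for the 5000 CIFAR-10 images of one source class (32x32 RGB).
x_source = np.zeros((5000, 32, 32, 3), dtype=np.uint8)
triggers = subsample_trigger_set(x_source, size=100)
print(triggers.shape)  # (100, 32, 32, 3)
```

Fixing the seed keeps the selected subset reproducible between runs, which makes it easier to compare results for different trigger set sizes.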

Sun-Jiatuhao commented 1 year ago

> Hi, I had the same problem, but then I decreased the trigger set size from 5000 (the complete source class) to 100, and the results were in line with the paper. I did not change any other default configs for CIFAR10. Here are my results:
>
> | Trigger set size | Test accuracy (%) | Watermark accuracy (%) |
> |---|---|---|
> | 5000 | 80.73 | 13.70 |
> | 100 | 90.29 | 27.00 |
> | in paper | 85.41 | 25.74 |
>
> Note: I converted the implementation to Torch, so it might still not work for this repo.

Hi, I had the same problem. I am now converting the code to PyTorch, but I have run into some problems. Could I get in touch with you?

cleverhans-lab / entangled-watermark

Unable to replicate default training on CIFAR10 #5