The code works perfectly well when I use the line model = tf.keras.Sequential() instead of model = DPSequential(l2_norm_clip=1.0, noise_multiplier=1.1).
When I use DPSequential with a GPU for training, I get this error:
2023-11-27 11:35:21.698262: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-11-27 11:35:22.224200: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 20658 MB memory: -> device: 0, name: NVIDIA A10, pci bus id: 0000:ca:00.0, compute capability: 8.6
2023-11-27 11:35:22.591087: W tensorflow/core/framework/cpu_allocator_impl.cc:82] Allocation of 2018750000 exceeds 10% of free system memory.
Epoch 1/5
WARNING:tensorflow:Using a while_loop for converting StringSplit
WARNING:tensorflow:Using a while_loop for converting StringSplit
WARNING:tensorflow:Using a while_loop for converting LookupTableFindV2
WARNING:tensorflow:Using a while_loop for converting LookupTableFindV2
WARNING:tensorflow:Using a while_loop for converting LookupTableSizeV2
WARNING:tensorflow:Using a while_loop for converting LookupTableSizeV2
WARNING:tensorflow:Using a while_loop for converting LookupTableFindV2
WARNING:tensorflow:Using a while_loop for converting LookupTableFindV2
WARNING:tensorflow:Using a while_loop for converting LookupTableSizeV2
WARNING:tensorflow:Using a while_loop for converting LookupTableSizeV2
WARNING:tensorflow:Using a while_loop for converting StringToHashBucketFast
WARNING:tensorflow:Using a while_loop for converting StringToHashBucketFast
WARNING:tensorflow:Using a while_loop for converting SparseFillEmptyRows
WARNING:tensorflow:Using a while_loop for converting SparseFillEmptyRows
WARNING:tensorflow:Using a while_loop for converting Unique
WARNING:tensorflow:Using a while_loop for converting Unique
WARNING:tensorflow:Using a while_loop for converting ResourceGather
WARNING:tensorflow:Using a while_loop for converting ResourceGather
WARNING:tensorflow:Using a while_loop for converting SparseSegmentSqrtN
WARNING:tensorflow:Using a while_loop for converting SparseSegmentSqrtN
WARNING:tensorflow:Using a while_loop for converting StringSplit
WARNING:tensorflow:Using a while_loop for converting StringSplit
WARNING:tensorflow:Using a while_loop for converting LookupTableFindV2
WARNING:tensorflow:Using a while_loop for converting LookupTableFindV2
WARNING:tensorflow:Using a while_loop for converting LookupTableSizeV2
WARNING:tensorflow:Using a while_loop for converting LookupTableSizeV2
WARNING:tensorflow:Using a while_loop for converting LookupTableFindV2
WARNING:tensorflow:Using a while_loop for converting LookupTableFindV2
WARNING:tensorflow:Using a while_loop for converting LookupTableSizeV2
WARNING:tensorflow:Using a while_loop for converting LookupTableSizeV2
WARNING:tensorflow:Using a while_loop for converting StringToHashBucketFast
WARNING:tensorflow:Using a while_loop for converting StringToHashBucketFast
WARNING:tensorflow:Using a while_loop for converting SparseFillEmptyRows
WARNING:tensorflow:Using a while_loop for converting SparseFillEmptyRows
WARNING:tensorflow:Using a while_loop for converting Unique
WARNING:tensorflow:Using a while_loop for converting Unique
WARNING:tensorflow:Using a while_loop for converting ResourceGather
WARNING:tensorflow:Using a while_loop for converting ResourceGather
WARNING:tensorflow:Using a while_loop for converting SparseSegmentSqrtN
WARNING:tensorflow:Using a while_loop for converting SparseSegmentSqrtN
2023-11-27 11:35:28.468623: W tensorflow/core/framework/op_kernel.cc:1745] OP_REQUIRES failed at xla_ops.cc:241 : INVALID_ARGUMENT: Trying to access resource hash_table_/tmp/tmpPJ1wRH/tokens.txt_-2_-1_load_1_12 located in device /job:localhost/replica:0/task:0/device:CPU:0 from device /job:localhost/replica:0/task:0/device:GPU:0
Traceback (most recent call last):
File "/home/local/ASUAD/hguan6/Model_Deduplication_Train_Text_Classification_Model/model_trainer_sentiment_task.py", line 158, in <module>
main()
File "/home/local/ASUAD/hguan6/Model_Deduplication_Train_Text_Classification_Model/model_trainer_sentiment_task.py", line 144, in main
model.fit(
File "/home/local/ASUAD/hguan6/miniconda3/envs/model_dedup/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/home/local/ASUAD/hguan6/miniconda3/envs/model_dedup/lib/python3.9/site-packages/tensorflow/python/eager/execute.py", line 54, in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError: Graph execution error:
Detected at node 'StatefulPartitionedCall' defined at (most recent call last):
File "/home/local/ASUAD/hguan6/Model_Deduplication_Train_Text_Classification_Model/model_trainer_sentiment_task.py", line 158, in <module>
main()
File "/home/local/ASUAD/hguan6/Model_Deduplication_Train_Text_Classification_Model/model_trainer_sentiment_task.py", line 144, in main
model.fit(
File "/home/local/ASUAD/hguan6/miniconda3/envs/model_dedup/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 64, in error_handler
return fn(*args, **kwargs)
File "/home/local/ASUAD/hguan6/miniconda3/envs/model_dedup/lib/python3.9/site-packages/keras/engine/training.py", line 1384, in fit
tmp_logs = self.train_function(iterator)
File "/home/local/ASUAD/hguan6/miniconda3/envs/model_dedup/lib/python3.9/site-packages/keras/engine/training.py", line 1021, in train_function
return step_function(self, iterator)
File "/home/local/ASUAD/hguan6/miniconda3/envs/model_dedup/lib/python3.9/site-packages/keras/engine/training.py", line 1010, in step_function
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "/home/local/ASUAD/hguan6/miniconda3/envs/model_dedup/lib/python3.9/site-packages/keras/engine/training.py", line 1000, in run_step
outputs = model.train_step(data)
Node: 'StatefulPartitionedCall'
Trying to access resource hash_table_/tmp/tmpPJ1wRH/tokens.txt_-2_-1_load_1_12 located in device /job:localhost/replica:0/task:0/device:CPU:0 from device /job:localhost/replica:0/task:0/device:GPU:0
[[{{node StatefulPartitionedCall}}]] [Op:__inference_train_function_5501]
When I use the CPU for training, I get:
2023-11-27 11:39:31.506582: E tensorflow/stream_executor/cuda/cuda_driver.cc:271] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2023-11-27 11:39:31.506623: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: en4217519l
2023-11-27 11:39:31.506631: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: en4217519l
2023-11-27 11:39:31.507183: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:200] libcuda reported version is: 515.43.4
2023-11-27 11:39:31.507210: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:204] kernel reported version is: 515.43.4
2023-11-27 11:39:31.507218: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:310] kernel version seems to match DSO: 515.43.4
2023-11-27 11:39:31.508100: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-11-27 11:39:31.602036: W tensorflow/core/framework/cpu_allocator_impl.cc:82] Allocation of 2018750000 exceeds 10% of free system memory.
Epoch 1/5
WARNING:tensorflow:Using a while_loop for converting StringSplit
WARNING:tensorflow:Using a while_loop for converting StringSplit
WARNING:tensorflow:Using a while_loop for converting LookupTableFindV2
WARNING:tensorflow:Using a while_loop for converting LookupTableFindV2
WARNING:tensorflow:Using a while_loop for converting LookupTableSizeV2
WARNING:tensorflow:Using a while_loop for converting LookupTableSizeV2
WARNING:tensorflow:Using a while_loop for converting LookupTableFindV2
WARNING:tensorflow:Using a while_loop for converting LookupTableFindV2
WARNING:tensorflow:Using a while_loop for converting LookupTableSizeV2
WARNING:tensorflow:Using a while_loop for converting LookupTableSizeV2
WARNING:tensorflow:Using a while_loop for converting StringToHashBucketFast
WARNING:tensorflow:Using a while_loop for converting StringToHashBucketFast
WARNING:tensorflow:Using a while_loop for converting SparseFillEmptyRows
WARNING:tensorflow:Using a while_loop for converting SparseFillEmptyRows
WARNING:tensorflow:Using a while_loop for converting Unique
WARNING:tensorflow:Using a while_loop for converting Unique
WARNING:tensorflow:Using a while_loop for converting ResourceGather
WARNING:tensorflow:Using a while_loop for converting ResourceGather
WARNING:tensorflow:Using a while_loop for converting SparseSegmentSqrtN
WARNING:tensorflow:Using a while_loop for converting SparseSegmentSqrtN
WARNING:tensorflow:Using a while_loop for converting StringSplit
WARNING:tensorflow:Using a while_loop for converting StringSplit
WARNING:tensorflow:Using a while_loop for converting LookupTableFindV2
WARNING:tensorflow:Using a while_loop for converting LookupTableFindV2
WARNING:tensorflow:Using a while_loop for converting LookupTableSizeV2
WARNING:tensorflow:Using a while_loop for converting LookupTableSizeV2
WARNING:tensorflow:Using a while_loop for converting LookupTableFindV2
WARNING:tensorflow:Using a while_loop for converting LookupTableFindV2
WARNING:tensorflow:Using a while_loop for converting LookupTableSizeV2
WARNING:tensorflow:Using a while_loop for converting LookupTableSizeV2
WARNING:tensorflow:Using a while_loop for converting StringToHashBucketFast
WARNING:tensorflow:Using a while_loop for converting StringToHashBucketFast
WARNING:tensorflow:Using a while_loop for converting SparseFillEmptyRows
WARNING:tensorflow:Using a while_loop for converting SparseFillEmptyRows
WARNING:tensorflow:Using a while_loop for converting Unique
WARNING:tensorflow:Using a while_loop for converting Unique
WARNING:tensorflow:Using a while_loop for converting ResourceGather
WARNING:tensorflow:Using a while_loop for converting ResourceGather
WARNING:tensorflow:Using a while_loop for converting SparseSegmentSqrtN
WARNING:tensorflow:Using a while_loop for converting SparseSegmentSqrtN
2023-11-27 11:39:37.224042: I tensorflow/compiler/xla/service/service.cc:171] XLA service 0x7f53b01efdb0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2023-11-27 11:39:37.224100: I tensorflow/compiler/xla/service/service.cc:179] StreamExecutor device (0): Host, Default Version
2023-11-27 11:39:37.637389: W tensorflow/core/framework/op_kernel.cc:1745] OP_REQUIRES failed at xla_ops.cc:248 : INVALID_ARGUMENT: Detected unsupported operations when trying to compile graph __inference_train_step_5448[] on XLA_CPU_JIT: _Arg (No registered '_Arg' OpKernel for XLA_CPU_JIT devices compatible with node {{node data}}
(OpKernel was found, but attributes didn't match) Requested Attributes: T=DT_STRING, _output_shapes=[[32]], _user_specified_name="data", index=0){{node data}}
The op is created at:
File "Model_Deduplication_Train_Text_Classification_Model/model_trainer_sentiment_task.py", line 158, in <module>
main()
File "Model_Deduplication_Train_Text_Classification_Model/model_trainer_sentiment_task.py", line 144, in main
model.fit(
File "miniconda3/envs/model_dedup/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 64, in error_handler
return fn(*args, **kwargs)
File "miniconda3/envs/model_dedup/lib/python3.9/site-packages/keras/engine/training.py", line 1384, in fit
tmp_logs = self.train_function(iterator)
File "miniconda3/envs/model_dedup/lib/python3.9/site-packages/keras/engine/training.py", line 1021, in train_function
return step_function(self, iterator)
File "miniconda3/envs/model_dedup/lib/python3.9/site-packages/keras/engine/training.py", line 1010, in step_function
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "miniconda3/envs/model_dedup/lib/python3.9/site-packages/keras/engine/training.py", line 1000, in run_step
outputs = model.train_step(data)
Traceback (most recent call last):
File "/home/local/ASUAD/hguan6/Model_Deduplication_Train_Text_Classification_Model/model_trainer_sentiment_task.py", line 158, in <module>
main()
File "/home/local/ASUAD/hguan6/Model_Deduplication_Train_Text_Classification_Model/model_trainer_sentiment_task.py", line 144, in main
model.fit(
File "/home/local/ASUAD/hguan6/miniconda3/envs/model_dedup/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/home/local/ASUAD/hguan6/miniconda3/envs/model_dedup/lib/python3.9/site-packages/tensorflow/python/eager/execute.py", line 54, in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError: Graph execution error:
Detected unsupported operations when trying to compile graph __inference_train_step_5448[] on XLA_CPU_JIT: _Arg (No registered '_Arg' OpKernel for XLA_CPU_JIT devices compatible with node {{node data}}
(OpKernel was found, but attributes didn't match) Requested Attributes: T=DT_STRING, _output_shapes=[[32]], _user_specified_name="data", index=0){{node data}}
The op is created at:
File "Model_Deduplication_Train_Text_Classification_Model/model_trainer_sentiment_task.py", line 158, in <module>
main()
File "Model_Deduplication_Train_Text_Classification_Model/model_trainer_sentiment_task.py", line 144, in main
model.fit(
File "miniconda3/envs/model_dedup/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 64, in error_handler
return fn(*args, **kwargs)
File "miniconda3/envs/model_dedup/lib/python3.9/site-packages/keras/engine/training.py", line 1384, in fit
tmp_logs = self.train_function(iterator)
File "miniconda3/envs/model_dedup/lib/python3.9/site-packages/keras/engine/training.py", line 1021, in train_function
return step_function(self, iterator)
File "miniconda3/envs/model_dedup/lib/python3.9/site-packages/keras/engine/training.py", line 1010, in step_function
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "miniconda3/envs/model_dedup/lib/python3.9/site-packages/keras/engine/training.py", line 1000, in run_step
outputs = model.train_step(data)
[[StatefulPartitionedCall]] [Op:__inference_train_function_5501]
Maybe it is because hub.KerasLayer includes a preprocessing layer, an embedding layer, and an embedding-lookup layer (https://www.kaggle.com/models/google/wiki-words/frameworks/tensorFlow2).
I also tried other approaches, such as a normal Keras model with a DP optimizer and a custom training loop, but still couldn't make it work.
Is there a workaround for this hub layer?
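One direction that might work (my sketch, not from the original report): run the string-processing featurizer inside the tf.data pipeline instead of inside the DP model, so the DP-trained model only ever sees dense float tensors and the string ops (StringSplit, LookupTableFindV2, etc.) never enter the per-example-gradient/XLA graph. To keep the sketch runnable without downloads, the hub layer is replaced by a plain-Keras stand-in (TextVectorization + Embedding + pooling) and the DP head by an ordinary Sequential; in the real setup the featurizer would be the frozen hub.KerasLayer and the trainable head a DPSequential.

```python
import numpy as np
import tensorflow as tf

# Toy data standing in for the sentiment-classification inputs.
texts = np.array(["good movie", "bad movie", "great film", "awful film"])
labels = np.array([1, 0, 1, 0], dtype=np.float32)

# Stand-in featurizer for the frozen hub layer:
# vectorize strings, embed tokens, mean-pool to a fixed-size vector.
vectorizer = tf.keras.layers.TextVectorization(
    output_mode="int", output_sequence_length=4)
vectorizer.adapt(texts)
featurizer = tf.keras.Sequential([
    vectorizer,
    tf.keras.layers.Embedding(
        input_dim=vectorizer.vocabulary_size(), output_dim=8),
    tf.keras.layers.GlobalAveragePooling1D(),
])

# Featurize in the data pipeline: string lookups stay out of the
# training graph, which only ever receives dense float32 batches.
raw_ds = tf.data.Dataset.from_tensor_slices((texts, labels)).batch(2)
feat_ds = raw_ds.map(lambda t, y: (featurizer(t), y))

# Trainable head -- in the real setup this would be the DPSequential
# (only dense layers, which XLA can compile and per-example-clip).
head = tf.keras.Sequential([tf.keras.layers.Dense(1, activation="sigmoid")])
head.compile(optimizer="sgd", loss="binary_crossentropy")
head.fit(feat_ds, epochs=1, verbose=0)
```

The trade-off is that the featurizer is no longer trained (or DP-protected) end to end, which matches the common pattern of a frozen hub embedding anyway. Separately, if make_dp_model_class in TF Privacy 0.8.0 exposes its use_xla flag through DPSequential, passing use_xla=False might sidestep the XLA compile path that both tracebacks fail in; that is an assumption worth checking against the installed source.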
Platform: Linux
TensorFlow version: 2.8.1
TensorFlow Privacy version: 0.8.0
My code: