tensorflow / recommenders-addons

Additional utils and helpers to extend TensorFlow when building recommendation systems, contributed and maintained by SIG Recommenders.
Apache License 2.0
596 stars 137 forks

Demo (when specifying the device in SquashedEmbedding) not working on Apple M1: Optimizer type is not supported! got <class 'keras.src.optimizers.adam.Adam'> #482

Open Ross-Fan opened 4 days ago

Ross-Fan commented 4 days ago

System information

The demo is https://github.com/tensorflow/recommenders-addons/tree/master/demo/dynamic_embedding/movielens-1m-keras, with a small edit to specify the device.
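The System information section above does not list versions. A minimal sketch, using only the Python standard library, for collecting the details usually needed to triage a report like this one. The package list is an assumption about which distributions matter on an Apple Silicon setup, and `collect_system_info` is a hypothetical helper name.

```python
import platform
import sys
from importlib import metadata

# Assumed list of relevant distributions; edit to match your environment.
PACKAGES = (
    "tensorflow",
    "tensorflow-macos",
    "tensorflow-metal",
    "tensorflow-recommenders-addons",
    "keras",
)


def collect_system_info(packages=PACKAGES):
    """Return a dict with OS, CPU architecture, Python and package versions."""
    info = {
        "os": platform.platform(),
        "machine": platform.machine(),
        "python": sys.version.split()[0],
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in collect_system_info().items():
        print(f"{key}: {value}")
```

Run it inside the same environment as the failing script and paste the output into the report.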

Describe the bug

raise Exception(f"Optimizer type is not supported! got {str(type(self))}")
Exception: Optimizer type is not supported! got <class 'keras.src.optimizers.adam.Adam'>

A clear and concise description of what the bug is. It seems the optimizer is not working: de.DynamicEmbeddingOptimizer rejects tf.keras.optimizers.Adam.
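The message format matches the `raise` statement quoted in the bug description. The sketch below is not the TFRA source. It uses hypothetical stand-in classes to illustrate how an `isinstance`-based check of that general shape rejects an object whose class belongs to a different hierarchy, even when the short class name looks familiar. Whether TFRA's actual check behaves this way for this Keras version is an assumption.

```python
class LegacyOptimizer:
    """Hypothetical stand-in for an optimizer base class a wrapper accepts."""


class LegacyAdam(LegacyOptimizer):
    """Hypothetical optimizer from the accepted hierarchy."""


class NewAdam:
    """Hypothetical optimizer from an unrelated hierarchy."""


# Assumption: illustrative only, not the list TFRA actually uses.
SUPPORTED_BASES = (LegacyOptimizer,)


def wrap_optimizer(opt):
    """Return opt unchanged if its type is supported, otherwise raise."""
    if isinstance(opt, SUPPORTED_BASES):
        return opt
    # Same message format as the exception quoted in this issue.
    raise Exception(f"Optimizer type is not supported! got {str(type(opt))}")


wrap_optimizer(LegacyAdam())  # accepted: subclass of a supported base

try:
    wrap_optimizer(NewAdam())  # rejected: unrelated class hierarchy
except Exception as err:
    print(err)
```

If the real check follows this pattern, the fix would be either to pass an optimizer class from a hierarchy the wrapper recognizes or to extend the wrapper's list of supported bases.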

Code to reproduce the issue

See the attached file tfra_mlen_1m.py.zip. I set devices=['/cpu:0'] on both embedding layers; otherwise an error occurs.

    self.user_embedding = de.keras.layers.SquashedEmbedding(
        user_embedding_size,
        initializer=embedding_initializer,
        name='user_embedding',
        devices=['/cpu:0'],)
    self.movie_embedding = de.keras.layers.SquashedEmbedding(
        movie_embedding_size,
        initializer=embedding_initializer,
        name='movie_embedding',
        devices=['/cpu:0'],)

Provide a reproducible test case that is the bare minimum necessary to generate the problem.

Other info / logs `

python3 tfra_mlen_1m.py --mode=train --epochs=1 --steps_per_epoch=200
WARNING:tensorflow:dynamic_embedding.GraphKeys has already been deprecated. The Variable will not be added to collections because it does not actully own any value, but only a holder of tables, which may lead to import_meta_graph failed since non-valued object has been added to collection. If you need to use tf.compat.v1.train.Saver and access all Variables from collection, you could manually add it to the collection by tf.compat.v1.add_to_collections(names, var) instead.
_SingleDeviceSaver removed after tf version 2.15
WARNING:tensorflow:An exception occurred when import horovod.tensorflow: No module named 'horovod'
WARNING:tensorflow:An exception occurred when import horovod.tensorflow: No module named 'horovod'
I1125 11:52:39.532697 8455666368 dataset_info.py:599] Load dataset info from /Users/fanwei/tensorflow_datasets/movielens/1m-ratings/0.1.1
I1125 11:52:39.540530 8455666368 dataset_builder.py:573] Reusing dataset movielens (/Users/fanwei/tensorflow_datasets/movielens/1m-ratings/0.1.1)
2024-11-25 11:52:39.545550: I metal_plugin/src/device/metal_device.cc:1154] Metal device set to: Apple M1
2024-11-25 11:52:39.545570: I metal_plugin/src/device/metal_device.cc:296] systemMemory: 16.00 GB
2024-11-25 11:52:39.545576: I metal_plugin/src/device/metal_device.cc:313] maxCacheSize: 5.33 GB
2024-11-25 11:52:39.545610: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2024-11-25 11:52:39.545622: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: )
I1125 11:52:39.617144 8455666368 logging_logger.py:49] Constructing tf.data.Dataset movielens for split train, from /Users/fanwei/tensorflow_datasets/movielens/1m-ratings/0.1.1
2024-11-25 11:52:39.760150: I ./tensorflow_recommenders_addons/dynamic_embedding/core/kernels/lookup_impl/lookup_table_op_cpu.h:157] HashTable on CPU is created on optimized mode: K=x, V=f, DIM=32, init_size=8192
2024-11-25 11:52:39.771784: I ./tensorflow_recommenders_addons/dynamic_embedding/core/kernels/lookup_impl/lookup_table_op_cpu.h:157] HashTable on CPU is created on optimized mode: K=x, V=f, DIM=32, init_size=8192
Traceback (most recent call last):
  File "/Users/fanwei/work/moviebox/panlong_jobs/排序/test/tfra_mlen_1m.py", line 207, in <module>
    app.run(main)
  File "/Users/fanwei/miniforge3/envs/factormachine/lib/python3.9/site-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/Users/fanwei/miniforge3/envs/factormachine/lib/python3.9/site-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "/Users/fanwei/work/moviebox/panlong_jobs/排序/test/tfra_mlen_1m.py", line 197, in main
    train()
  File "/Users/fanwei/work/moviebox/panlong_jobs/排序/test/tfra_mlen_1m.py", line 125, in train
    optimizer = de.DynamicEmbeddingOptimizer(tf.keras.optimizers.Adam(learning_rate=1E-3))
  File "/Users/fanwei/miniforge3/envs/factormachine/lib/python3.9/site-packages/tensorflow_recommenders_addons/dynamic_embedding/python/ops/dynamic_embedding_optimizer.py", line 859, in DynamicEmbeddingOptimizer
    raise Exception(f"Optimizer type is not supported! got {str(type(self))}")
Exception: Optimizer type is not supported! got <class 'keras.src.optimizers.adam.Adam'>

`
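The final line of the traceback reports only the concrete class of the optimizer. To see which base classes the object actually inherits from, its full method resolution order can be listed. A minimal sketch using only built-in introspection; `qualified_mro` is a hypothetical helper name, and the demonstration uses a built-in object so the block runs without TensorFlow installed.

```python
def qualified_mro(obj):
    """Return the module-qualified name of every class in obj's MRO."""
    return [f"{cls.__module__}.{cls.__qualname__}" for cls in type(obj).__mro__]


# Demonstration on a built-in object. In the reported setup one would pass
# the optimizer instance instead, e.g. tf.keras.optimizers.Adam(learning_rate=1E-3).
for name in qualified_mro(OSError()):
    print(name)
```

Comparing this list for the failing optimizer against the types the wrapper accepts shows whether the mismatch is in the class hierarchy or elsewhere.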

Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.