danielenricocahall / elephas

Distributed Deep learning with Keras & Spark

Support for Python 3.11 #25

Closed: danielenricocahall closed this issue 11 months ago

danielenricocahall commented 11 months ago

Updating to support Python 3.11 also requires supporting later versions of TensorFlow (> 2.10). When TensorFlow is updated, however, multiple unit tests fail with the following error:

```
self = <pyspark.cloudpickle.cloudpickle_fast.CloudPickler object at 0x7fe7fc26a1c0>
obj = (<function RDD.mapPartitions.<locals>.func at 0x7fe80bb3a790>, None, BatchedSerializer(CloudPickleSerializer(), 10), AutoBatchedSerializer(CloudPickleSerializer()))

    def dump(self, obj):
        try:
>           return Pickler.dump(self, obj)
E           TypeError: cannot pickle 'weakref' object
```

Upon reviewing the TensorFlow releases, one highlight of the 2.11.0 release notes is that `tf.keras.optimizers.Optimizer` now points to a new base class. When the optimizer import is changed to `tf.keras.optimizers.legacy`, the tests pass. This suggests that something in the new Optimizer implementation is not serializable.
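To isolate the problem from Spark, a quick check is to try pickling both optimizer variants directly. This is a minimal sketch assuming TensorFlow >= 2.11 is installed; the exact exception raised may differ slightly from the Spark traceback above:

```python
# Sketch: attempt to reproduce the pickling failure outside Spark.
# Assumes TensorFlow >= 2.11, where tf.keras.optimizers.Adam is the
# new-style optimizer and tf.keras.optimizers.legacy.Adam is the old one.
import pickle

import tensorflow as tf

candidates = {
    "new": tf.keras.optimizers.Adam(),
    "legacy": tf.keras.optimizers.legacy.Adam(),
}

for name, opt in candidates.items():
    try:
        pickle.dumps(opt)
        print(f"{name}: pickled OK")
    except Exception as exc:  # the exact exception type may vary by TF version
        print(f"{name}: failed to pickle ({exc})")
```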

Per the notes, "The old Keras optimizer will never be deleted, but will not see any new feature additions," so we can continue using the legacy optimizer indefinitely. As a long-term strategy, though, I would like to see if there is a workaround that enables developers to use the new optimizer; I have created an Issue to track this here. For developers using a version of TensorFlow earlier than 2.11, importing from the optimizer package (`tf.keras.optimizers`) will still function as expected.
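In the meantime, one way to keep a single import site working across both version ranges is to gate the optimizer namespace on the installed TensorFlow version. This is only a sketch, not necessarily how elephas selects optimizers internally:

```python
# Sketch of a version-gated optimizer import; the module-level name
# `optimizers` used here is hypothetical, not part of elephas.
import tensorflow as tf

# Compare only the (major, minor) components of the version string.
_tf_version = tuple(int(part) for part in tf.__version__.split(".")[:2])

if _tf_version >= (2, 11):
    # New-style optimizers trip the cloudpickle 'weakref' error above,
    # so fall back to the legacy namespace that Keras still ships.
    optimizers = tf.keras.optimizers.legacy
else:
    # Before 2.11 the default namespace already holds the old optimizers.
    optimizers = tf.keras.optimizers

opt = optimizers.Adam(learning_rate=0.001)
```

A try/except around the `legacy` import would also work, but an explicit version check makes the 2.11 cutoff from the release notes visible at the point of use.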