juntang-zhuang / Adabelief-Optimizer

Repository for NeurIPS 2020 Spotlight "AdaBelief Optimizer: Adapting stepsizes by the belief in observed gradients"
BSD 2-Clause "Simplified" License

Tensorflow implementation doesn't work #2

Closed (ben-arnao closed this issue 4 years ago)

ben-arnao commented 4 years ago

TF 2.3

from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
import numpy as np
from adabelief_tf import AdaBeliefOptimizer

x = np.random.random_sample((5,))
y = np.random.random_sample((5,))

model = Sequential()
model.add(Dense(1))
model.compile(loss='mse',
              optimizer=AdaBeliefOptimizer())

model.fit(x, y)
Traceback (most recent call last):
  File "C:/Users/Ben/PycharmProjects/tradingbot/trainer/test.py", line 14, in <module>
    model.fit(x, y)
  File "C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\keras\engine\training.py", line 108, in _method_wrapper
    return method(self, *args, **kwargs)
  File "C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1098, in fit
    tmp_logs = train_function(iterator)
  File "C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\eager\def_function.py", line 780, in __call__
    result = self._call(*args, **kwds)
  File "C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\eager\def_function.py", line 823, in _call
    self._initialize(args, kwds, add_initializers_to=initializers)
  File "C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\eager\def_function.py", line 696, in _initialize
    self._stateful_fn._get_concrete_function_internal_garbage_collected(  # pylint: disable=protected-access
  File "C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\eager\function.py", line 2855, in _get_concrete_function_internal_garbage_collected
    graph_function, _, _ = self._maybe_define_function(args, kwargs)
  File "C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\eager\function.py", line 3213, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\eager\function.py", line 3065, in _create_graph_function
    func_graph_module.func_graph_from_py_func(
  File "C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\framework\func_graph.py", line 986, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\eager\def_function.py", line 600, in wrapped_fn
    return weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\framework\func_graph.py", line 973, in wrapper
    raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:

    C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\keras\engine\training.py:806 train_function  *
        return step_function(self, iterator)
    C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\keras\engine\training.py:796 step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:1211 run
        return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
    C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:2585 call_for_each_replica
        return self._call_for_each_replica(fn, args, kwargs)
    C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:2945 _call_for_each_replica
        return fn(*args, **kwargs)
    C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\keras\engine\training.py:789 run_step  **
        outputs = model.train_step(data)
    C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\keras\engine\training.py:756 train_step
        _minimize(self.distribute_strategy, tape, self.optimizer, loss,
    C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\keras\engine\training.py:2747 _minimize
        optimizer.apply_gradients(zip(gradients, trainable_variables))
    C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\keras\optimizers.py:775 apply_gradients
        self.optimizer.apply_gradients(grads, global_step=self.iterations)
    C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\training\optimizer.py:616 apply_gradients
        update_ops.append(processor.update_op(self, grad))
    C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\training\optimizer.py:171 update_op
        update_op = optimizer._resource_apply_dense(g, self._v)
    C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\adabelief_tf\AdaBelief_tf.py:187 _resource_apply_dense
        beta1_power = math_ops.cast(self._get_non_slot_variable("beta1_power", graph=graph), grad.dtype.base_dtype)
    C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\util\dispatch.py:201 wrapper
        return target(*args, **kwargs)
    C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\ops\math_ops.py:920 cast
        x = ops.convert_to_tensor(x, name="x")
    C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\framework\ops.py:1499 convert_to_tensor
        ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
    C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\framework\constant_op.py:338 _constant_tensor_conversion_function
        return constant(v, dtype=dtype, name=name)
    C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\framework\constant_op.py:263 constant
        return _constant_impl(value, dtype, shape, name, verify_shape=False,
    C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\framework\constant_op.py:280 _constant_impl
        tensor_util.make_tensor_proto(
    C:\Users\Ben\PycharmProjects\tradingbot\venv\lib\site-packages\tensorflow\python\framework\tensor_util.py:444 make_tensor_proto
        raise ValueError("None values not supported.")

    ValueError: None values not supported.

juntang-zhuang commented 4 years ago

Please write the model in TensorFlow rather than Keras, as in the example in the Jupyter notebook. I currently don't have a Keras-compatible version. Frankly, I'm confused by the way Keras implements Adam and AdamW, so I don't have a good TensorFlow implementation.

Also note that decoupled weight decay is not available in adabelief-tf. Apologies for this, but I recommend using adabelief-pytorch; as stated, it's the only version I have extensively tested for the paper.

ben-arnao commented 4 years ago

Ok gotcha, thanks for the response

juntang-zhuang commented 4 years ago

@ben-arnao Hi, we have released "adabelief-tf==0.1.0", which is available on pip. It supports TensorFlow>=2.0 and Keras, and it supports decoupled weight decay and rectification, as in the PyTorch implementation.
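
For readers unsure what "decoupled weight decay" means here, below is a minimal pure-Python sketch of one AdaBelief update on a single scalar parameter. It follows the update rule as I understand it from the paper and is not the adabelief-tf source. The names `adabelief_step` and `minimize_quadratic` are hypothetical, and rectification is left out.

```python
import math


def adabelief_step(theta, grad, m, s, t, lr=1e-3, beta1=0.9, beta2=0.999,
                   eps=1e-14, weight_decay=0.0):
    """One AdaBelief update for a scalar parameter (illustrative sketch only).

    theta: current parameter value, grad: gradient at theta,
    m, s: running moments from the previous step, t: step count starting at 1.
    """
    # Decoupled weight decay: shrink the parameter directly instead of
    # adding weight_decay * theta to the gradient.
    theta = theta * (1.0 - lr * weight_decay)

    # Exponential moving average of the gradient.
    m = beta1 * m + (1.0 - beta1) * grad
    # Moving average of the squared deviation of the gradient from its
    # average (Adam would use grad ** 2 here).
    s = beta2 * s + (1.0 - beta2) * (grad - m) ** 2 + eps

    # Bias correction, as in Adam.
    m_hat = m / (1.0 - beta1 ** t)
    s_hat = s / (1.0 - beta2 ** t)

    theta = theta - lr * m_hat / (math.sqrt(s_hat) + eps)
    return theta, m, s


def minimize_quadratic(steps=200, lr=0.05):
    """Hypothetical usage: run the step on f(x) = (x - 3)^2 starting from 0."""
    theta, m, s = 0.0, 0.0, 0.0
    for t in range(1, steps + 1):
        grad = 2.0 * (theta - 3.0)
        theta, m, s = adabelief_step(theta, grad, m, s, t, lr=lr)
    return theta
```

With `weight_decay` set above zero, the parameter shrinks by a factor of `1 - lr * weight_decay` on every step, independent of the gradient statistics. That independence is what distinguishes decoupled decay from an L2 penalty added to the loss.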