Closed drasmuss closed 1 week ago
Was able to reproduce the issue with TF v2.4.0rc0 and TF-nightly. However, code works fine with TF v2.3. Please find the attached gist. Thanks!
@drasmuss thanks for filing the issue. Is there a reason that you disable eager execution?
I have found that using the old training_v1 Keras implementation (which is what you get when disabling eager execution) is still faster for some models.
Hi @drasmuss, can you share which models you've found it to be faster for? We want to make sure to fix any performance regressions like that.
You can see an example here https://github.com/nengo/nengo-dl/blob/master/nengo_dl/tests/test_benchmarks.py#L238, specifically comparing the two test cases
(benchmarks.lmu(1000, 1, native_nengo=True), True, 100, True, 1.3, 1.5),
(benchmarks.lmu(1000, 1, native_nengo=True), True, 100, False, 1.05, 1.25),
The second case runs the same benchmark but with
tf.compat.v1.disable_eager_execution()
tf.compat.v1.disable_control_flow_v2()
and it runs about 0.25s (20%) faster.
Here is a Colab gist demonstrating the same thing if that helps https://colab.research.google.com/gist/drasmuss/45826df4e27dc6a21be961690d2a043f/performance-demo.ipynb
@drasmuss Is this still an issue? I ran your code (Colab without GPU) with tf-nightly and I see nearly identical execution times, as shown below. Please check the gist here. Thanks!
Execution time: 14.798911161999968
Eager 14.798911161999968
Execution time: 14.530144375999953
Non-eager 14.530144375999953
I think we need to test it on the GPU to see whether this is still an issue; it's hard to know whether the CPU results are indicative of GPU performance.
Had a chance to test this, and it looks like the problem is the same (possibly worse) in tf-nightly. Here are my results (note I couldn't get tf-nightly to run with GPU support on Colab, so this is on a local RTX 3090):
tf-nightly
Eager: 0.8343444939237088
Non-eager: 0.6216731490567327
(~33% slowdown in eager mode)
Also note that tf-nightly seems to be significantly slower than older versions of tensorflow, e.g.
tensorflow 2.2.2
Eager: 0.7077119939494878
Non-eager: 0.5124576301313937
(~20% slowdown in tf-nightly vs tf 2.2; possibly related to https://github.com/tensorflow/tensorflow/issues/46515)
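The percentages quoted above can be rechecked from the raw timings. The following is a small sketch and not part of the original comments: the helper name `slowdown_pct` and the dictionary layout are illustrative choices, and the numbers are copied verbatim from the results posted in this thread.

```python
def slowdown_pct(slow: float, fast: float) -> float:
    """Return how much longer `slow` takes than `fast`, in percent."""
    return (slow / fast - 1.0) * 100.0


# Timings in seconds, copied from the comments in this thread.
# The dictionary keys are labels chosen for this sketch.
timings = {
    "tf-nightly (CPU, Colab)": {"eager": 14.798911161999968, "non_eager": 14.530144375999953},
    "tf-nightly (GPU)": {"eager": 0.8343444939237088, "non_eager": 0.6216731490567327},
    "tensorflow 2.2.2": {"eager": 0.7077119939494878, "non_eager": 0.5124576301313937},
}

# Eager vs. non-eager within each configuration.
for label, t in timings.items():
    pct = slowdown_pct(t["eager"], t["non_eager"])
    print(f"{label}: eager is {pct:.1f}% slower than non-eager")

# tf-nightly vs. tensorflow 2.2.2, separately for each execution mode.
for mode in ("eager", "non_eager"):
    pct = slowdown_pct(timings["tf-nightly (GPU)"][mode], timings["tensorflow 2.2.2"][mode])
    print(f"{mode}: tf-nightly is {pct:.1f}% slower than tensorflow 2.2.2")
```

Computed this way, the GPU numbers land in the same range as the ~33% and ~20% figures quoted above, while the CPU-only Colab run shows a gap of only a couple of percent, which is consistent with the point that CPU timings may not reflect GPU behavior.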
Same issue for me with Keras 2.3.1:
from gym import Env
from gym.spaces import Discrete, Box


class FooEnv(Env):
    metadata = {'render.modes': ['human']}

    def __init__(self, training=True):
        self.training = training
        self.action_space = Discrete(5)
        self.size = 11
        self.observation_space = Box(low=np.array([0, 0]), high=np.array([self.size - 1, 1]))
        self.position = -1
        self.brightness = 0.0
        self.state = [self.position, self.brightness]
        self.action_steps = 30
        self.action = -1
        self.done = False
        self.reward = 0
        self.info = {}
        self.seed(seed=45)

    def get_position(self):
        noise_factor = 0
        position = 1 - (self.position / 5) * (self.brightness) + noise_factor
        return position

    def step(self, action, ext_position=-10):
        action -= 2
        self.action = action
        self.brightness += int(action) / 10
        if self.brightness > 1:
            self.brightness = 1
        if self.brightness < 0:
            self.brightness = 0
        if ext_position == -10:
            self.position += self.get_position()
        else:
            self.position = ext_position
        if self.position < 0:
            self.position = 0
        if self.position > self.size - 1:
            self.position = self.size - 1
        self.state = [self.position, self.brightness]
        if 10 > self.position >= 0:
            self.reward = 1 - abs(self.position / 10 - self.brightness)
        elif self.position >= 10:
            self.reward = -100
        self.action_steps -= 1
        self.done = self.check()
        self.info = {}
        return self.state, self.reward, self.done, self.info

    def check(self):
        done = self.done
        if self.action_steps == 0:
            done = True
        return done

    def reset(self):
        self.position = -1
        self.brightness = 0.0
        self.state = [self.position, self.brightness]
        self.action_steps = 30
        self.reward = 0
        self.action = -0
        self.done = False
        self.info = {}
        return self.state

    def render(self, mode='human', close=False):
        for i in range(self.size):
            if self.position >= i > self.position - 1:
                print("+", end='')
            else:
                print("-", end='')
        if self.done:
            print("X| Pos:" + str(self.position) + " Brightness:" + str(self.brightness) + " Done:" + str(self.done) +
                  " Reward:" + str(self.reward) + " Steps:" + str(self.action_steps) + " Action:" + str(self.action))
        else:
            print("O| Pos:" + str(self.position) + " Brightness:" + str(self.brightness) + " Done:" + str(self.done) +
                  " Reward:" + str(self.reward) + " Steps:" + str(self.action_steps) + " Action:" + str(self.action))

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.optimizers import Adam

env = FooEnv()
env.seed(0)
states = env.observation_space.shape
actions = env.action_space.n


def build_model(states, actions):
    model = Sequential()
    model.add(Flatten(input_shape=(1,) + states))
    model.add(Dense(24, activation='relu'))
    model.add(Dense(24, activation='relu'))
    model.add(Dense(actions, activation='linear'))
    return model

from rl.agents import DQNAgent
from keras.callbacks import TensorBoard
from rl.callbacks import ModelIntervalCheckpoint, FileLogger
from rl.policy import LinearAnnealedPolicy, EpsGreedyQPolicy
from rl.memory import SequentialMemory

model = build_model(states, actions)
model.summary()


def build_agent(model, actions):
    policy = LinearAnnealedPolicy(EpsGreedyQPolicy(), attr='eps', value_max=1, value_min=0.1, value_test=0.05,
                                  nb_steps=500)
    memory = SequentialMemory(limit=10000, window_length=1)
    dqn = DQNAgent(model=model, memory=memory, policy=policy, enable_double_dqn=True,
                   nb_actions=actions, gamma=.98, nb_steps_warmup=100, target_model_update=1e-2)
    return dqn


callbacks = [TensorBoard(log_dir='./weights_test')]
dqn = build_agent(model, actions)
dqn.compile(Adam(lr=1e-3), metrics=['mae'])
dqn.fit(env, nb_steps=10000, log_interval=1000, nb_max_episode_steps=50, visualize=False, verbose=1,
        callbacks=callbacks)
Was able to reproduce the issue in TF v2.5; please find the gist here. Thanks!
I also observed that the following alternative name of the API (tf.keras.callbacks.TensorBoard) has the same behavior: an "object has no attribute" exception is raised while eager execution is disabled.
tf.compat.v1.keras.callbacks.TensorBoard
This behavior still exists in TensorFlow nightly (2.15.0-dev20230907), so users should be cautious when using either name on both CPU and GPU.
Hi,
Thank you for opening this issue. Since this issue has been open for a long time, the code/debug information for this issue may not be relevant to the current state of the code base.
The TensorFlow team is constantly improving the framework by fixing bugs and adding new features. We suggest you try the latest TensorFlow version with the latest compatible hardware configuration, which could potentially resolve the issue. If you are still facing the issue, please create a new GitHub issue with your latest findings and all the debugging information that could help us investigate.
Please follow the release notes to stay up to date with the latest developments in the TensorFlow space.
This issue is stale because it has been open for 7 days with no activity. It will be closed if no further activity occurs. Thank you.
This issue was closed because it has been inactive for 7 days since being marked as stale. Please reopen if you'd like to work on this further.
System information
Describe the current behavior
Attempting to use tf.keras.callbacks.TensorBoard with eager execution disabled results in an error in TF 2.4.
Describe the expected behavior
There should be no error; the callback should work as normal.
Standalone code to reproduce the issue
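The original reproduction script is not preserved in this text. As a stand-in, here is a minimal sketch of the kind of script that exercises the reported path: it disables eager execution, fits a tiny model with the TensorBoard callback, and reports whether anything raised. The function name `run_repro`, the model shape, and the returned status strings are all assumptions made for illustration, and the outcome will depend on the installed TensorFlow version.

```python
import tempfile

try:
    import numpy as np
    import tensorflow as tf
except ImportError:  # lets the sketch run where TensorFlow is not installed
    tf = None


def run_repro() -> str:
    """Fit a tiny model with the TensorBoard callback in non-eager mode.

    Returns "skipped" if TensorFlow is unavailable, "ok" if fit() completes,
    or "error: <ExceptionType>" if any step raises.
    """
    if tf is None:
        return "skipped"
    try:
        # Should be called at program startup, before other TensorFlow work.
        tf.compat.v1.disable_eager_execution()

        # Hypothetical toy model and data; any small Keras model would do.
        model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
        model.compile(optimizer="sgd", loss="mse")
        x = np.zeros((8, 4), dtype="float32")
        y = np.zeros((8, 1), dtype="float32")

        with tempfile.TemporaryDirectory() as log_dir:
            callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir)
            model.fit(x, y, epochs=1, verbose=0, callbacks=[callback])
        return "ok"
    except Exception as exc:  # report the failure type instead of crashing
        return f"error: {type(exc).__name__}"


result = run_repro()
print(result)
```

On a version affected by this issue, the sketch would be expected to print an `error: ...` line rather than `ok`; removing the `disable_eager_execution()` call gives a quick way to compare against the eager path in a separate run.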