LuisFMCuriel opened this issue 10 months ago
Hello @LuisFMCuriel! Could you please have a look at this gist where I tried to replicate the error reported? Please confirm the result. Thank you!
Hi @sushreebarsa! Thanks for the quick reply. Yes, I confirm the result. Running the example for each framework (e.g., for 60 episodes), the TensorFlow implementation has an elapsed time of 00:00:37, whereas the PyTorch implementation has an elapsed time of only 00:00:09, while achieving roughly the same reward (the exact numbers vary since no seed is set, but the behaviour is the same on every run). Is this a problem with my implementation of the network's gradient optimization? The program reports the optimization time, which looks approximately the same for both frameworks, so I am not sure where the timing bottleneck is.
Thanks again for your time!!
Issue type
Performance
Have you reproduced the bug with TensorFlow Nightly?
Yes
Source
source
TensorFlow version
tf2.13
Custom code
Yes
OS platform and distribution
Ubuntu 22.04.2 LTS
Mobile device
No response
Python version
3.10.12
Bazel version
No response
GCC/compiler version
No response
CUDA/cuDNN version
11.8
GPU model and memory
T4 GPU
Current behavior?
I have encountered a performance difference between my PyTorch and TensorFlow implementations of the Double Deep Q-Network (DDQN) algorithm in a Gym environment. Both implementations share identical DDQN architectures, exploration routines, and training flows. However, the PyTorch implementation consistently converges faster, requiring fewer episodes to solve the environment.
Details:
Network and Training Flow: Both PyTorch and TensorFlow implementations employ the same dense neural network architecture (two hidden layers of [512, 128]) and training procedures. The timing for weight optimization is nearly identical between the two.
Exploration: Identical exploration strategies are employed in both implementations, ensuring consistent agent behavior; both use epsilon-greedy exploration (a sketch of the shared architecture and exploration routine follows this list).
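As an illustration, here is a minimal sketch of the shared setup in Keras; `obs_dim`, `n_actions`, and `epsilon` are hypothetical placeholders standing in for the actual values in the gist:

```python
import numpy as np
import tensorflow as tf

# Hypothetical environment dimensions; the real values come from the Gym env.
obs_dim, n_actions = 8, 4

def build_q_net():
    # Dense Q-network with two hidden layers of 512 and 128 units,
    # mirrored layer-for-layer in the PyTorch implementation.
    return tf.keras.Sequential([
        tf.keras.layers.Dense(512, activation="relu", input_shape=(obs_dim,)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(n_actions),
    ])

def epsilon_greedy(q_net, state, epsilon):
    """Random action with probability epsilon, otherwise the greedy action."""
    if np.random.rand() < epsilon:
        return np.random.randint(n_actions)
    q_values = q_net(state[None, :], training=False)
    return int(tf.argmax(q_values[0]))
```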
The optimization code for each framework is sketched below. TensorFlow:
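A minimal sketch of a typical `tf.function`-compiled DDQN update step; `q_net`, `target_net`, `optimizer`, `gamma`, and the batch tensors are placeholder names rather than the exact code from the gist:

```python
import tensorflow as tf

gamma = 0.99  # hypothetical discount factor

@tf.function  # compiling the step avoids per-call Python/graph-tracing overhead
def train_step(q_net, target_net, optimizer,
               states, actions, rewards, next_states, dones):
    # Double DQN target: the online net selects the next action,
    # the target net evaluates it.
    next_actions = tf.argmax(q_net(next_states), axis=1)
    next_q = tf.gather(target_net(next_states), next_actions,
                       axis=1, batch_dims=1)
    targets = rewards + gamma * next_q * (1.0 - dones)

    with tf.GradientTape() as tape:
        q = tf.gather(q_net(states), actions, axis=1, batch_dims=1)
        loss = tf.reduce_mean(tf.square(targets - q))

    grads = tape.gradient(loss, q_net.trainable_variables)
    optimizer.apply_gradients(zip(grads, q_net.trainable_variables))
    return loss
```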
PyTorch:
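The equivalent sketch in PyTorch, with the same placeholder names:

```python
import torch
import torch.nn.functional as F

gamma = 0.99  # hypothetical discount factor

def train_step(q_net, target_net, optimizer,
               states, actions, rewards, next_states, dones):
    # Double DQN target: the online net selects the next action,
    # the target net evaluates it.
    with torch.no_grad():
        next_actions = q_net(next_states).argmax(dim=1, keepdim=True)
        next_q = target_net(next_states).gather(1, next_actions).squeeze(1)
        targets = rewards + gamma * next_q * (1.0 - dones)

    q = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    loss = F.mse_loss(q, targets)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```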
I'm seeking guidance on potential factors that might explain this performance gap. Could variations in internal library optimizations, autograd systems, GPU utilization, or other factors play a role in this discrepancy? Any insights or suggestions for further investigation would be greatly appreciated.
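One way to narrow down where the time goes, assuming a training loop roughly like the one in the linked gist, is to time each phase separately and, on GPU, synchronize before reading the clock so asynchronous kernel launches are not misattributed:

```python
import time
from collections import defaultdict

timers = defaultdict(float)

def timed(name, fn, *args, sync=None, **kwargs):
    """Run fn and accumulate its wall-clock time under `name`."""
    if sync is not None:
        sync()  # e.g. torch.cuda.synchronize, so prior async GPU work is excluded
    t0 = time.perf_counter()
    out = fn(*args, **kwargs)
    if sync is not None:
        sync()
    timers[name] += time.perf_counter() - t0
    return out

# Hypothetical usage inside the episode loop:
#   action = timed("act", epsilon_greedy, q_net, state, epsilon)
#   next_state, reward, done, info = timed("env", env.step, action)
#   loss = timed("optimize", train_step, q_net, target_net, optimizer, *batch)
# Printing `timers` after a run shows whether the gap comes from action
# selection, environment stepping, replay sampling, or the optimizer itself.
```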
You can access an example by following this link: ![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)
Standalone code to reproduce the issue
Relevant log output
No response