Closed by imoneoi 8 months ago
Hi @imoneoi, thanks for running these benchmarks on an RTX!
Indeed, there is a slowdown compared to v1. Here are a few notable reasons:
This is still a WIP! We have our eye on speeding things up for both the generalized and positional backends, and we're open to contributions that bring notable speed-ups.
Thanks! I also noticed that v2 pbd may be more accurate than v1.
BTW, are there any recommended tools for profiling and locating bottlenecks in Brax?
I am not sure about Brax specifically, but a good starting point would be the "Profiling JAX Programs" page in the JAX documentation. If you are using Colab, see google/jax#3694 as well.
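To make that suggestion a bit more concrete, here is a minimal sketch of timing and tracing a jitted function with plain JAX. It is not Brax-specific: `toy_step` is a hypothetical stand-in for an environment step, and `time_jitted` and `trace_to` are helper names made up for this example. The idea is to separate compile time from steady-state time first, then capture a trace of only the steady-state call.

```python
import time

import jax
import jax.numpy as jnp


def toy_step(x):
    # Hypothetical stand-in for a physics step; replace with your own function.
    return jnp.tanh(x @ x.T).sum()


def time_jitted(fn, *args, repeats=10):
    """Return (first_call_seconds, steady_state_seconds_per_call)."""
    jitted = jax.jit(fn)

    # The first call includes tracing and XLA compilation.
    start = time.perf_counter()
    jax.block_until_ready(jitted(*args))
    first_call = time.perf_counter() - start

    # Later calls reuse the compiled executable. JAX dispatches work
    # asynchronously, so block before stopping the clock.
    start = time.perf_counter()
    for _ in range(repeats):
        out = jitted(*args)
    jax.block_until_ready(out)
    steady = (time.perf_counter() - start) / repeats
    return first_call, steady


def trace_to(log_dir, fn, *args):
    """Write a profiler trace of one already-compiled call to log_dir."""
    jitted = jax.jit(fn)
    jax.block_until_ready(jitted(*args))  # warm up so compilation is excluded
    with jax.profiler.trace(log_dir):
        jax.block_until_ready(jitted(*args))
```

For example, `time_jitted(toy_step, jnp.ones((256, 256)))` returns the two timings, and `trace_to("/tmp/jax-trace", toy_step, jnp.ones((256, 256)))` writes a trace that can be opened in TensorBoard's profiler plugin. Timing the v1 and v2 step functions the same way should at least show whether the gap is in compilation or in the steady-state step.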
Why is the positional backend (pbd) in Brax v2 about 3x slower than v1 on humanoid? I noticed that the substep count and dt may differ between the two versions, but v1 runs a total of 533 pbd steps and v2 runs a total of 666. That is only about 1.25x more steps, so the step count alone should not explain a 3x difference.
Here are the benchmark results:
The benchmark code is as follows: