Open wehlutyk opened 6 years ago
Profiling tensorflow is complicated. Here are the links I have so far:
So extracting the info from profiler-ui gives the following timing logs:
In both cases most of the time is spent in Adam. I saw no obvious bottleneck, but the time spent in the Adam part seems to grow with the complexity of the model.
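To make the "most of the time is in Adam" reading easier to check, here is a minimal sketch that sums per-op timings by top-level name scope. It assumes the timing logs can be reduced to `(op_name, microseconds)` pairs, which is an assumption about the export format rather than something profiler-ui is known to emit. The function `time_by_scope` and the sample records are hypothetical and only illustrate the input shape.

```python
from collections import defaultdict


def time_by_scope(records, depth=1):
    """Sum durations by the first `depth` components of each op name.

    `records` is an iterable of (op_name, microseconds) pairs. This input
    shape is an assumption, not a documented profiler output format.
    """
    totals = defaultdict(float)
    for op_name, micros in records:
        # "Adam/update_dense/ApplyAdam" -> "Adam" when depth == 1
        scope = "/".join(op_name.split("/")[:depth])
        totals[scope] += micros
    # Largest consumers first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical records: names and durations are invented for illustration.
records = [
    ("Adam/update_dense/ApplyAdam", 400.0),
    ("Adam/update_embedding/ApplyAdam", 350.0),
    ("gradients/dense/MatMul_grad", 150.0),
    ("dense/MatMul", 100.0),
]

for scope, micros in time_by_scope(records):
    print(f"{scope:<12} {micros:10.1f} us")
```

Running this on the logs of both models with `depth=2` would show which per-variable update inside the optimizer scope grows as the model gets more complex.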
Option 1: I need something like snakeviz to go through this (it's unwieldy otherwise), so I could do a quick prototype in Elm.
Option 2: since we don't use minibatching, disable it as much as possible and get performance gains from less general code.
Option 3: put this on hold since we won't use it for an immediate paper, and move on to feature-network dependencies.
Also, profile the full batch on CBP to see why we're only utilizing ~60% of the GPU.