Hi, I ran prnn on my P40 (30 SMs, 128 SPs per SM) using the same config as for the Titan X (sm>=24, major=5). With mini-batch=4, layer-size=1152, layers=1, timesteps=300, it only reaches 1.3 TFLOPS. As far as I know, the P40 is faster than the Titan X, so I expected it to reach at least the same 2.8 TFLOPS as the Titan X. Why doesn't it? Is my config wrong, or is there some other reason? Please help me find the mistake.
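For reference, here is a minimal sketch of how a throughput figure like this could be derived from a measured run time. It assumes that only the recurrent matrix multiply is counted, at 2 × layer-size² × mini-batch floating-point operations per timestep per layer. The function name `recurrent_tflops` and the `seconds` argument are hypothetical, and prnn's own benchmark may count operations differently.

```python
def recurrent_tflops(mini_batch, layer_size, layers, timesteps, seconds):
    """Estimate TFLOP/s, counting only the recurrent matrix multiplies.

    Assumption: each timestep of each layer performs one
    (layer_size x layer_size) by (layer_size x mini_batch) multiply.
    """
    # A dense matrix multiply costs one multiply and one add per term.
    flops_per_step = 2 * layer_size * layer_size * mini_batch
    total_flops = flops_per_step * timesteps * layers
    return total_flops / seconds / 1e12


if __name__ == "__main__":
    # Configuration from the post; the elapsed time is a placeholder,
    # not a measured value.
    elapsed_seconds = 1.0
    print(recurrent_tflops(4, 1152, 1, 300, elapsed_seconds))
```

Substituting the measured time of one forward pass for `elapsed_seconds` makes it possible to check whether the figures on the two GPUs are computed the same way.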