Open burlen opened 1 year ago
Takeaway: The temporal reduction is much faster on the GPU. I/O is slower and shows a lot more variability when the GPU is used. The timings capture everything within the execute of each stage.
I reran the tests, this time with more steps per request. The same patterns appear.
Time each stage in the app. This may need work/cleanup before merge. This info is already captured by the profiler.
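For readers who want to reproduce this kind of measurement, here is a minimal sketch of per-stage wall-clock timing. It is not the code in this PR: `time_stage`, `stage_times`, `summarize`, and the stage name are hypothetical names chosen for illustration, and the stage body is a stand-in for a real execute call.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Hypothetical accumulator: stage name -> elapsed seconds, one entry per execute call.
stage_times = defaultdict(list)


@contextmanager
def time_stage(name):
    """Record the wall-clock time spent inside the with-block under `name`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Recorded even if the stage raises, so failed calls are still counted.
        stage_times[name].append(time.perf_counter() - start)


def summarize(times):
    """Return {stage: (calls, total seconds, min seconds, max seconds)}."""
    return {k: (len(v), sum(v), min(v), max(v)) for k, v in times.items()}


# Usage: wrap the body of a stage's execute. The work here is a placeholder.
with time_stage("example_stage"):
    sum(range(1000))

for stage, (calls, total, lo, hi) in summarize(stage_times).items():
    print(f"{stage}: calls={calls} total={total:.6f}s min={lo:.6f}s max={hi:.6f}s")
```

Reporting min and max alongside the total is one way to make variability between calls visible. Note that GPU work is typically launched asynchronously, so a host-side timer like this only reflects the GPU time if the stage synchronizes with the device before the block exits.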