With an image size of 800x1200, I get around 7.5 samples per second when training on 8 P6000 GPUs. For R-FCN, I used to get 16 samples per second with a Caffe implementation at the same image size. Is this speed in line with your observations, or is something wrong with my runtime environment?