tjuskyzhang / Scaled-YOLOv4-TensorRT

Got 100fps on TX2. Got 500fps on GeForce GTX 1660 Ti. If the project is useful to you, please Star it.

How can I make it faster? #8

Open tuneshverma opened 4 years ago

tuneshverma commented 4 years ago

Is there a way to make this loop faster?

    for (int i = 0; i < INPUT_H * INPUT_W; i++) {
        data[b * 3 * INPUT_H * INPUT_W + i] = pr_img.at<cv::Vec3b>(i)[2] / 255.0;
        data[b * 3 * INPUT_H * INPUT_W + i + INPUT_H * INPUT_W] = pr_img.at<cv::Vec3b>(i)[1] / 255.0;
        data[b * 3 * INPUT_H * INPUT_W + i + 2 * INPUT_H * INPUT_W] = pr_img.at<cv::Vec3b>(i)[0] / 255.0;
    }

tjuskyzhang commented 4 years ago

> Is there a way to make this loop faster?
>
>     for (int i = 0; i < INPUT_H * INPUT_W; i++) {
>         data[b * 3 * INPUT_H * INPUT_W + i] = pr_img.at<cv::Vec3b>(i)[2] / 255.0;
>         data[b * 3 * INPUT_H * INPUT_W + i + INPUT_H * INPUT_W] = pr_img.at<cv::Vec3b>(i)[1] / 255.0;
>         data[b * 3 * INPUT_H * INPUT_W + i + 2 * INPUT_H * INPUT_W] = pr_img.at<cv::Vec3b>(i)[0] / 255.0;
>     }

You can try preprocessing the input image on the GPU.

tuneshverma commented 4 years ago

Would using cv::normalize() be better here?