uber-research / DeepPruner

DeepPruner: Learning Efficient Stereo Matching via Differentiable PatchMatch (ICCV 2019)

Why does the inference time differ so greatly? #16

Closed oukohou closed 4 years ago

oukohou commented 4 years ago

My timing test code is as follows:

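import time

import torch
from torch.autograd import Variable

# `model` is assumed to be a DeepPruner network already loaded onto the GPU.
def test(imgL, imgR):
    model.eval()
    with torch.no_grad():
        imgL = Variable(torch.FloatTensor(imgL))
        imgR = Variable(torch.FloatTensor(imgR))
        imgL, imgR = imgL.cuda(), imgR.cuda()

        # Time 10 consecutive forward passes (no warm-up, no synchronization).
        for i in range(10):
            start_time = time.time()
            refined_disparity = model(imgL, imgR)
            end_time = time.time()
            print("time costs:{}".format(end_time - start_time))
    return refined_disparity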

The output is: [image]

My config is:

config = {
    "max_disp": 192,
    "cost_aggregator_scale": 4, # for DeepPruner-fast change this to 8.
    "mode": "evaluation", # for evaluation/ submission, change this to evaluation.
    "feature_extractor_ca_level_outplanes": 32,
    "feature_extractor_refinement_level_outplanes": 32, # for DeepPruner-fast change this to 64.
    "feature_extractor_refinement_level_1_outplanes": 32,
    "patch_match_args": {
        "sample_count": 12,
        "iteration_count": 2,
        "propagation_filter_size": 3
    },
    "post_CRP_sample_count": 7,
    "post_CRP_sampler_type": "uniform", #change to patch_match for Sceneflow model. 
    "hourglass_inplanes": 16
}

The results are accurate, but why do the times differ so greatly? And why does the first run appear 10 times faster than the later ones?

Environment: GTX 1080, Python 3.5, Ubuntu 16.04

ShivamDuggal4 commented 4 years ago

Hi @oukohou
The first measurement differs from the rest because of CUDA start-up time. You should warm up the GPU for at least a couple of seconds: skip the first few measurements, then start timing. You can read more about it here: https://pytorch.org/docs/stable/bottleneck.html

Also, CUDA code runs asynchronously, so the host timer can return before the GPU has finished its work. It's better to call torch.cuda.synchronize() before reading each timestamp.
Thanks, and let me know if you have further questions.
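
For reference, here is a minimal sketch of the two suggestions above: discard a few warm-up runs, then synchronize before and after each timed call. The helper name `benchmark` and its parameters (`sync`, `warmup`, `runs`) are illustrative assumptions and not part of DeepPruner or PyTorch. The helper itself uses only the standard library, and the PyTorch usage is shown in the trailing comment.

```python
import time


def benchmark(fn, sync=lambda: None, warmup=3, runs=10):
    """Return a list of per-call wall-clock times (seconds) for fn().

    `sync` is called around each timed call so that any asynchronous work
    (for example queued CUDA kernels) has finished before the clock is read.
    The first `warmup` calls are executed but not timed.
    """
    # Warm-up: absorb one-time start-up costs.
    for _ in range(warmup):
        fn()
    sync()

    timings = []
    for _ in range(runs):
        sync()                      # make sure nothing is still pending
        start = time.perf_counter()
        fn()
        sync()                      # wait for the work launched by fn()
        timings.append(time.perf_counter() - start)
    return timings


# Hypothetical usage with PyTorch on a GPU, assuming `model`, `imgL` and
# `imgR` are set up as in the question:
#
#   import torch
#   with torch.no_grad():
#       times = benchmark(lambda: model(imgL, imgR),
#                         sync=torch.cuda.synchronize)
#   print("mean time: {:.4f}s".format(sum(times) / len(times)))
```

Passing `torch.cuda.synchronize` as `sync` is what makes the host-side timer cover the full GPU execution. Without it, the timer may only measure how long it took to queue the kernels.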

oukohou commented 4 years ago

@ShivamDuggal4 Well, that's really great help, THANKS!