haofeixu / aanet

[CVPR'20] AANet: Adaptive Aggregation Network for Efficient Stereo Matching
Apache License 2.0

Question about time profiling #22

Closed xy-guo closed 4 years ago

xy-guo commented 4 years ago

Thank you for your great work! I just tried your code and added --count_time to see the speed of the model. However, I found there is no torch.cuda.synchronize() after the model runs. Since PyTorch executes CUDA kernels asynchronously, I wonder whether this affects the final timing results.

I just tried running aanet+ on a 2060s card: the measured time is 68ms without synchronization, but 123ms after adding CUDA synchronization.

I modified the code in inference.py as follows:

        with torch.no_grad():
            torch.cuda.synchronize()
            time_start = time.perf_counter()
            print(i, left.shape, right.shape)
            pred_disp = aanet(left, right)[-1]  # [B, H, W]
            torch.cuda.synchronize()
            inference_time += time.perf_counter() - time_start
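For readers timing their own models: the pattern above can be wrapped in a small helper. This is a minimal sketch, not code from the AANet repository; the function name `timed_inference` and its arguments are hypothetical, and it falls back to plain wall-clock timing when no GPU is present.

```python
import time

import torch


def timed_inference(model, *inputs, iters=100):
    """Average forward-pass time per iteration, synchronizing so that
    queued CUDA kernels actually finish before the clock is read."""
    model.eval()
    with torch.no_grad():
        model(*inputs)  # warm-up run so one-time initialization is not timed
        if torch.cuda.is_available():
            torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            model(*inputs)
        if torch.cuda.is_available():
            torch.cuda.synchronize()  # wait for all queued kernels to finish
        return (time.perf_counter() - start) / iters
```

With the names from this thread, the call would look like `timed_inference(aanet, left, right)`.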
haofeixu commented 4 years ago

Hi, thanks for your interest in our work. There would indeed be some difference when enabling synchronization, but since many previous works in this field reported their inference time without synchronization (e.g., PSMNet and GA-Net), we simply followed the same setting.

However, we do believe that the inference time is implementation- and hardware-dependent, and thus we have integrated some representative methods into the same framework and used the same inference setting and hardware for the efficiency comparison (e.g., Table 2 in our paper).

xy-guo commented 4 years ago

I modified the code as follows:

    torch.cuda.synchronize()
    time_start = time.perf_counter()
    for i in range(100):
        with torch.no_grad():
            # torch.cuda.synchronize()
            print(i, left.shape, right.shape)
            pred_disp = aanet(left, right)[-1]  # [B, H, W]
    torch.cuda.synchronize()
    inference_time += time.perf_counter() - time_start

The average inference time is still 123ms, so the difference is not caused by the synchronization calls themselves. I think there may be some errors in your reported timing results.

haofeixu commented 4 years ago

Please note that the reported time is measured without synchronization to stay consistent with previous methods.

xy-guo commented 4 years ago

But I didn't find any time-profiling code in the previously released repositories... As you know, the faster a model runs, the larger the timing error becomes without synchronization.
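For completeness, PyTorch also provides CUDA events for measuring GPU time directly, which avoids this launch-vs-execution ambiguity. The sketch below is an illustration, not code from any of the repositories discussed here; the helper name is hypothetical, and it falls back to a wall-clock timer on CPU-only machines.

```python
import time

import torch


def measure_ms(fn):
    """Measure one call to fn in milliseconds. On a GPU, CUDA events
    record timestamps on the device itself; elapsed_time is only valid
    after synchronization."""
    if torch.cuda.is_available():
        start = torch.cuda.Event(enable_timing=True)
        end = torch.cuda.Event(enable_timing=True)
        start.record()
        fn()
        end.record()
        torch.cuda.synchronize()  # wait until both events have completed
        return start.elapsed_time(end)
    # CPU fallback: plain wall-clock timing
    t0 = time.perf_counter()
    fn()
    return (time.perf_counter() - t0) * 1000.0
```

Without the final synchronize, `fn()` may only have been queued, so the timer would mostly capture kernel launch overhead, which is why fast models appear even faster than they are.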

haofeixu commented 4 years ago

FYI, the code for PSMNet, GA-Net, and HD3 is all publicly available.