zzr-idam / 4KDehazing

About the inference time of the network #8

Open wulalalalalalala opened 2 years ago

wulalalalalalala commented 2 years ago

Thanks for your contribution.

But the inference time I measured is quite different from the one reported in the paper: it takes more than 100 ms for a 4K image. The code I use is as follows.

```python
import numpy as np
import torch

# B_transformer is the model class from this repository
model = B_transformer().cuda()
a = torch.randn(1, 3, 1024, 1024).cuda()
starter = torch.cuda.Event(enable_timing=True)
ender = torch.cuda.Event(enable_timing=True)
repetitions = 100
timings = np.zeros((repetitions, 1))

# GPU warm-up
for _ in range(50):
    enhanced_image = model(a)

# Measure performance
with torch.no_grad():
    for rep in range(repetitions):
        torch.cuda.synchronize()
        starter.record()
        enhanced_image = model(a)
        ender.record()
        # Wait for GPU sync
        torch.cuda.synchronize()
        curr_time = starter.elapsed_time(ender)  # elapsed time in milliseconds
        timings[rep] = curr_time

mean_syn = np.sum(timings) / repetitions
std_syn = np.std(timings)
print(mean_syn)
```

Is this right?

zzr-idam commented 2 years ago

Our model is quick at inference; the discrepancy may have something to do with a cold start on your side.
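
If a cold start is the suspected cause, one way to rule it out is to keep the warm-up passes outside the timed loop and to report the median as well as the mean of the per-run times. A minimal sketch, assuming the `B_transformer` model and the 1024×1024 input from the code above; `time_inference` is a hypothetical helper, not part of the repository:

```python
import numpy as np
import torch

def time_inference(model, x, warmup=50, reps=100):
    """Time GPU forward passes with CUDA events; returns per-run times in ms."""
    starter = torch.cuda.Event(enable_timing=True)
    ender = torch.cuda.Event(enable_timing=True)
    times = []
    model.eval()
    with torch.no_grad():
        for _ in range(warmup):        # warm-up passes, not recorded
            model(x)
        torch.cuda.synchronize()
        for _ in range(reps):
            starter.record()
            model(x)
            ender.record()
            torch.cuda.synchronize()   # wait for the GPU to finish this pass
            times.append(starter.elapsed_time(ender))
    return np.array(times)

# Example usage (model and a as defined in the snippet above):
# times = time_inference(model, a)
# print(np.median(times), times.mean(), times.std())
```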

wulalalalalalala commented 2 years ago

Thank you for your reply, but my code already contains a GPU warm-up step. If you don't mind, could you provide the correct code for measuring the inference time?

zhangn77 commented 2 years ago

> Our model is quick at inference; the discrepancy may have something to do with a cold start on your side.

I have the same question. The running time measured on a Tesla V100 GPU for a 1024×1024 image comes to about 100 ms, which is quite different from the figure in your paper (9 ms for a 4K image). Can you share the code you used to measure the inference time?
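
For a direct comparison with the paper's 9 ms figure, the same measurement could also be run at full 4K resolution (3840×2160). A minimal sketch, reusing the hypothetical `time_inference` helper above and the repository's `B_transformer` model:

```python
import numpy as np
import torch

# Assumes B_transformer (this repository's model) and the hypothetical
# time_inference helper sketched earlier in this thread.
model = B_transformer().cuda().eval()
x_4k = torch.randn(1, 3, 2160, 3840).cuda()   # a full 4K input

times = time_inference(model, x_4k, warmup=20, reps=50)
print(f"median {np.median(times):.1f} ms, mean {times.mean():.1f} ms")
```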