ristea / aed-mae


How are parameters and FPS measured? #5

Closed flww213 closed 1 month ago

flww213 commented 1 month ago

Your work is astonishing, and I'm curious about how you measured the 3M parameters and reached over 1600 FPS.

With inference.py, I'm getting about 30ms per frame on a single 3060. Could you please advise on how to speed this up? Or which script to use for better performance?

Thank you for your help.

CroitoruAlin commented 1 month ago

Hello! Indeed, inference.py is not the best option for benchmarking the processing time, because it performs many other operations. We added a new script, util/time_benchmark.py, which is similar to what we used to measure the time. However, you still need to modify the forward function of the model so that it does NOT compute the loss. Also, note that the FPS we reported was measured on a 3090 Ti.
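For reference, a minimal sketch of the usual GPU timing pattern (warm-up iterations, then torch.cuda.synchronize() around the timed section so queued kernels are included). The model here is a tiny placeholder, not the actual student network from the repo:

```python
import time
import torch
from torch import nn

# Placeholder model; substitute the real student model (with the loss
# computation removed from its forward) when benchmarking.
model = nn.Conv2d(3, 3, kernel_size=3, padding=1).eval()

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
x = torch.randn(1, 3, 64, 64, device=device)

with torch.no_grad():
    for _ in range(10):               # warm-up: exclude one-time setup costs
        model(x)
    if device == "cuda":
        torch.cuda.synchronize()      # drain queued GPU work before timing
    n_iters = 100
    start = time.perf_counter()
    for _ in range(n_iters):
        model(x)
    if device == "cuda":
        torch.cuda.synchronize()      # wait for the last kernel to finish
    elapsed = time.perf_counter() - start

mean_ms = 1000 * elapsed / n_iters
fps = n_iters / elapsed
print(f"mean: {mean_ms:.3f} ms, FPS: {fps:.1f}")
```

Without the synchronize() calls, CUDA's asynchronous execution makes the timed loop appear much faster than the kernels actually run.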

flww213 commented 1 month ago

Thank you for your response. I ran the time_benchmark.py script and obtained a mean time of 60.432288 ms and an FPS of 529.52. The speed is quite remarkable.

However, I noticed that the script at line 31 requires an additional input parameter, grad_mask. Failing to provide it results in an "AttributeError: 'NoneType' object has no attribute 'shape'" error. To work around this, I defined grad_mask = torch.randn(batch_size, 3, img_size[0], img_size[1]).to(device). This prevents the error, but it seems this bug still needs a proper fix.

flww213 commented 1 month ago

I'm also curious about the 3M parameter figure. The student model weights you provided, as well as my own, are around 30 MB on disk. Could you explain the discrepancy?

raduionescu commented 1 month ago

Hi! If you are referring to the number of parameters in Table 5, 3M means 3 million parameters, not 3 MB.
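As a rough sanity check on the two units, a float32 parameter occupies 4 bytes, so a 3M-parameter model stored in float32 takes on the order of 12 MB before any extra metadata saved in the checkpoint:

```python
# Rough disk footprint of a float32 checkpoint: ~4 bytes per parameter.
# Anything else stored in the file (optimizer state, metadata) adds to this.
n_params = 3_000_000
approx_mb = n_params * 4 / (1024 ** 2)
print(f"{approx_mb:.1f} MB")  # 11.4 MB for 3M float32 parameters
```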

flww213 commented 1 month ago

Apologies for the earlier confusion. I have now obtained a parameter count of 2.77312M.
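For anyone reproducing this, the standard way to get such a count in PyTorch is to sum the element counts of the trainable parameters. The model below is a toy stand-in, not the repo's student network:

```python
from torch import nn

def count_params(model: nn.Module) -> int:
    """Total number of trainable parameters in a model."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# Toy stand-in model; run count_params on the actual student model instead.
model = nn.Linear(1000, 1000)
print(f"{count_params(model) / 1e6:.5f}M")  # 1.00100M for this toy model
```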

CroitoruAlin commented 1 month ago

The grad_mask is used only within the loss function. Therefore, when benchmarking the processing time, you should adjust the forward function to omit the loss computation, as it is unnecessary at test time. With that modification, grad_mask becomes redundant.
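The suggested change can be sketched as follows. This is a hypothetical illustration of the pattern, not the repo's actual forward signature: the loss branch runs only when grad_mask is supplied, so inference and benchmarking calls skip it entirely:

```python
import torch
from torch import nn

class StudentSketch(nn.Module):
    """Toy model illustrating the pattern: compute the loss only when a
    grad_mask is provided, so test-time calls never touch the loss code."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Conv2d(3, 3, kernel_size=3, padding=1)

    def forward(self, x, grad_mask=None):
        out = self.backbone(x)
        if grad_mask is None:
            return out  # inference / benchmarking path: no loss, no grad_mask
        # Training path: hypothetical masked reconstruction loss for the sketch.
        loss = ((out - x) ** 2 * grad_mask).mean()
        return out, loss

model = StudentSketch().eval()
x = torch.randn(1, 3, 32, 32)
with torch.no_grad():
    out = model(x)        # no grad_mask needed at test time
print(out.shape)          # torch.Size([1, 3, 32, 32])
```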

flww213 commented 1 month ago

I removed the loss calculation as you suggested, but the result is not very different (540 FPS). Thank you for your reply.