Closed ErenBalatkan closed 3 years ago
Hi, this is not an expected result. In our tests the throughput of CSWin-L is 38.4 and Swin-L is 42.1. Could you provide a small test script?
In that case, let me see if I can pinpoint and fix the issue myself; if not, I will share the code with you. Closing for now.
Update: It was a mistake on my part — I was measuring timings without torch.cuda.synchronize().
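For anyone hitting the same pitfall: CUDA kernels launch asynchronously, so stopping the clock without a synchronize only measures how fast kernels were *queued*, not how long they took to run. A minimal timing sketch (the model, input shape, and iteration counts here are placeholders, not the original benchmark code):

```python
import time
import torch

@torch.inference_mode()
def benchmark_ms(model, x, warmup=10, iters=50):
    """Return mean forward-pass latency in milliseconds."""
    model.eval()
    # Warm-up runs let cuDNN pick algorithms and fill allocator caches.
    for _ in range(warmup):
        model(x)
    if x.is_cuda:
        torch.cuda.synchronize()  # drain queued kernels before starting the clock
    start = time.perf_counter()
    for _ in range(iters):
        model(x)
    if x.is_cuda:
        # Without this, perf_counter() is read while kernels are still
        # running asynchronously on the GPU, giving wildly optimistic numbers.
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters * 1000.0


# Example usage with a stand-in model (replace with CSWin/Swin as needed):
model = torch.nn.Linear(384, 384)
x = torch.randn(1, 384)
if torch.cuda.is_available():
    model, x = model.cuda(), x.cuda()
print(f"{benchmark_ms(model, x):.3f} ms / forward pass")
```

`torch.cuda.Event(enable_timing=True)` pairs are an alternative that time on the GPU side directly and avoid the host-side synchronize in the hot loop.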
Greetings,
From my benchmarks I have noticed that CSWin seems to be significantly slower than Swin at inference time — is this the expected behavior? While I can get predictions as fast as 20 milliseconds on Swin Large 384, it takes above 900 milliseconds on CSWin_144_24322_large_384.
I performed tests using FP16, torchscript, optimize_for_inference, and torch.inference_mode.