Open StonepageVan opened 2 years ago
Hello,
Thanks for your interest in our work. Unfortunately, we have not analyzed the implementation by Pritam-N, so we cannot comment on it.
That said, an average of 5s does seem very slow. My guess is that some operations might have been executed on the CPU. Maybe you can look into that. Hope it helps.
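One way to look into the CPU guess is to list which device each parameter and buffer lives on. The snippet below is only a minimal sketch: `group_by_device` is a hypothetical helper name, not something from the ParNet code, and it assumes nothing more than that each tensor exposes a `.device` attribute, as PyTorch tensors do.

```python
from collections import defaultdict


def group_by_device(named_tensors):
    """Map each device (as a string) to the names of the tensors on it.

    `named_tensors` is any iterable of (name, tensor) pairs in which each
    tensor has a `.device` attribute.
    """
    groups = defaultdict(list)
    for name, tensor in named_tensors:
        groups[str(tensor.device)].append(name)
    return dict(groups)


# Usage with a PyTorch model (assumes `model` is a torch.nn.Module):
#   print(group_by_device(model.named_parameters()))
#   print(group_by_device(model.named_buffers()))
```

If any fusion-block entries show up under "cpu" while the rest are on the GPU, that would be consistent with the guess above. This only covers registered parameters and buffers, so tensors created inside forward() would still need to be checked separately.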
Hello, and thank you for your great work! I tested the code at https://github.com/Pritam-N/ParNet by Pritam-N, replacing the ResNet in my model with your ParNet, but the actual running time is much slower than the paper reports. My block sizes are [64, 128, 256, 512, 2048], and "forward()" takes more than 5s on average, while the ResNet takes 0.02s on my device. I timed every line in forward() and found that the encode part is the main cause. I then added time.perf_counter() calls inside the encode part and found that "self.stream2_fusion" and "self.stream3_fusion" take the most time. Do you know why?
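A note on the measurement itself: if the model runs on a GPU, CUDA kernels are launched asynchronously, so time.perf_counter() placed around single lines can charge the wait to whichever line happens to block. The helper below is a rough sketch of a fairer timing loop, not code from either repository. The name `timed` and its `sync`, `repeats`, and `warmup` parameters are assumptions made for illustration.

```python
import time


def timed(fn, *args, sync=None, repeats=10, warmup=2, **kwargs):
    """Return the mean wall-clock seconds per call of fn(*args, **kwargs).

    `sync` is an optional zero-argument callable that blocks until pending
    device work has finished (for example torch.cuda.synchronize).
    """
    # Warm-up calls absorb one-time costs such as lazy initialization.
    for _ in range(warmup):
        fn(*args, **kwargs)
    if sync is not None:
        sync()  # make sure warm-up work is done before the clock starts

    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args, **kwargs)
    if sync is not None:
        sync()  # wait for queued work before stopping the clock
    return (time.perf_counter() - start) / repeats


# Usage with PyTorch (assumes `model` and input `x` are on the same GPU):
#   import torch
#   with torch.no_grad():
#       print(timed(model, x, sync=torch.cuda.synchronize))
```

Timing the whole model and each fusion block this way should show whether the 5s is a steady cost or mostly first-call overhead.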