Describe the bug
I applied an optimization to my PyTorch code and got roughly 5x faster inference in PyTorch, but after exporting the model to ONNX format and running it in onnxruntime, I do not see the same speedup.
The optimization skips part of the computation in some cases: it is a cache mechanism that reuses some of the previous results.
Urgency
If the speedup cannot be reproduced in onnxruntime, my project will fail.
System information
onnxruntime 1.5.2, installed from pip
pytorch 1.7
To Reproduce
A forward function in my model:
def forward(self, img, mem=None):
    count_pred_head = 0
    pred_phones = []
    for idx, conv in enumerate(self.backbone):
        if idx in self.reuse_meta["reuse_layer"]:
            # print(img.shape)
            img = img[:, :, :, -self.reuse_meta["recompute_size"][idx]:]
            img = conv(img, True)
            img = torch.cat([
                mem[:, idx, :, :, 25:25 - self.reuse_meta["update_size"][idx]],
                img[:, :, :, -self.reuse_meta["update_size"][idx]:]
            ], dim=3)
            mem[:, idx, :, :, :] = img
        else:
            img = conv(img, mem is not None)
            if isinstance(conv, BasicBlock):
                img = img[:, :, :, 2:]
    return img, mem
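To make the caching idea clearer, here is a simplified pure-Python sketch of what the reuse mechanism does (names like `expensive_op` and the fixed `recompute_size` are illustrative placeholders, not my real model):

```python
# Simplified sketch of the reuse mechanism (pure Python, no torch).
# When a cache exists, only the tail of the input is recomputed and
# the result is concatenated with the cached prefix.

def expensive_op(xs):
    # Stand-in for a convolution: square every element.
    return [x * x for x in xs]

def forward(seq, mem=None, recompute_size=2):
    """Recompute only the last `recompute_size` elements when a cache exists."""
    if mem is None:
        # Cold start: full computation.
        return expensive_op(seq)
    # Warm path: recompute only the new tail, reuse the cached prefix.
    tail = expensive_op(seq[-recompute_size:])
    return mem[:len(seq) - recompute_size] + tail

full = forward([1, 2, 3, 4])              # cold start: computes all 4 elements
cached = forward([1, 2, 3, 4], mem=full)  # recomputes only the last 2
```

The warm path does half the work here; in my real model the recomputed window is much smaller than the full input, which is where the ~5x comes from.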
Expected behavior
Get the same acceleration in onnxruntime as in PyTorch.
Additional context why, is there something different in onnx reference mechanism?