When I benchmark the inference performance of MODNet against SGHM, SGHM is about 2x slower than MODNet, not 1.7x faster as claimed in the paper.
How can I reproduce the 1.7x speedup reported in the paper? I ran the benchmark with the CUDAExecutionProvider.
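
A timing comparison of this kind could be set up roughly as in the minimal sketch below. It is not the exact script behind the numbers above. It assumes both models are exported to ONNX and each takes a single float32 image tensor; the helper names (`time_inference`, `make_onnx_runner`, `compare_models`), the model paths, and the input shape are hypothetical placeholders.

```python
import time
from statistics import median


def time_inference(run_once, warmup=10, iters=50):
    """Return the median wall-clock latency of run_once() in milliseconds."""
    # Warm-up calls are excluded so one-time setup costs do not skew the result.
    for _ in range(warmup):
        run_once()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)
    return median(samples)


def make_onnx_runner(model_path, input_shape):
    """Build a zero-argument callable that runs one inference on random input."""
    # Imported lazily so time_inference can be used without these packages.
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession(
        model_path,
        providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
    )
    # Assumption: the model has exactly one input (a float32 image tensor).
    input_name = session.get_inputs()[0].name
    dummy = np.random.rand(*input_shape).astype(np.float32)
    return lambda: session.run(None, {input_name: dummy})


def compare_models(modnet_path, sghm_path, input_shape=(1, 3, 512, 512)):
    """Return (modnet_ms, sghm_ms, ratio), where ratio = sghm_ms / modnet_ms."""
    modnet_ms = time_inference(make_onnx_runner(modnet_path, input_shape))
    sghm_ms = time_inference(make_onnx_runner(sghm_path, input_shape))
    return modnet_ms, sghm_ms, sghm_ms / modnet_ms
```

Calling `compare_models("modnet.onnx", "sghm.onnx")` with the real model files would give a ratio above 1 if SGHM is slower. Before comparing, it is worth checking `session.get_providers()` to confirm that the CUDA provider was actually loaded and the session did not fall back to CPU.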