Unable to replicate DeepLabV3 Pytorch Tutorial numbers

quic / aimet-model-zoo


Unable to replicate DeepLabV3 Pytorch Tutorial numbers #9

Open LLNLanLeN opened 3 years ago

LLNLanLeN commented 3 years ago

I've been working through the DeepLabV3 PyTorch tutorial, which can be found here: https://github.com/quic/aimet-model-zoo/blob/develop/zoo_torch/Docs/DeepLabV3.md.

However, when running the evaluation script with the optimized checkpoint, I am unable to replicate the mIoU result listed in the table. I got 0.67, while the number reported by Qualcomm was 0.72. Has anyone had this issue before, and if so, how did you resolve it?
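For reference, the mIoU being compared here is the average of the per-class intersection-over-union scores. Below is a minimal sketch of that metric in plain Python, assuming predictions and labels are flat lists of class indices. `mean_iou` is a hypothetical helper written for illustration, not the function used in the eval script, which may differ (for example in how it handles ignored labels or accumulates across batches).

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in the prediction or the ground truth."""
    # conf[t][p] counts pixels with ground-truth class t predicted as class p.
    conf = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        conf[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp
        fp = sum(conf[r][c] for r in range(num_classes)) - tp
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both pred and target
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy input: six "pixels", three classes (illustrative values only).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(round(mean_iou(pred, target, 3), 4))  # per-class IoUs are 1/3, 2/3, 1/2
```

Because every class carries equal weight in the average, a drop in a few classes can move the overall number noticeably.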

quic-dkhullar commented 3 years ago

Hi, we used the tf_enhanced quantization scheme and a batch size of 8 to get to 72.04% accuracy. Could you please rerun with these parameters?

LLNLanLeN commented 3 years ago

@quic-dkhullar I've been running with those default quantization parameters (the defaults in the DeepLabV3 eval script), without any additional modifications on my side. The problem is that the accuracy drops from 72% to 67% once the script performs compute_encoding. If I remove everything from compute_encoding onward, the accuracy stays at 72%. But without compute_encoding it is no longer a quantized model, just a float model.

https://github.com/quic/aimet-model-zoo/blob/develop/zoo_torch/examples/eval_deeplabv3.py#L126

quic-dkhullar commented 3 years ago

It seems you are still running with the default parameters for the API. These parameters will produce different results. Please set the parameters to use quant-scheme tf_enhanced and batch size 8.

More information here: aimet-model-zoo/DeepLabV3.md at develop · quic/aimet-model-zoo (github.com)
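As a rough illustration of why the quantization scheme can change the result, here is a toy sketch in plain Python. It does not use AIMET and does not reproduce AIMET's algorithms. It only assumes that one scheme takes the range directly from the observed min and max, while another searches for a range with lower quantization error. All function names (`quantize`, `minmax_range`, `search_range`) are hypothetical, and the bit width is set to 4 only to make the effect visible on a tiny input.

```python
def quantize(values, lo, hi, bits):
    """Simulate uniform quantization: clip to [lo, hi], snap to 2**bits levels."""
    scale = (hi - lo) / (2 ** bits - 1)
    return [round((min(max(v, lo), hi) - lo) / scale) * scale + lo for v in values]


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def minmax_range(values):
    """Range taken straight from the observed min and max."""
    return min(values), max(values)


def search_range(values, bits, steps=50):
    """Toy error-minimizing search: keep the min, try shrinking the max."""
    lo, hi = minmax_range(values)
    best_hi = hi
    best_err = mse(values, quantize(values, lo, hi, bits))
    for i in range(1, steps):
        cand_hi = lo + (hi - lo) * i / steps
        err = mse(values, quantize(values, lo, cand_hi, bits))
        if err < best_err:
            best_hi, best_err = cand_hi, err
    return lo, best_hi


# 5000 evenly spaced values in [-1, 1] plus a single outlier at 10.
data = [-1 + 2 * i / 4999 for i in range(5000)] + [10.0]
bits = 4  # deliberately low so the effect is visible on a tiny example

lo_mm, hi_mm = minmax_range(data)
lo_s, hi_s = search_range(data, bits)
err_minmax = mse(data, quantize(data, lo_mm, hi_mm, bits))
err_search = mse(data, quantize(data, lo_s, hi_s, bits))
print(f"min-max range:  [{lo_mm}, {hi_mm}], MSE {err_minmax:.4f}")
print(f"searched range: [{lo_s}, {hi_s:.2f}], MSE {err_search:.4f}")
```

On this input the searched range clips the outlier and gives a lower error than the min-max range. This only shows that the range-setting scheme changes the quantized values; it says nothing about the specific numbers reported in this thread.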