huggingface / tflite-android-transformers

DistilBERT / GPT-2 for on-device inference thanks to TensorFlow Lite with Android demo apps
Apache License 2.0

float16 quantized DistilBERT model results in a performance drop #11

Open · sayakpaul opened this issue 3 years ago

sayakpaul commented 3 years ago

@Pierrci

I used the DistilBERT model with the SST-2 dataset for text classification. I then converted the trained model to TensorFlow Lite using float16 quantization. Here's my notebook. When I evaluated the float16 TensorFlow Lite model, I saw a tremendous performance drop relative to the original model (validation accuracy of only ~49%). Here's the notebook.
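
For reference, here is a minimal sketch of the two steps described above (float16 post-training quantization and evaluation with the TFLite interpreter). It assumes a fine-tuned Keras DistilBERT classifier `model` and a hypothetical `validation_examples` list of `(input_ids, attention_mask, label)` arrays; neither name comes from the linked notebooks:

```python
import numpy as np
import tensorflow as tf

# --- Conversion: float16 post-training quantization ---
# `model` is assumed to be the fine-tuned Keras DistilBERT classifier.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
tflite_model = converter.convert()

with open("distilbert_sst2_fp16.tflite", "wb") as f:
    f.write(tflite_model)

# --- Evaluation: run the TFLite interpreter over the validation set ---
interpreter = tf.lite.Interpreter(model_path="distilbert_sst2_fp16.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

correct = 0
# `validation_examples` is a hypothetical list of (input_ids, attention_mask,
# label) tuples; the arrays must match the fixed shapes baked into the
# converted model (typically batch size 1).
for input_ids, attention_mask, label in validation_examples:
    # Match each array to the right input tensor by name; the interpreter's
    # input order is not guaranteed to follow the Keras signature.
    for detail in input_details:
        if "input_ids" in detail["name"]:
            interpreter.set_tensor(detail["index"], input_ids.astype(np.int32))
        else:
            interpreter.set_tensor(detail["index"], attention_mask.astype(np.int32))
    interpreter.invoke()
    logits = interpreter.get_tensor(output_details[0]["index"])
    correct += int(np.argmax(logits, axis=-1)[0] == label)

print("fp16 TFLite accuracy:", correct / len(validation_examples))
```

Matching the inputs by name rather than by position matters here, since feeding `attention_mask` into the `input_ids` slot (or vice versa) would silently degrade predictions to near chance level.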

Am I missing something?