huggingface / tflite-android-transformers

DistilBERT / GPT-2 for on-device inference thanks to TensorFlow Lite with Android demo apps
Apache License 2.0
391 stars 81 forks

Support BERT finetuned on SQuAD #1

Open Pierrci opened 4 years ago

Pierrci commented 4 years ago

Since the tokenizer is the same as MobileBERT/DistilBERT, it would be pretty straightforward to add once this TensorFlow issue is solved: https://github.com/tensorflow/tensorflow/issues/34210
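For context on why the tokenizer carries over: BERT, DistilBERT and MobileBERT all use the same WordPiece scheme (greedy longest-match-first over a shared vocab file). Here is a minimal sketch of that algorithm with a toy vocabulary for illustration; the real models ship a ~30k-entry vocab, and this is not the repo's actual tokenizer code.

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first WordPiece, as in the original BERT code.

    `vocab` is a set of pieces; continuation pieces carry a "##" prefix.
    """
    tokens, start = [], 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, shrinking until a
        # vocabulary piece is found.
        while start < end:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            # No piece matches: the whole word maps to the unknown token.
            return [unk]
        tokens.append(match)
        start = end
    return tokens

# Toy vocabulary (hypothetical, for illustration only).
vocab = {"play", "##ing", "##ed", "quest", "##ion"}
print(wordpiece_tokenize("playing", vocab))   # ['play', '##ing']
print(wordpiece_tokenize("question", vocab))  # ['quest', '##ion']
```

Because the vocab file and this splitting logic are identical across the three models, a BERT-SQuAD model could reuse the app's existing tokenization path unchanged.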

ucalyptus commented 4 years ago

@Pierrci can this be closed now that 34210 is closed?

Pierrci commented 4 years ago

The non-quantized TFLite version is around 1GB, way too big for a mobile app. I'll close this once FP16 quantization works and we can use a model with reduced size and good performance, but that's not the case for now (at least as of when I last tried, on Friday).
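For reference, post-training float16 quantization in TFLite is configured on the converter as sketched below. This uses a tiny stand-in Keras model rather than BERT (which is what was failing at the time); the `tf.lite` calls are the documented API, but whether they succeed on a given BERT export depends on the op support tracked in the linked issue.

```python
import tensorflow as tf

# Tiny stand-in Keras model (BERT itself is far too large for a quick demo).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(2),
])

# Post-training float16 quantization: weights are stored as FP16,
# roughly halving the size of the FP32 TFLite export.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
tflite_fp16 = converter.convert()  # serialized flatbuffer bytes

with open("model_fp16.tflite", "wb") as f:
    f.write(tflite_fp16)
```

Applied to a ~1GB FP32 BERT export, this would be expected to bring the file down to roughly half that size, which is the reduction being waited on here.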

csarron commented 4 years ago

Hi @Pierrci, were you able to quantize BERT for TFLite? I tried a few options but couldn't get a quantized model.