huggingface / tflite-android-transformers

DistilBERT / GPT-2 for on-device inference thanks to TensorFlow Lite with Android demo apps
Apache License 2.0

Would like support for a superfast model for gpt2 #6

Open Peter-Devine opened 4 years ago

Peter-Devine commented 4 years ago

I have installed the app, and it works great on my Nexus 5X, except that it is pretty slow.

Ideally, I would like the model to generate words at least as fast as I can read them.

Therefore, would you be able to add support for a model which is based on distilgpt2, but with FP16 quantization and a sequence length of, say, 32?

I realise that you have supplied the code to create one's own models for this app, but try as I might, the models that I create using gpt2.py keep failing to work when I add them to the app.
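For reference, the standard TensorFlow Lite recipe for FP16 post-training quantization with a fixed input shape looks roughly like the sketch below. This is not the repo's actual gpt2.py script: a toy tf.keras model stands in for distilgpt2, and `seq_len` and the layer sizes are illustrative placeholders only.

```python
import tensorflow as tf

# Hypothetical stand-in for distilgpt2: any tf.keras model with a
# fixed input shape goes through the converter the same way.
seq_len = 32  # the fixed sequence length requested above
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(seq_len,), dtype=tf.float32),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Post-training float16 quantization: weights are stored as FP16,
# roughly halving model size (and speeding up FP16-capable backends).
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]

tflite_model = converter.convert()  # returns the flatbuffer as bytes
with open("model_fp16.tflite", "wb") as f:
    f.write(tflite_model)
```

A model converted this way should drop into the demo app the same way the bundled models do, provided the input shape matches what the app feeds the interpreter.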

Is there any chance of you adding an extra model, as described above, that is as fast as possible, to download.gradle?

Thanks