I have installed the app, and it works great on my Nexus 5X! Except that it is pretty slow.
Ideally, I would like the model to generate words at least as fast as I can read them.
Would you be able to add support for a model based on distilgpt2, with FP16 quantization and a sequence length of, say, 32?
I realise that you have supplied the code for creating one's own models for this app, but try as I might, the models I create using `gpt2.py` keep failing to work when I add them to the app.
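For reference, here is roughly the conversion I have been attempting. This is only a sketch of the standard TensorFlow post-training FP16 quantization path, not the exact flow in `gpt2.py`; the model name, sequence length, and output filename are just the values from my request above.

```python
import tensorflow as tf
from transformers import TFGPT2LMHeadModel

SEQ_LEN = 32  # fixed sequence length requested above

# Load the pretrained distilgpt2 weights in their TensorFlow form.
model = TFGPT2LMHeadModel.from_pretrained("distilgpt2")

# Trace the model with a static [batch=1, SEQ_LEN] int32 input so the
# converter sees fully defined shapes; keep only the logits output.
concrete_fn = tf.function(
    lambda input_ids: model(input_ids)[0],
    input_signature=[tf.TensorSpec([1, SEQ_LEN], tf.int32, name="input_ids")],
).get_concrete_function()

converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_fn])
# Request FP16 post-training quantization.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
tflite_model = converter.convert()

with open("distilgpt2-fp16-seq32.tflite", "wb") as f:
    f.write(tflite_model)
```

The resulting `.tflite` file is what I then try to drop into the app, and that is where it fails for me.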
Is there any chance of you adding an extra model, as described above, that is as fast as possible, to `download.gradle`? Thanks!