This library enables you to run Hugging Face transformer models directly in the browser. It accomplishes this by running the models with the ONNX Runtime JavaScript API and by implementing its own JavaScript-only tokenization library.
At the moment, it is compatible with Google's T5 models, but it was designed to be extensible. I hope to support GPT-2, RoBERTa, and InCoder in the future.
https://transformers-js.praeclarum.org
This demo is a static website hosted on Azure Static Web Apps. No code is executed on the server. Instead, the neural network is downloaded and executed in the browser.
See the `demo` rule in the Makefile to see how the demo is built.
This example shows how to use the library to load the T5 neural network to translate from English to French.
// Load the tokenizer and model.
const tokenizer = await AutoTokenizer.fromPretrained("t5-small", "/models");
const model = await AutoModelForSeq2SeqLM.fromPretrained("t5-small", "/models");
// Translate "Hello, world!"
const english = "Hello, world!";
const inputTokenIds = tokenizer.encode("translate English to French: " + english);
const outputTokenIds = await model.generate(inputTokenIds, { maxLength: 50, topK: 10 });
const french = tokenizer.decode(outputTokenIds, true);
console.log(french); // "Bonjour monde!"
To run this demo, you need to have converted the model to ONNX format using the Model Conversion Tool.
python3 tools/convert_model.py t5-small models
The library contains several components:
Currently only the T5 network is supported.
The neural network outputs the logarithm of the probability of each token in the vocabulary. To choose the next token, a probabilistic sample has to be drawn from that distribution. The following sampling algorithms are implemented:
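As an illustration of how such a sampler works, here is a minimal sketch of top-k sampling in plain JavaScript: keep only the k highest-scoring tokens, renormalize their probabilities with a softmax, and draw one at random. The function names are hypothetical and this is not the library's actual implementation.

```javascript
// Softmax over raw log-probabilities (shifted by the max for numerical stability).
function softmax(logits) {
  const max = Math.max(...logits);
  const exps = logits.map((l) => Math.exp(l - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Sample one token id from the k most probable tokens.
function sampleTopK(logits, k) {
  // Pair each logit with its token id and keep the k best.
  const top = logits
    .map((logit, id) => ({ logit, id }))
    .sort((a, b) => b.logit - a.logit)
    .slice(0, k);
  // Renormalize over the survivors and draw from the resulting distribution.
  const probs = softmax(top.map((t) => t.logit));
  let r = Math.random();
  for (let i = 0; i < top.length; i++) {
    r -= probs[i];
    if (r <= 0) return top[i].id;
  }
  return top[top.length - 1].id;
}
```

With `k = 1` this degenerates to greedy decoding (always the argmax token); larger `k` trades determinism for output variety.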
The ONNX Runtime for the Web is used to run models in the browser.
You can run the conversion from the command line:
python3 tools/convert_model.py <modelid> <outputdir> <quantize> <testinput>
For example:
python3 tools/convert_model.py praeclarum/cuneiform ./models true "Translate Akkadian to English: lugal"
Or you can run it from Python:
from convert_model import t5_to_onnx
onnx_model = t5_to_onnx("t5-small", output_dir="./models", quantized=True)
Developer Note: The model conversion script is a thin wrapper over the amazing fastT5 library by @Ki6an. The wrapper exists because I hope to support more model types in the future.