sign / translate

Effortless Real-Time Sign Language Translation
https://sign.mt

[Feature] Replace text normalization with local model #163

Open AmitMY opened 4 weeks ago

AmitMY commented 4 weeks ago

Problem

At the beginning of the spoken-to-signed translation pipeline, we perform multiple tasks, an important one of which is text normalization.


Unlike the other tasks, which run completely offline, text normalization relies on an online-only API endpoint. This can degrade performance when the user is offline, and adds a small delay when online. For privacy reasons, we would also prefer to replace this endpoint with a local model. Finally, the endpoint costs us money to run, since it calls GPT-3 to normalize the text automatically.

Description

It seems like every large company is pushing for small local LLMs, with limited world knowledge but superb text processing abilities. For example, Google is pushing Gemini Nano in Chrome (experimental API): https://x.com/rauchg/status/1806385778064564622 https://developer.chrome.com/docs/ai/built-in

If this API ever reaches production, we should prompt it instead of prompting ChatGPT.
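
A minimal sketch of how the switch could look, assuming the local model and the current endpoint can both be wrapped behind the same prompt-style interface. The `TextModel` interface, the `INSTRUCTION` string, and the `buildPrompt` and `normalize` functions are all hypothetical names made up for illustration. They do not reflect the actual shape of Chrome's experimental API or of our existing endpoint. Falling back to the remote endpoint, and then to the raw text, is also an assumption rather than a decided design.

```typescript
// Hypothetical wrapper interface; NOT the real Chrome built-in AI API.
interface TextModel {
  prompt(input: string): Promise<string>;
}

// Hypothetical instruction; the prompt we actually send today is not shown in this issue.
const INSTRUCTION =
  "Normalize the following text (fix casing, punctuation, and spelling). " +
  "Return only the normalized text.";

function buildPrompt(text: string): string {
  return `${INSTRUCTION}\n\nText: ${text}`;
}

// Try the local model first, then the remote endpoint.
// If neither is available, pass the text through unchanged.
async function normalize(
  text: string,
  local: TextModel | null,
  remote: TextModel | null
): Promise<string> {
  for (const model of [local, remote]) {
    if (model === null) continue;
    try {
      const output = (await model.prompt(buildPrompt(text))).trim();
      if (output.length > 0) return output;
    } catch {
      // This backend failed (for example, the device is offline); try the next one.
    }
  }
  return text;
}
```

With this shape, the rest of the pipeline only calls `normalize` and does not need to know which backend answered, so the remote endpoint could be dropped later by passing `null` for it.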

Alternatives

Train our own normalization model on existing text normalization data, or on data collected using ChatGPT. Training our own model would take resources away from our main objective, and would require the user to host another model on their device (which is undesirable).