WebAssembly version of fastText, a library for efficient text classification and representation learning, for Node.js and browser environments.
It bundles a WebAssembly build of fastText (archived) with the compressed lid.176.ftz
language identification model (~900KB) and a TypeScript wrapper. The project aims to be cross-platform, zero-dependency, and usable out of the box.
In Node.js, import from the package root. This entry point uses the JS binding that gives the best performance:
import { getLIDModel } from 'fasttext.wasm.js'
const lidModel = await getLIDModel()
await lidModel.load()
const result = await lidModel.identify('Hello, world!')
console.log(result.alpha2) // 'en'
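When language identification is called from several places, it can help to load the model once and reuse it. The following is a minimal sketch of that pattern, not part of fasttext.wasm.js: `createLazyIdentifier` and `LoadableModel` are hypothetical names, and the sketch only assumes a model object with the `load()` and `identify()` methods shown above.

```typescript
// Hypothetical shape: any object exposing load() and identify(), as in the example above.
interface LoadableModel<R> {
  load(): Promise<unknown>
  identify(text: string): Promise<R>
}

// Hypothetical helper: returns an identify function that loads the model on first use only.
function createLazyIdentifier<R>(
  getModel: () => Promise<LoadableModel<R>>,
): (text: string) => Promise<R> {
  let ready: Promise<LoadableModel<R>> | undefined

  return async (text: string): Promise<R> => {
    if (ready === undefined) {
      // Cache the promise itself so concurrent first calls share a single load().
      // If loading fails, the rejected promise stays cached in this sketch.
      ready = getModel().then(async (model) => {
        await model.load()
        return model
      })
    }
    return (await ready).identify(text)
  }
}
```

With the import above, this could be wired up as `const identify = createLazyIdentifier(() => getLIDModel())`, after which `await identify('Hello, world!')` triggers loading only on the first call.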
In other environments, use the common entry point:
import { getLIDModel } from 'fasttext.wasm.js/common'
const lidModel = await getLIDModel()
// Default paths:
// {
// wasmPath: '<globalThis.location.origin>/fastText/fastText.common.wasm',
// modelPath: '<globalThis.location.origin>/fastText/models/lid.176.ftz',
// }
await lidModel.load()
const result = await lidModel.identify('Hello, world!')
console.log(result.alpha2) // 'en'
Remember to download fastText.common.wasm and lid.176.ftz and place them at /fastText/fastText.common.wasm
and /fastText/models/lid.176.ftz
in your public root directory. You can override the default paths if necessary.
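To make the default paths concrete, here is a small sketch that builds them from an origin and lets either one be overridden. `resolveAssetPaths` and `AssetPaths` are hypothetical names used for illustration only; this README does not show how overrides are passed to the library, so check the package's type definitions for the actual option shape.

```typescript
// Hypothetical type mirroring the two default paths listed above.
interface AssetPaths {
  wasmPath: string
  modelPath: string
}

// Hypothetical helper, not part of fasttext.wasm.js:
// builds the default asset URLs for an origin and applies any overrides.
function resolveAssetPaths(
  origin: string,
  overrides: Partial<AssetPaths> = {},
): AssetPaths {
  // Drop trailing slashes so the joined URL has exactly one separator.
  const base = origin.replace(/\/+$/, '')
  return {
    wasmPath: overrides.wasmPath ?? `${base}/fastText/fastText.common.wasm`,
    modelPath: overrides.modelPath ?? `${base}/fastText/models/lid.176.ftz`,
  }
}
```

For example, `resolveAssetPaths('https://example.com', { modelPath: '/assets/lid.176.ftz' })` keeps the default wasm location and only changes where the model is fetched from.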
Accuracy on the test split of the papluca/language-identification dataset, measured in the Node.js runtime:
Name | Error Rate | Accuracy | Total |
---|---|---|---|
fastText | 0.02 | 0.98 | 10000 |
cld | 0.04 | 0.96 | 10000 |
eld | 0.06 | 0.94 | 10000 |
languageDetect | 0.24 | 0.76 | 10000 |
franc | 0.27 | 0.73 | 10000 |
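The columns in the table are consistent with error rate = 1 - accuracy over the total number of samples. As a rough sketch of how such figures can be computed (the `Sample` type and `score` function are hypothetical and are not taken from the bench code):

```typescript
// Hypothetical record: the labelled language and the language a detector returned.
interface Sample {
  expected: string
  predicted: string
}

// Hypothetical helper: computes the Total, Accuracy and Error Rate columns.
function score(samples: Sample[]): { total: number; accuracy: number; errorRate: number } {
  const total = samples.length
  const correct = samples.filter((s) => s.expected === s.predicted).length
  // Guard against an empty sample set to avoid dividing by zero.
  const accuracy = total === 0 ? 0 : correct / total
  const errorRate = total === 0 ? 0 : 1 - accuracy
  return { total, accuracy, errorRate }
}
```

Each detector would be scored on the same set of samples, so that the rows are directly comparable.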
Run the Bench Test task for the accuracy test or the Bench task for the benchmark test, or run the commands manually:
pnpm i
pnpm run build
cd bench
Then use pnpm run test for the accuracy test and pnpm run bench for the benchmark test.
Note: add source ./emsdk_env.sh to your shell profile to load the emsdk environment automatically. Setting export EMSDK_QUIET=1 suppresses the messages it prints.
npm run build
npx changeset
npx changeset version
git commit
npx changeset publish
git push --follow-tags