microsoft / aici

AICI: Prompts as (Wasm) Programs
MIT License

add TokenSet.numSet/toString and tokenRepr in jsctrl #76

Status: Closed (kevinmingtarja closed 4 months ago)

kevinmingtarja commented 4 months ago

This PR adds TokenSet.numSet, TokenSet.toString, and tokenRepr to jsctrl.

Example:

// samples/hello.js

async function main() {
    await $`Ultimate answer is to the life, universe and everything is `
    await gen({ regex: /abc^/ })
}

start(main)

Output:

% ../../aici.sh run --build . samples/hello.js
...
[0]: FIXED "Ultimate answer is to the life, universe and everything is "
[0]: GEN-OPT {regex: /abc^/}
[0]: regex constraint: "abc^"
[0]: dfa: 244 bytes
[0]: ALLOW: TokenSet: 3/50295; "a", "ab", "abc"
[0]: GEN-STEP: "a"
[0]: ALLOW: TokenSet: 2/50295; "b", "bc"
[0]: GEN-STEP: "b"
[0]: ALLOW: TokenSet: 1/50295; "c"
[0]: GEN-STEP: "c"
[0]: ALLOW: TokenSet: 0/50295;
[0]: Constraint doesn't allow any tokens; adding EOS
[0]: GEN-STEP: EOS
[0]: GEN "abc"
[0]: JsCtrl: done
[DONE]
[Response] abc
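
To make the ALLOW lines in the trace above easier to read, here is a rough, self-contained sketch of the kind of information numSet, toString, and tokenRepr report. It is not the jsctrl implementation. The class name SketchTokenSet, the seven-token toy vocabulary, the helpers allowedTokens and trace, and the maxShown cutoff are all hypothetical, and the constraint is simplified to "stay a prefix of a literal string" instead of a real regex DFA.

```javascript
// Hypothetical sketch: none of these names are the real jsctrl API.
const vocab = ["a", "ab", "abc", "b", "bc", "c", "d"]; // toy vocabulary

// Quote a token the way the log does, e.g. "ab".
const tokenRepr = (id) => JSON.stringify(vocab[id]);

class SketchTokenSet {
  constructor(size) {
    this.size = size;
    this.bits = new Uint32Array(Math.ceil(size / 32)); // one bit per token id
  }
  add(id) {
    this.bits[id >>> 5] |= 1 << (id & 31);
  }
  has(id) {
    return (this.bits[id >>> 5] & (1 << (id & 31))) !== 0;
  }
  // Number of token ids currently in the set.
  numSet() {
    let count = 0;
    for (let id = 0; id < this.size; id++) if (this.has(id)) count++;
    return count;
  }
  // "TokenSet: <set>/<total>; <reprs>", listing at most maxShown tokens.
  toString(maxShown = 10) {
    const shown = [];
    for (let id = 0; id < this.size && shown.length < maxShown; id++) {
      if (this.has(id)) shown.push(tokenRepr(id));
    }
    const n = this.numSet();
    const list = shown.length > 0 ? " " + shown.join(", ") : "";
    const more = n > shown.length ? ", ..." : "";
    return `TokenSet: ${n}/${this.size};${list}${more}`;
  }
}

// Tokens that keep `generated + token` a prefix of the literal `target`.
// This stands in for the regex constraint; it is not a DFA.
function allowedTokens(generated, target) {
  const set = new SketchTokenSet(vocab.length);
  vocab.forEach((tok, id) => {
    if (target.startsWith(generated + tok)) set.add(id);
  });
  return set;
}

// Always pick the lowest allowed token id; stop once nothing is allowed.
function trace(target) {
  const lines = [];
  let generated = "";
  for (;;) {
    const allowed = allowedTokens(generated, target);
    lines.push(allowed.toString());
    if (allowed.numSet() === 0) break; // the real run adds EOS at this point
    const id = vocab.findIndex((_, i) => allowed.has(i));
    generated += vocab[id];
  }
  return { generated, lines };
}

const result = trace("abc");
console.log(result.lines.join("\n"));
```

With the toy vocabulary the allowed set shrinks from three tokens to two, then one, then none, mirroring the shape of the ALLOW lines in the log. In the real run the model chooses among the allowed tokens; the sketch simply takes the first one.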

Closes #64

kevinmingtarja commented 4 months ago

@microsoft-github-policy-service agree

kevinmingtarja commented 4 months ago

Thanks for the review @mmoskal!

This project looks really interesting. I'll be following its development closely and hope to contribute more in the future :)