alasdairforsythe / tokenmonster

Ungreedy subword tokenizer and vocabulary trainer for Python, Go & Javascript
MIT License
528 stars 20 forks

Wrapping lib in a go cli client #14

Closed 101313 closed 11 months ago

101313 commented 11 months ago

Greetings. In many situations curl is all that's needed to use inference on low-powered machines, where a Python or JS interpreter can't always be assumed.

In case it is of interest to you, I would like to kindly request that you wrap the Go library in a very basic CLI, so that the library can be used as a small, self-contained Go executable. I really mean the simplest, bare-minimum client; no need to think about interactive fuss like completions, etc.

I'd normally go ahead and just do it, but I've never worked with Go, and dedicating a couple of days to getting myself on board is more or less not an option right now. I sincerely hope this is something you've thought about doing at some point, and that you're interested and comfortable enough to make it happen with the least possible effort, without giving up any significant amount of your otherwise precious time =)

I'm looking forward to your answer. Feel free to turn it down without any hesitation if you think it's appropriate to do so; it's totally understandable in any case! Thanks

alasdairforsythe commented 11 months ago

TokenMonster doesn't do inference; it converts text into an array of integers (token IDs). A CLI would output what, exactly? Token IDs as a serialized byte string, surely. How is that useful?
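
To make the point above concrete, here is a minimal sketch of what "text in, serialized token IDs out" looks like. It does not use TokenMonster or its API: the vocabulary `TOY_VOCAB`, the functions `tokenize`, `serialize` and `deserialize`, and the little-endian 16-bit output format are all assumptions made up for illustration, and the greedy longest-match loop is a stand-in, not TokenMonster's ungreedy algorithm.

```python
import struct

# Hypothetical toy vocabulary (token string -> token ID).
# This is NOT a real TokenMonster vocabulary.
TOY_VOCAB = {
    "hello": 0, " world": 1,
    "h": 2, "e": 3, "l": 4, "o": 5, " ": 6, "w": 7, "r": 8, "d": 9,
}


def tokenize(text, vocab=TOY_VOCAB):
    """Greedy longest-match tokenization; returns a list of integer token IDs."""
    max_len = max(len(token) for token in vocab)
    ids = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink until one matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in vocab:
                ids.append(vocab[piece])
                i += size
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return ids


def serialize(ids):
    """Pack token IDs as little-endian unsigned 16-bit integers (assumed format)."""
    return struct.pack(f"<{len(ids)}H", *ids)


def deserialize(data):
    """Inverse of serialize: bytes back to a list of token IDs."""
    return list(struct.unpack(f"<{len(data) // 2}H", data))


if __name__ == "__main__":
    ids = tokenize("hello world")
    print(ids)             # the token IDs
    print(serialize(ids))  # the byte string a CLI might write to stdout
```

The output is only a list of numbers (or its byte encoding). On its own it means nothing without the vocabulary that produced it and a model trained on that vocabulary.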

101313 commented 11 months ago

Yes, I see what you mean. I mistakenly thought that the token encoding would be translatable to cl100k, which obviously isn't the case.
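
A small sketch of why IDs from one vocabulary cannot simply be reused with another: the same text maps to different ID sequences, so the only general route between two vocabularies is to decode back to text and encode again. Both vocabularies and the helpers `encode`, `decode` and `translate` below are hypothetical toys, unrelated to the real TokenMonster or cl100k vocabularies.

```python
# Two hypothetical toy vocabularies; a token's ID is its index in the list.
VOCAB_A = ["hello", " world", "!"]
VOCAB_B = ["!", "hel", "lo", " wor", "ld"]


def encode(text, vocab):
    """Greedy longest-match encoding of text into token IDs for this vocab."""
    ids = []
    i = 0
    while i < len(text):
        # Pick the longest token that matches the text at position i.
        match = max(
            (token for token in vocab if text.startswith(token, i)),
            key=len,
            default=None,
        )
        if match is None:
            raise ValueError(f"no token covers {text[i]!r}")
        ids.append(vocab.index(match))
        i += len(match)
    return ids


def decode(ids, vocab):
    """Turn token IDs back into text using the vocabulary that produced them."""
    return "".join(vocab[token_id] for token_id in ids)


def translate(ids, src_vocab, dst_vocab):
    """There is no ID-to-ID mapping: go through the text and re-encode."""
    return encode(decode(ids, src_vocab), dst_vocab)


if __name__ == "__main__":
    text = "hello world!"
    ids_a = encode(text, VOCAB_A)
    ids_b = encode(text, VOCAB_B)
    print(ids_a, ids_b)  # different lengths and different IDs for the same text
    print(translate(ids_a, VOCAB_A, VOCAB_B) == ids_b)
```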

Feel free to close the issue. Also, I read a couple of titles from your blog and you're a big chad. Wish you the best =)