nekomeowww / insights-bot

A bot that works with OpenAI GPT models to provide insights for your info flows.
MIT License
248 stars 24 forks

fix: correctly truncate or split text by token limit #17

Closed xwjdsh closed 1 year ago

xwjdsh commented 1 year ago

I noticed that the tokenizer may not be used correctly. This PR fixes that and also upgrades pandodao/tokenizer-go to v0.2.0 (thank you again).
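
For readers following the thread, here is a minimal sketch of what truncating or splitting text by a token limit can look like. It is not the code from this PR. The `Tokenizer` interface, the toy `wordTokenizer` (one token per whitespace-separated word), and the function names `TruncateByTokens` and `SplitByTokens` are all assumptions made for illustration; a real implementation would wrap an actual BPE tokenizer behind the same kind of interface.

```go
package main

import (
	"fmt"
	"strings"
)

// Tokenizer is a hypothetical minimal interface; a real BPE tokenizer
// would be adapted to it. It is not an API of any library named in this thread.
type Tokenizer interface {
	Encode(text string) []int
	Decode(tokens []int) string
}

// wordTokenizer is a toy stand-in: one token per whitespace-separated word.
type wordTokenizer struct {
	ids   map[string]int
	words []string
}

func newWordTokenizer() *wordTokenizer {
	return &wordTokenizer{ids: map[string]int{}}
}

// Encode assigns each distinct word a stable integer ID.
func (t *wordTokenizer) Encode(text string) []int {
	fields := strings.Fields(text)
	tokens := make([]int, 0, len(fields))
	for _, w := range fields {
		id, ok := t.ids[w]
		if !ok {
			id = len(t.words)
			t.ids[w] = id
			t.words = append(t.words, w)
		}
		tokens = append(tokens, id)
	}
	return tokens
}

// Decode maps IDs back to words and joins them with single spaces.
func (t *wordTokenizer) Decode(tokens []int) string {
	parts := make([]string, 0, len(tokens))
	for _, id := range tokens {
		parts = append(parts, t.words[id])
	}
	return strings.Join(parts, " ")
}

// TruncateByTokens keeps at most limit tokens, counting tokens rather
// than bytes or runes.
func TruncateByTokens(tk Tokenizer, text string, limit int) string {
	if limit <= 0 {
		return ""
	}
	tokens := tk.Encode(text)
	if len(tokens) <= limit {
		return text
	}
	return tk.Decode(tokens[:limit])
}

// SplitByTokens cuts text into chunks of at most limit tokens each.
func SplitByTokens(tk Tokenizer, text string, limit int) []string {
	if limit <= 0 {
		return nil
	}
	tokens := tk.Encode(text)
	var chunks []string
	for start := 0; start < len(tokens); start += limit {
		end := start + limit
		if end > len(tokens) {
			end = len(tokens)
		}
		chunks = append(chunks, tk.Decode(tokens[start:end]))
	}
	return chunks
}

func main() {
	tk := newWordTokenizer()
	text := "one two three four five"
	fmt.Println(TruncateByTokens(tk, text, 3))     // one two three
	fmt.Printf("%q\n", SplitByTokens(tk, text, 2)) // ["one two" "three four" "five"]
}
```

The point of the sketch is that the limit is applied to the encoded token slice and the result is decoded back, instead of slicing the string by characters. With a real BPE tokenizer, a cut at a token boundary can land inside a multi-byte character, so the decoded chunk may need extra handling.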

xwjdsh commented 1 year ago

There is a native Go package, https://github.com/pkoukk/tiktoken-go, and it would be better to replace pandodao/tokenizer-go with it. If that's OK with you, I can do it in this PR as well.

nekomeowww commented 1 year ago

First of all, thanks for the contribution! And of course, you can replace it with pkoukk/tiktoken-go.

nekomeowww commented 1 year ago

Sorry about the incomplete, still in-progress unit tests. I hope they didn't confuse you or make the code hard to understand.

nekomeowww commented 1 year ago

I am going to take over this PR and credit you as co-author, since the PR has had no activity for two days.

xwjdsh commented 1 year ago

Sorry I'm late. The encoding results of tiktoken-go (which match openai/tiktoken) differ in some cases from those of tokenizer-go, so I updated the test cases.

nekomeowww commented 1 year ago

Merged, thank you for helping me!