Open nekomeowww opened 1 year ago
@nekomeowww I am interested in working on this issue.
However, I have some questions before getting started.
Would you mind explaining more about these terms:
- User infos
- Standalone user context
- Within prompt
Thanks.
Thank you!
Yes, you can pick up this issue and work on it, but I am afraid it might be too hard for you to understand and change the OpenAI prompt in pkg/openai/prompt.go, because it is written in Simplified Chinese. You will also need a valid OpenAI account to do the prompt engineering.
This issue is meant to improve the chat history summarization feature (aka recap); the main goal is to reduce the token usage of the OpenAI prompt.
Before diving into this feature, please allow me to summarize how recap works. There is a middleware called RecordMessage that extracts and formats the chat messages coming from Telegram and then stores them in Postgres. When a user sends the /recap command, or when it is time to send an automatic recap message to chat groups (implemented in internal/services/auto recap), the chathistories models format the messages into the following pattern (implementation at https://github.com/nekomeowww/insights-bot/blob/8665befe035c9d05b19e1ed590ddb3790747255c/internal/models/chathistories/chat_histories.go#L235):
msgId:1 UserName1 sent: ```Hello!```
msgId:2 UserName2 replying to [UserName1 sent msgId:1]: ```Hello! How are you today?```
The formatted messages are then injected into the OpenAI prompt template so that OpenAI's GPT-3.5 model can summarize the chat histories for us.
You may notice that UserName1 and UserName2 are explicitly stated each time they appear. This is inefficient and uses a lot of prompt tokens when multiple users appear many times while chatting in the same group.
Therefore I came up with an idea: why don't we aggregate the usernames that appear in the chat histories and place a formatted username map before the chat histories? Additionally, we can use a userId or an array index to represent each username, like this:
Users:"""
1: UserName1
2: UserName2
...
10: UserName10
"""
Chat histories:"""
msgId:1 user:1 sent: """Hello!"""
msgId:2 user:2 replying to [user:1 sent msgId:1]: """Hello! How are you today?"""
msgId:3 user:3 sent: """Nice to meet you guys!"""
msgId:4 user:10 sent: """The party is just about to start!!!"""
"""
The terms can be explained now.
Users: in the above example, it holds the aggregated usernames and the numbers that represent the users.
I think the best way for you to jump into this issue is to wait for me to implement the i18n support we talked about previously (#67); then we can write the OpenAI prompt in English for better understanding. How is that?
> Yes you can pick this issue and work on it, but I am afraid of it might be too hard to understand and make changes to OpenAI prompt in pkg/openai/prompt.go for you due to it was written in Simplified Chinese. You will also need a valid OpenAI account to do the prompt engineering
OK. No worries.
Thanks for the explanation.
Do you have any recommendations for issues that I can help with? Maybe something that needs to be done sooner and has fewer dependencies (e.g. credentials).
What do you mean by credentials?
TBH, there aren't any simple issues, current or upcoming, with fewer dependencies for you to work on. i18n support is lacking, and insights-bot is a project that relies on languages and GPT models; we initially developed it with Chinese support only. You may have to wait for us to support i18n.
> What do you mean by credentials?

Something like access to GPT products or the Telegram bot.
> You may have to wait for us to support i18n.

OK, no worries. Thanks.