c0sogi / LLMChat

A full-stack WebUI implementation of large language models, such as ChatGPT or LLaMA.
MIT License

Token conservation #28

Closed Torhamilton closed 1 year ago

Torhamilton commented 1 year ago

I propose we use a two-LLM approach to cut down on the cost of using GPT-4 and all expensive future variants.

This mostly applies if you are using GPT-4, but why use anything else :)

You have:

  1. Initial prompt
  2. Summary
  3. Last question

This may even get GPT-4 to be more focused and on point.
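The proposed layout can be sketched as a simple prompt-composition step; the function name below is illustrative, not from the repo:

```python
def build_prompt(initial_prompt: str, summary: str, last_question: str) -> str:
    """Compose the reduced context sent to the expensive model (e.g. GPT-4).

    Instead of the full message history, only the initial prompt, a running
    summary (produced by a cheaper LLM), and the latest question are sent.
    """
    return "\n\n".join(
        [
            initial_prompt,
            f"Conversation summary:\n{summary}",
            f"User:\n{last_question}",
        ]
    )
```

The cheaper model keeps the summary up to date between turns, so the expensive model's input stays roughly constant in size regardless of conversation length.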

c0sogi commented 1 year ago

I added a feature that performs summarization in the background for messages larger than 512 tokens. When summarization finishes, the result is added to the MessageHistory, and when the message is sent to the LLM, the summarized text is sent instead of the original. The token threshold can be changed in ChatConfig.

5b2d56f0ba18ac65cc3b453bb4830096bc7a6187
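A minimal sketch of the flow described above; the class layout and names are assumptions for illustration, not the actual LLMChat API:

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Analogous to the threshold in ChatConfig; messages above it get summarized.
SUMMARIZE_TOKEN_THRESHOLD = 512


@dataclass
class MessageHistory:
    content: str
    tokens: int
    summarized: Optional[str] = None  # filled in by the background task

    def text_for_llm(self) -> str:
        # Once the summary is ready, send it instead of the original text.
        return self.summarized if self.summarized is not None else self.content


def maybe_summarize(msg: MessageHistory, summarize: Callable[[str], str]) -> None:
    """Summarize only messages above the token threshold.

    In the real app this would run as a background task so the chat
    loop is not blocked while the summary is generated.
    """
    if msg.tokens > SUMMARIZE_TOKEN_THRESHOLD:
        msg.summarized = summarize(msg.content)
```

Short messages pass through untouched, so the overhead only applies where the token savings are worthwhile.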

Torhamilton commented 1 year ago

Perfect!