sparticleinc / chatgpt-google-summary-extension

Chrome extension to view ChatGPT summaries alongside Google search results and YouTube videos. Also supports Yahoo! ニュース, PubMed, PMC, NewsPicks, GitHub, Nikkei, Bing, Google Patents, and summaries of any page.
https://glarity.app
GNU General Public License v3.0
1.77k stars 225 forks

Handling long article summarization in several batches #97

Open ShiganChu opened 1 year ago

ShiganChu commented 1 year ago

Currently the web article summarizer can only handle the first 14000 tokens of an article, which is insufficient for long articles. Is it possible to split the article into multiple batches and summarize each batch?

floatas commented 1 year ago

Yes, it is possible, and I also want a fix for this. In theory there is no limit on text length.

Send each batch using the template below; once all batches have been sent and processed, send a final summarization message.

I will give you text in batches, after each message respond "I read it", remember all text, I will give you question about it later: {{BATCH}}
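
The batching template above could be driven by something like the following. This is only a rough sketch, not code from the extension: `splitIntoBatches`, `buildBatchMessages`, and the character-based batch size are hypothetical names and assumptions, and actually sending the messages to ChatGPT is left out.

```typescript
// Template wording taken from the comment above; {{BATCH}} is the placeholder.
const BATCH_TEMPLATE =
  'I will give you text in batches, after each message respond "I read it", ' +
  'remember all text, I will give you question about it later: {{BATCH}}'

// Hypothetical helper: split text into chunks of at most maxChars characters,
// preferring to break on a space so words are not cut in half.
function splitIntoBatches(text: string, maxChars: number): string[] {
  if (maxChars <= 0) throw new Error('maxChars must be positive')
  const batches: string[] = []
  let rest = text.trim()
  while (rest.length > maxChars) {
    let cut = rest.lastIndexOf(' ', maxChars)
    if (cut <= 0) cut = maxChars // no space found: fall back to a hard cut
    batches.push(rest.slice(0, cut).trim())
    rest = rest.slice(cut).trim()
  }
  if (rest.length > 0) batches.push(rest)
  return batches
}

// Hypothetical helper: one templated message per batch, then the final request.
function buildBatchMessages(text: string, maxChars: number, finalRequest: string): string[] {
  const messages = splitIntoBatches(text, maxChars).map((batch) =>
    // A replacer function avoids special handling of "$" inside the batch text.
    BATCH_TEMPLATE.replace('{{BATCH}}', () => batch),
  )
  messages.push(finalRequest)
  return messages
}
```

For example, `buildBatchMessages(articleText, 8000, 'Summarize all the text I gave you.')` would yield the messages to send in order. Whether the model really retains all earlier batches depends on its context window, so this is not guaranteed to work for arbitrarily long text.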

I tried implementing this myself, but I'm not that proficient in React.

ShiganChu commented 1 year ago

@givebest can you make the improvement? The change should be made in https://github.com/sparticleinc/chatgpt-google-summary-extension/blob/b2b9536ee061284293b15d1db9f0f76a00433c6f/src/content-script/prompt.ts#L17

givebest commented 1 year ago

We have made improvements in two areas:

  1. Set a different maximum token count depending on the model.
    export const modelMaxToken = {
      'gpt-3.5-turbo': 4096,
      'gpt-3.5-turbo-0301': 4096,
      'gpt-4': 8192,
      'gpt-4-0314': 8192,
      'gpt-4-32k': 32768,
      'gpt-4-32k-0314': 32768,
    }
  2. Use langchain.js to summarize the full text.

These will be implemented in the new version.
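
To illustrate how the two improvements could fit together, here is a minimal sketch of chunk-then-combine summarization sized by the per-model token limit. It does not use the langchain.js API and is not the extension's actual implementation: `chunkByTokenBudget`, `summarizeLongText`, the injected `summarize` callback, the 1024 reserved tokens, and the 4-characters-per-token estimate are all assumptions made for illustration.

```typescript
// Per-model limits as listed in the comment above.
const modelMaxToken: Record<string, number> = {
  'gpt-3.5-turbo': 4096,
  'gpt-3.5-turbo-0301': 4096,
  'gpt-4': 8192,
  'gpt-4-0314': 8192,
  'gpt-4-32k': 32768,
  'gpt-4-32k-0314': 32768,
}

// Hypothetical callback: sends text to the model and resolves with its summary.
type Summarize = (text: string) => Promise<string>

// Split text so each chunk fits the model's limit, leaving reservedTokens for
// the prompt and the reply. charsPerToken is a rough estimate, not a tokenizer.
function chunkByTokenBudget(
  text: string,
  model: string,
  reservedTokens = 1024,
  charsPerToken = 4,
): string[] {
  const limit = modelMaxToken[model]
  if (limit === undefined) throw new Error(`Unknown model: ${model}`)
  const maxChars = (limit - reservedTokens) * charsPerToken
  if (maxChars <= 0) throw new Error('reservedTokens leaves no room for text')
  const chunks: string[] = []
  for (let i = 0; i < text.length; i += maxChars) {
    chunks.push(text.slice(i, i + maxChars))
  }
  return chunks
}

// Summarize each chunk, then summarize the concatenated partial summaries.
async function summarizeLongText(text: string, model: string, summarize: Summarize): Promise<string> {
  const chunks = chunkByTokenBudget(text, model)
  if (chunks.length <= 1) return summarize(text)
  const partials: string[] = []
  for (const chunk of chunks) {
    partials.push(await summarize(chunk)) // sequential, to avoid bursts of requests
  }
  return summarize(partials.join('\n'))
}
```

With `'gpt-3.5-turbo'` and the defaults, each chunk is at most (4096 - 1024) * 4 = 12288 characters. For very long inputs the joined partial summaries could themselves exceed the limit, in which case the combine step would need to be applied recursively.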

ShiganChu commented 1 year ago

Thanks! Both solutions are API-based, and they might be costly depending on usage frequency. Is there a plan to handle long articles using the free web app?

givebest commented 1 year ago

Sending multiple requests to ChatGPT in a short period of time can easily lead to restricted access. We haven't found a better way to do this yet.