Use mozilla/readability to extract the text content of webpages

I would like the extension to include the website's content, not only the headlines. The package https://github.com/mozilla/readability extracts the readable text from a web page (it is what Firefox uses for its "reader mode"). This may simplify extracting text from websites, but you will obviously run into the token limit for a single message.

After reading other issues, I think that if you split the web access process into several steps/prompts, you might be able to avoid that limit by spreading the results across multiple messages.
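A rough sketch of what splitting the extracted text across messages could look like. The function name `splitIntoChunks` and the 4-characters-per-token estimate are assumptions for illustration only; a real tokenizer would give more accurate budgets.

```typescript
// Assumption: roughly 4 characters per token for English text.
// This is only a heuristic, not an exact token count.
const CHARS_PER_TOKEN = 4;

// Hypothetical helper: split text into chunks that each fit an
// approximate token budget, breaking on paragraph boundaries when possible.
function splitIntoChunks(text: string, maxTokens: number): string[] {
  if (maxTokens < 1) {
    throw new Error("maxTokens must be at least 1");
  }
  const maxChars = maxTokens * CHARS_PER_TOKEN;
  const chunks: string[] = [];
  let current = "";

  for (const paragraph of text.split(/\n\s*\n/)) {
    const trimmed = paragraph.trim();
    if (trimmed === "") continue;

    // A paragraph longer than the budget is hard-split into pieces.
    const pieces: string[] = [];
    for (let i = 0; i < trimmed.length; i += maxChars) {
      pieces.push(trimmed.slice(i, i + maxChars));
    }

    for (const piece of pieces) {
      const candidate = current === "" ? piece : current + "\n\n" + piece;
      if (candidate.length <= maxChars) {
        // Still within budget: keep growing the current chunk.
        current = candidate;
      } else {
        // Budget exceeded: close the current chunk and start a new one.
        chunks.push(current);
        current = piece;
      }
    }
  }

  if (current !== "") chunks.push(current);
  return chunks;
}
```

Each returned chunk could then be sent as its own message, so no single message has to carry a whole page.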

A couple of days ago, I tried something like this:

You are now in "Text Ingestion Mode".

When I send you a message, reply with '...'.
If I send you the string "EOM", exit "Text Ingestion Mode".

At first it worked and responded with "..." after the first text to ingest, but after the second text it jumped to conclusions and tried to give an opinion about the text I provided. I think this is just a problem with the initial prompt, and after some tweaks it should work most of the time.
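To make the experiment above repeatable, the message sequence could be generated and the replies checked automatically. This is only a sketch: `buildIngestionMessages` and `isAcknowledgement` are made-up names, and the prompt text is simply the one quoted above, not a tuned version.

```typescript
// The prompt from the experiment above, unchanged.
const INGESTION_PROMPT = [
  'You are now in "Text Ingestion Mode".',
  "",
  "When I send you a message, reply with '...'.",
  'If I send you the string "EOM", exit "Text Ingestion Mode".',
].join("\n");

// Hypothetical helper: the full list of user messages for one ingestion
// session, i.e. the prompt, then every text, then the "EOM" terminator.
function buildIngestionMessages(texts: string[]): string[] {
  return [INGESTION_PROMPT, ...texts, "EOM"];
}

// Hypothetical helper: true when the model stayed in ingestion mode,
// meaning it answered with exactly "..." (ignoring surrounding whitespace).
function isAcknowledgement(reply: string): boolean {
  return reply.trim() === "...";
}
```

If `isAcknowledgement` returns false for a reply, the extension would know the model left ingestion mode early (as happened after the second text) and could resend the prompt or stop.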

So what I imagine is:

User:
{prompt}

GPT:
Welcome to WebGPT [...] How can I help you?

User:
{query}

GPT:
SEARCH: {gpt_generated_query}

User:
Result 1/5
{content of first result's site}

GPT:
Next result.

User:
Result 2/5
{content of second result's site}

GPT:
Next result.

User:
Result 3/5
{content of third result's site}

GPT:
Next result.

User:
Result 4/5
{content of fourth result's site}

GPT:
Next result.

User:
Result 5/5
{content of fifth result's site}

GPT:
{gpt_answer_to_query}
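The exchange above could be driven by two small helpers on the extension side. This is a minimal sketch under assumptions: the names `parseSearchQuery` and `buildResultMessages` are hypothetical, and it assumes the model really replies with a single line starting with `SEARCH:`.

```typescript
// Hypothetical helper: pull the query out of a "SEARCH: ..." reply.
// Returns null when the reply does not follow that format.
function parseSearchQuery(reply: string): string | null {
  const match = /^SEARCH:\s*(.+)$/.exec(reply.trim());
  return match ? match[1].trim() : null;
}

// Hypothetical helper: turn the extracted page texts into the
// "Result i/n" user messages shown in the exchange above.
function buildResultMessages(contents: string[]): string[] {
  const total = contents.length;
  return contents.map(
    (content, index) => `Result ${index + 1}/${total}\n${content}`
  );
}
```

The extension would send these messages one at a time, waiting for "Next result." after each, and treat the reply to the last one as the answer to the query.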
