oobabooga / text-generation-webui-extensions


Update README.md #52

Closed RandomInternetPreson closed 11 months ago

RandomInternetPreson commented 11 months ago

Hello, this is an extension I built that lets one use the internet with their LLM. It's designed to work with generic LLMs that are not fine-tuned on specific commands. The user prefaces their input with commands, data are gathered, the LLM inspects the data, and it fulfills the user's request.

Integrated into the workflow is the nougat OCR model: https://github.com/facebookresearch/nougat
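For reference, a minimal way to drive Nougat from Python is to shell out to its CLI; this is just an illustrative sketch, and the output directory and file layout here are assumptions, not necessarily how the extension does it:

```python
import subprocess
from pathlib import Path

def ocr_pdf_with_nougat(pdf_path: str, out_dir: str = "ocr_output") -> str:
    """Run the Nougat OCR CLI on a PDF and return the generated .mmd text.

    Assumes the `nougat` package is installed and on PATH; the output
    directory name is illustrative, not the extension's actual layout.
    """
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    # `nougat <pdf> -o <dir>` writes a Mathpix-Markdown (.mmd) file per PDF.
    subprocess.run(["nougat", pdf_path, "-o", out_dir], check=True)
    mmd_file = Path(out_dir) / (Path(pdf_path).stem + ".mmd")
    return mmd_file.read_text(encoding="utf-8")
```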

The user can turn the OCR on and off.

The LLM does an initial Google search and collects the links; it can then pick from those links, or the user can guide it on which and how many links to choose. The sequence of interactions is represented like this:

"search" > "additional links" > "please expand"

Upon "please expand", each web page is handled in one of three ways (a rough code sketch follows the list):

  1. Printed to a PDF, with the contents of the PDF collected into a text file that is eventually sent back to the LLM; in this case the hyperlinks and the text from the PDF are collected and stored in two separate .txt files.

  2. If the hyperlink is a .pdf file (the URL ends in .pdf), it undergoes one of two processes:
     2a. It is sent to the same python code that scans the printed .pdf file, and parses text and hyperlinks the same way.
     2b. It is sent to the OCR model, where a .mmd file is generated, and those data are sent to the LLM (no hyperlinks are parsed).

  3. Printed and sent to the OCR model (hyperlinks are parsed). This is useful if you need to scan web pages with mathematical and scientific symbols.
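Putting the three cases together, the per-link handling could look roughly like this; every function name here is a placeholder standing in for the extension's actual internals:

```python
def process_link(url: str, use_ocr: bool) -> dict:
    """Illustrative branching for the three cases above (placeholder helpers)."""
    if url.lower().endswith(".pdf"):
        if use_ocr:
            # Case 2b: OCR the remote PDF into a .mmd file; no hyperlinks parsed.
            return {"text": run_nougat_ocr(download(url)), "links": []}
        # Case 2a: parse text and hyperlinks from the PDF directly.
        text, links = parse_pdf(download(url))
        return {"text": text, "links": links}
    # Cases 1 and 3: print the web page to a PDF first.
    pdf_path = print_page_to_pdf(url)
    text, links = parse_pdf(pdf_path)
    if use_ocr:
        # Case 3: OCR the printed PDF (better for math/scientific notation),
        # while keeping the hyperlinks parsed from the printout.
        text = run_nougat_ocr(pdf_path)
    return {"text": text, "links": links}
```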

Additionally, there is a "go to" command that will send the LLM to a specific website or set of websites.
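For illustration, a "go to" handler might simply run the same per-link processing over one or more user-supplied URLs, reusing the `process_link` sketch above; the comma-separated input format and function names are assumptions, not the extension's real code:

```python
def go_to(urls_field: str, use_ocr: bool) -> str:
    """Process one or more comma-separated URLs and concatenate their text."""
    urls = [u.strip() for u in urls_field.split(",") if u.strip()]
    results = [process_link(url, use_ocr) for url in urls]
    return "\n\n".join(r["text"] for r in results)
```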

https://github.com/RandomInternetPreson/LucidWebSearch

oobabooga commented 11 months ago

I hadn't seen your PR, sorry! The extension looks very nice, thank you for submitting it.

RandomInternetPreson commented 11 months ago

No worries :3 I really appreciate the work you put into the textgen webui, and I'll patiently wait. I submitted an issue on the Nougat GitHub asking about a --cpu-only mode. I think it will eventually get incorporated, but I was given a tip on how to force it in the meantime. I'm gonna incorporate that into the code for now, so people don't need to use any extra VRAM.
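Two generic PyTorch patterns for forcing a model like Nougat onto the CPU are sketched below; this is the general idea only, not necessarily the specific workaround mentioned in the Nougat issue:

```python
import os
import torch

# Option 1: hide all GPUs from PyTorch before any CUDA initialization,
# so downstream code that checks torch.cuda.is_available() falls back to CPU.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

# Option 2: move an already-loaded model explicitly onto the CPU device,
# so it allocates no VRAM during inference.
def force_cpu(model: torch.nn.Module) -> torch.nn.Module:
    return model.to(torch.device("cpu"))
```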

Thanks for the feedback!