simbake / web_search

web search extension for text-generation-webui
GNU General Public License v3.0
91 stars, 19 forks

Web_search showing links instead of google search results #1

Closed: iChristGit closed this issue 1 year ago

iChristGit commented 1 year ago

Output generated in 2.47 seconds (3.25 tokens/s, 8 tokens, context 28, seed 295305407)
Traceback (most recent call last):
  File "D:\oobabooga_windows\installer_files\env\lib\site-packages\gradio\routes.py", line 427, in run_predict
    output = await app.get_blocks().process_api(
  File "D:\oobabooga_windows\installer_files\env\lib\site-packages\gradio\blocks.py", line 1323, in process_api
    result = await self.call_function(
  File "D:\oobabooga_windows\installer_files\env\lib\site-packages\gradio\blocks.py", line 1067, in call_function
    prediction = await utils.async_iteration(iterator)
  File "D:\oobabooga_windows\installer_files\env\lib\site-packages\gradio\utils.py", line 336, in async_iteration
    return await iterator.__anext__()
  File "D:\oobabooga_windows\installer_files\env\lib\site-packages\gradio\utils.py", line 329, in __anext__
    return await anyio.to_thread.run_sync(
  File "D:\oobabooga_windows\installer_files\env\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "D:\oobabooga_windows\installer_files\env\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "D:\oobabooga_windows\installer_files\env\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "D:\oobabooga_windows\installer_files\env\lib\site-packages\gradio\utils.py", line 312, in run_sync_iterator_async
    return next(iterator)
  File "D:\oobabooga_windows\text-generation-webui\modules\chat.py", line 328, in generate_chat_reply_wrapper
    for i, history in enumerate(generate_chat_reply(text, state, regenerate, _continue, loading_message=True)):
  File "D:\oobabooga_windows\text-generation-webui\modules\chat.py", line 313, in generate_chat_reply
    for history in chatbot_wrapper(text, state, regenerate=regenerate, _continue=_continue, loading_message=loading_message):
  File "D:\oobabooga_windows\text-generation-webui\modules\chat.py", line 205, in chatbot_wrapper
    text = apply_extensions('input', text, state)
  File "D:\oobabooga_windows\text-generation-webui\modules\extensions.py", line 207, in apply_extensions
    return EXTENSION_MAP[typ](*args, **kwargs)
  File "D:\oobabooga_windows\text-generation-webui\modules\extensions.py", line 63, in _apply_string_extensions
    text = func(text)
  File "D:\oobabooga_windows\text-generation-webui\extensions\web_search\script.py", line 38, in input_modifier
    if internet:
NameError: name 'internet' is not defined

Any idea how to fix? :D

simbake commented 1 year ago

Sorry about this, I had pushed an experimental version of the script. Now it should be fixed, I have updated the script to a working version.
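For readers hitting the same traceback: a NameError like the one above is raised when a function reads a name that was never assigned in its scope or at module level. The following is only a minimal sketch of one way to avoid it, not the change that was actually pushed. The `params` dictionary, its `"internet"` key, and the `[search enabled]` marker are hypothetical stand-ins; the real script.py may look quite different.

```python
# Hypothetical settings dictionary, defined when the module is imported.
# The real extension may store this flag somewhere else entirely.
params = {"internet": False}


def input_modifier(text: str) -> str:
    """Return the user input, tagged when the (hypothetical) search flag is on."""
    # Reading the flag through params.get() means the lookup cannot raise
    # NameError: a missing key simply falls back to False.
    if params.get("internet", False):
        return f"[search enabled] {text}"
    return text


print(input_modifier("weather in Tel Aviv"))
params["internet"] = True
print(input_modifier("weather in Tel Aviv"))
```

The point of the sketch is only that the flag is defined before the hook can run; how the flag gets toggled from the UI is outside its scope.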

iChristGit commented 1 year ago

> Sorry about this, I had pushed an experimental version of the script. Now it should be fixed, I have updated the script to a working version.

Thanks for the reply. The initial issue is resolved, but now any question I ask just returns the site addresses (accuweather.com etc.). When I uncheck the option, the response becomes generic, so I know it works in some way, but it won't give current weather info or scores from recent games.

simbake commented 1 year ago

I haven't tested with all models; these are the models I used:

  • Llama 2 7B

Which model are you using? It may need some fine-tuning for different models.

iChristGit commented 1 year ago

The model I'm using is TheBloke/Llama-2-13B-chat-GPTQ

simbake commented 1 year ago

If you are able, a screenshot might also help.

iChristGit commented 1 year ago

These are the logs with -verbose, using Vicuna-33B:

https://www.israelweather.co.il/forecast/index_english.html
https://www.accuweather.com/en/il/tel-aviv/215854/weather-forecast/215854
https://weather.com/weather/tenday/l/Tel+Aviv+Tel+Aviv+District+Israel?canonicalCityId=f156cd0dd9268088b98f5de72a0e74a26c2645748ebdd593c3b21f939014c29e
https://www.weather25.com/asia/israel?page=today
https://ims.gov.il/en
Assistant:

Output generated in 16.26 seconds (12.30 tokens/s, 200 tokens, context 265, seed 574761213)

[Image: Screenshot 2023-08-03 213426]

simbake commented 1 year ago

Maybe you can try changing to chat-instruct mode.

For me it works. [Image: ai_response]

simbake commented 1 year ago

For now I will download that model and hunt down the problem. Might take a while, I have slow internet. I will get back to you.

iChristGit commented 1 year ago

> For now I will download that model and hunt down the problem. Might take a while, I have slow internet. I will get back to you.

I messed around with it more using various models, and it seems to mostly work, but sometimes it still hallucinates. For example, I can see in the -verbose console output that the context URL is correct (en.wikipedia, for example), but then it adds about 3 more URLs that don't really help. Is there a way to make it pick up only one source instead of the 4 it usually does?

simbake commented 1 year ago

Yes. Open script.py in web_search, search for num_results=3, and change it to 1 or 0, or to however many results you would like the model to receive. Hope this helps.
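As a rough illustration of what a setting like num_results controls, the sketch below caps a list of search hits before they are handed to the model. The `limit_results` helper and the example.com URLs are hypothetical; this is not the extension's actual code, and the search call that really consumes num_results in script.py is not shown in this thread.

```python
from typing import List


def limit_results(urls: List[str], num_results: int = 3) -> List[str]:
    """Keep at most num_results URLs, preserving the search ranking order."""
    if num_results < 0:
        raise ValueError("num_results must be non-negative")
    return urls[:num_results]


# Placeholder search hits, best match first.
found = [
    "https://example.com/first",
    "https://example.com/second",
    "https://example.com/third",
    "https://example.com/fourth",
]

print(limit_results(found, num_results=1))
print(limit_results(found, num_results=0))
```

With a cap of 1, only the top-ranked source reaches the model, which trades coverage for less off-topic context. A cap of 0 passes no sources at all.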

simbake commented 1 year ago

You can also try using it with the dynamic_context extension, as the model still thinks the date and time are in the past.

I downloaded the model but still can't reproduce the issue of it only showing links. Bear with me, I am still new to Python.

iChristGit commented 1 year ago

Thank you for all the help! I'll check it out later, great work. I hope that with 1 source it will give true information (for example, search shrek 2 plot should give a summary of the Wikipedia information), but sometimes it just makes up facts even though it has the right sources. Maybe it's the model or the settings, but I'll need to mess with it more!

simbake commented 1 year ago

Thanks. You can also try experimenting with commands such as search shrek 2 plot summary wikipedia. Hope this helps reduce the hallucination issues. Try being specific, as if you were talking to the bot and searching Google at the same time. [Image: ai_response2]

iChristGit commented 1 year ago

Did you try regenerating a couple of times? When I try something very specific (let's say: Death of Brian Wells on wikipedia), it will actually give the right source, but it sometimes hallucinates a lot of information, like saying the kidnappers never got caught, or saying they used a normal bomb, etc.

simbake commented 1 year ago

No I did not, I will continue working on the script to see if I can reduce the hallucinations.

iChristGit commented 1 year ago

> No I did not, I will continue working on the script to see if I can reduce the hallucinations.

Any leads yet? Maybe there is a model you noticed has fewer hallucinations that I should run? I'll gladly report anything I discover.

simbake commented 1 year ago

Hi, I will upload a new script tomorrow; today I had to rest. I will notify you when I do, and then you can test it.

iChristGit commented 1 year ago

Great news! Thank you for your work.

simbake commented 1 year ago

Hi, check out the new push. The main issue is that it produces large prompts, which result in a large context for the model. I will try to find a fix for the problem.

iChristGit commented 1 year ago

I've played around with the new version, and it seems solid. I almost always get correct information now! But whenever the context in the console is just the URL, does that mean the model only knows the URL and not the information on the site? Solid addition to my webui; it should be an official built-in feature of the webui, imo!

simbake commented 1 year ago

Yes, sites are built differently. When it only shows links, it means no data was scraped and provided to the model, only the links. This is one of the challenges I faced: having to scrape sites without knowing the site structure, pick out what I want, and keep the amount of data low so that it doesn't go beyond the context size. One idea I am looking into is summarizing the scraped data before presenting it to the model. It feels like I am building a 'self conscious' Google :)
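The behaviour described above (scraped text when the page yields some, the bare link when it does not, and a cap so the prompt stays within the context size) can be sketched roughly as follows. This is an illustrative sketch using only Python's standard library, not the extension's actual scraper. The names `TextExtractor`, `page_to_context`, and `max_chars` are assumptions, and a character budget is only a crude proxy for a token budget.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def page_to_context(html: str, url: str, max_chars: int = 500) -> str:
    """Build the context for one page: the URL plus at most max_chars of text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.parts)
    if not text:
        # Nothing could be scraped, so the model only sees the link.
        return url
    return f"{url}\n{text[:max_chars]}"


sample = "<html><body><p>Hello</p><script>var x = 1;</script><p>World</p></body></html>"
print(page_to_context(sample, "https://example.com", max_chars=8))
print(page_to_context("<script>var x = 1;</script>", "https://example.com"))
```

Hard truncation is the simplest way to respect a budget; the summarization idea mentioned above would replace the `text[:max_chars]` step with a summary of the scraped text.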

iChristGit commented 1 year ago

I've posted the repo to the main oobabooga discussion and on Reddit! Hopefully it can get onto the official list and be worked on to have more context and features. Your extension made LLMs that much better compared to services like Bing Chat, thanks!

simbake commented 1 year ago

You are welcome.