Leon-Sander / Local-Multimodal-AI-Chat

GNU General Public License v3.0

pdf chat problem #10

Closed hcy5561 closed 4 days ago

hcy5561 commented 7 months ago

I have a problem when working with PDF chat:

```
TypeError: load_retrieval_chain() missing 1 required positional argument: 'vector_db'

Traceback:
File "C:\Users\hcy53\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 535, in _run_script
    exec(code, module.__dict__)
File "C:\Users\hcy53\local_multimodal_ai_chat-main\app.py", line 151, in <module>
    main()
File "C:\Users\hcy53\local_multimodal_ai_chat-main\app.py", line 131, in main
    llm_chain = load_chain()
File "C:\Users\hcy53\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 212, in wrapper
    return cached_func(*args, **kwargs)
File "C:\Users\hcy53\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 241, in __call__
    return self._get_or_create_cached_value(args, kwargs)
File "C:\Users\hcy53\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 268, in _get_or_create_cached_value
    return self._handle_cache_miss(cache, value_key, func_args, func_kwargs)
File "C:\Users\hcy53\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 324, in _handle_cache_miss
    computed_value = self._info.func(*func_args, **func_kwargs)
File "C:\Users\hcy53\local_multimodal_ai_chat-main\app.py", line 35, in load_chain
    return load_pdf_chat_chain()
File "C:\Users\hcy53\local_multimodal_ai_chat-main\llm_chains.py", line 45, in load_pdf_chat_chain
    return pdfChatChain()
File "C:\Users\hcy53\local_multimodal_ai_chat-main\llm_chains.py", line 55, in __init__
    self.llm_chain = load_retrieval_chain(llm, vector_db)
```
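The error reduces to a function that requires two positional arguments being called with only one. A minimal reproduction with hypothetical stand-in values (not the project's actual objects):

```python
def load_retrieval_chain(llm, vector_db):
    # Stand-in for the real function in llm_chains.py: it needs both arguments.
    return {"llm": llm, "vector_db": vector_db}

# The failing call supplied only the llm, so Python raises:
# TypeError: load_retrieval_chain() missing 1 required positional argument: 'vector_db'
try:
    load_retrieval_chain("some_llm")
except TypeError as exc:
    print(exc)

# Passing both arguments, as the traceback's last frame intends, works:
chain = load_retrieval_chain("some_llm", "some_vector_db")
```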

Leon-Sander commented 7 months ago

Just committed a fix. Please check out the newest version of the code; it should work now.

hcy5561 commented 7 months ago

Thanks, I will try it. I have a couple of questions:

1. I want to run these models faster on my local GPU. Do I need any extra settings?

2. Can I ask questions and get answers in Turkish for a PDF I uploaded that is in English? How?

Thanks. Best regards.

Leon-Sander commented 7 months ago
  1. In the config file, change the number for gpu_layers. With that, both the chat model and the PDF chat run on the GPU. For the local Whisper model used for audio transcription, you would have to set the device to cuda. I am not sure about the llava model, but it is probably very similar.
  2. You could create a prompt template and tell the model to answer in Turkish.
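A minimal sketch of point 2, using a plain format string; the template wording and variable names are illustrative, not the project's actual prompt code:

```python
# Illustrative prompt template (hypothetical wording, not from the repo):
TURKISH_PDF_PROMPT = (
    "Use the following context from the PDF to answer the question.\n"
    "Answer in Turkish, even if the context is in English.\n\n"
    "Context: {context}\n"
    "Question: {question}\n"
    "Answer (in Turkish):"
)

prompt = TURKISH_PDF_PROMPT.format(
    context="The mitochondria is the powerhouse of the cell.",
    question="Mitokondri nedir?",  # "What is the mitochondrion?"
)
print(prompt)
```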
INACSistemas commented 6 months ago

I decided to write to you because I thought your application was really cool. We have tried in various ways to use it in a process, and we are not succeeding. We installed it and everything went fine, but uploading a PDF caused a very long delay; when we uploaded a one-line PDF, it worked. We think there is some detail we can't identify. Another thing we need to know is whether some kind of special machine is needed, with a special GPU or something like that, or whether it will work on a simple cloud VPS. I would really like to use your code, and that's why I ask for your help.

Leon-Sander commented 6 months ago

@INACSistemas generally speaking, the faster the GPU, the faster the results. If you're running PDF chat on the CPU, you might need to wait a long time for a response, since a lot of calculations have to be done. When you tried the one-line PDF, you got your answer quickly because there was much less to process.

I just updated the code with a PDF chat chain that is a little faster. With the current settings in the config.yaml file, when I change gpu_layers to 32, it fits on my RTX 3070, which has 8 GB of VRAM. The PDF chat then answers in about a minute. If you run it on the CPU, you might need to wait 5 minutes, or maybe even longer, depending on the number of CPU cores you have available.
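For reference, the relevant config.yaml knob might look like this (only the gpu_layers key is named in the thread; the surrounding structure is illustrative):

```yaml
# Illustrative config.yaml fragment; only gpu_layers is confirmed above.
chat_model:
  gpu_layers: 32   # 0 = CPU only; 32 fits an 8 GB RTX 3070 per the comment above
```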

You could also reduce the chunk_size of the RecursiveCharacterTextSplitter to 1000. The documents returned from the vector database would then be smaller, which reduces the response time because fewer input tokens are processed.
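The chunk-size trade-off can be sketched with a naive splitter (LangChain's RecursiveCharacterTextSplitter additionally prefers paragraph and sentence boundaries, but the size effect is the same):

```python
def split_text(text: str, chunk_size: int) -> list[str]:
    # Naive fixed-size splitter; a smaller chunk_size means each retrieved
    # document carries fewer characters, and therefore fewer input tokens.
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

doc = "x" * 4096  # stand-in for extracted PDF text

large_chunks = split_text(doc, chunk_size=2048)  # 2 chunks of 2048 chars
small_chunks = split_text(doc, chunk_size=1000)  # 5 chunks, at most 1000 chars each
```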

INACSistemas commented 6 months ago

Let me ask you: if I want another language, how should I proceed?

INACSistemas commented 6 months ago

Sorry for so many questions, but it is necessary. I need to store some PDFs for one project and other PDFs for another project, so that each project has its own access, i.e. it is multi-tenant. If that is not possible, I could direct questions to a specific PDF. Does each PDF have a fixed address? For example, is PDF1.pdf at xxx.com/pdf1/rf2dg554rfsse or something like that? Or will we have to have a dedicated machine for each project?