Leon-Sander / Local-Multimodal-AI-Chat

GNU General Public License v3.0

pdf load time is slow #28

Closed ItsIgnis closed 1 month ago

ItsIgnis commented 4 months ago

When I try to load a book in PDF format (around 6.5 MB), it still shows "processing" after more than 90 minutes of waiting. Can you please help me speed up PDF loading?

Paramjethwa commented 1 month ago

Yes, I am having this issue too. I uploaded a 1.5 MB PDF file, and after 10 minutes it is still at 10% processing. If anyone has a solution, please share. I have a 4050 GPU and a latest-generation Intel CPU.

Leon-Sander commented 1 month ago

That's really weird. Are you on Windows? I am on Linux, and a 5 MB PDF file takes 2-5 seconds to process.

I am updating the code in the coming days to be based on Ollama, which should improve the speed a lot.

JTMarsh556 commented 1 month ago

It has been a while since I used this application, so I may be remembering this incorrectly, but IIRC it has to do with not having PyTorch installed. Many embedding options are significantly faster with GPU support, and IIRC, for me, that required having PyTorch installed in that environment. Use the selector at the top of the page below to match your setup, then run the command it gives you. Again, it has been a while, but I hope this helps.

https://pytorch.org/get-started/locally/
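A quick way to test this theory is to ask PyTorch directly whether it is installed in the active environment and whether it can see a GPU. The sketch below is only a diagnostic, not part of the project's code; the helper name `torch_gpu_status` is made up for illustration. It uses `torch.version.cuda` and `torch.cuda.is_available()`, and falls back gracefully when PyTorch is missing.

```python
import importlib.util


def torch_gpu_status():
    """Report whether PyTorch is importable and whether it can see a CUDA GPU.

    Hypothetical helper for diagnosis only; not part of the project.
    """
    status = {
        "torch_installed": False,
        "cuda_build": None,
        "cuda_available": False,
        "device": None,
    }
    # Check for the package without importing it, so a missing install
    # does not raise ImportError.
    if importlib.util.find_spec("torch") is None:
        return status

    import torch

    status["torch_installed"] = True
    # None when the installed wheel was built without CUDA support.
    status["cuda_build"] = torch.version.cuda
    status["cuda_available"] = torch.cuda.is_available()
    if status["cuda_available"]:
        status["device"] = torch.cuda.get_device_name(0)
    return status


print(torch_gpu_status())
```

If `torch_installed` is true but `cuda_build` is `None`, the environment most likely has a CPU-only build, and reinstalling with the command from the page above would be the next thing to try.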

Thank you Leon, I learned a lot from you.

JTMarsh556 commented 1 month ago

> That's really weird. Are you on Windows? I am on Linux, and a 5 MB PDF file takes 2-5 seconds to process.
>
> I am updating the code in the coming days to be based on Ollama, which should improve the speed a lot.

I think that will improve speed for response generation, but I believe my ingestion issue came from the GPU not being engaged while embedding the documents. I think that was PyTorch-related. I know it was on a few things I built, and I think that was the case here as well, but I could be wrong. It is all a bit of a blur now.

Paramjethwa commented 1 month ago

> > That's really weird. Are you on Windows? I am on Linux, and a 5 MB PDF file takes 2-5 seconds to process. I am updating the code in the coming days to be based on Ollama, which should improve the speed a lot.
>
> I think that will improve speed for response generation, but I believe my ingestion issue came from the GPU not being engaged while embedding the documents. I think that was PyTorch-related. I know it was on a few things I built, and I think that was the case here as well, but I could be wrong. It is all a bit of a blur now.

I have installed this, and it says "requirement already satisfied", but still no luck with GPU use. Even in Task Manager's Performance tab, GPU usage stays at 0 percent while the project is running.

Leon-Sander commented 1 month ago

The old version of the code, which is now on the "ctransformers" branch, did not use the GPU for embedding documents, only for chatting with them if you had GPU enabled.

The new version on the main branch works with the Ollama API. If you have GPU enabled, embeddings will also be created on the GPU.
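To check embedding speed outside the app, you can send a few chunks straight to a local Ollama server. This is a minimal sketch, not the project's actual code: it assumes Ollama is running on its default address `http://localhost:11434`, that the `/api/embed` endpoint is available in your Ollama version, and that an embedding model has been pulled (`nomic-embed-text` here is only an example name). The helper names are made up for illustration.

```python
import json
import urllib.request

OLLAMA_HOST = "http://localhost:11434"  # Ollama's default local address
EMBED_MODEL = "nomic-embed-text"        # assumption: any embedding model you have pulled


def build_embed_request(texts, model=EMBED_MODEL, host=OLLAMA_HOST):
    """Build (but do not send) a POST request for Ollama's /api/embed endpoint."""
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        f"{host}/api/embed",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts, model=EMBED_MODEL, host=OLLAMA_HOST, timeout=120):
    """Send the request and return one embedding vector per input text."""
    request = build_embed_request(texts, model, host)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["embeddings"]


# Example usage (requires a running Ollama server):
#   import time
#   start = time.perf_counter()
#   vectors = embed(["first chunk of the PDF", "second chunk of the PDF"])
#   print(len(vectors), "vectors in", round(time.perf_counter() - start, 2), "s")
```

Timing a handful of chunks this way while watching GPU usage should show whether the embedding step itself is the slow part.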

Keep in mind that a large PDF file with a lot of text will still take time to process. Also, Windows seems to be slower than Linux in general.
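As a rough illustration of why the amount of text matters more than the file size in MB: the number of embedding calls grows with the number of text chunks. The sketch below estimates the chunk count for simple fixed-size chunking with overlap. The chunk size and overlap are illustrative assumptions, not the project's actual settings, and real text splitters may cut at different boundaries.

```python
import math


def estimate_chunk_count(text_length, chunk_size=1000, chunk_overlap=100):
    """Estimate how many fixed-size chunks a text of text_length characters yields.

    chunk_size and chunk_overlap are illustrative defaults, not the
    project's configuration.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    if text_length <= 0:
        return 0
    if text_length <= chunk_size:
        return 1
    # Each chunk after the first advances by (chunk_size - chunk_overlap) characters.
    step = chunk_size - chunk_overlap
    return 1 + math.ceil((text_length - chunk_size) / step)


# A book with about 500,000 characters of extracted text:
print(estimate_chunk_count(500_000))  # 556
```

Every one of those chunks has to be embedded, so a text-heavy book produces far more work than a short, image-heavy PDF of the same size.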