PromtEngineer / localGPT-Vision

Chat with your documents using Vision Language Models. This repo implements an end-to-end RAG pipeline with both local and proprietary VLMs.

An error occurred while generating the response: You can't move a model that has some modules offloaded to cpu or disk. #1

Status: Open. zono50 opened this issue 2 weeks ago

zono50 commented 2 weeks ago

Yeah, I have 12 GB of VRAM and I'm trying to use the 7B Qwen vision model, but it keeps giving me this error message: "An error occurred while generating the response: You can't move a model that has some modules offloaded to cpu or disk". My VRAM never goes above 35%, so I'm not sure if it's doing an estimated calculation and then refusing, or if something else is going on here.
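For context, this particular message is typically raised by Hugging Face Accelerate when `.to(device)` is called on a model that was loaded with `device_map="auto"` and had some layers offloaded to CPU or disk. A minimal, illustrative repro under that assumption (not code from this repo):

```python
# Illustrative repro of the Accelerate error on a GPU with limited VRAM.
import torch
from transformers import Qwen2VLForConditionalGeneration

model = Qwen2VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2-VL-7B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",  # with ~12 GB VRAM, Accelerate offloads some layers to CPU
)

# Calling .to("cuda") on the dispatched model then raises:
# "You can't move a model that has some modules offloaded to cpu or disk."
model.to("cuda")
```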

PromtEngineer commented 2 weeks ago

It might be related to the VRAM. I would suggest trying the 2B Qwen version. You will just need to make a small change for that to work: in this section, change Qwen/Qwen2-VL-7B-Instruct to Qwen/Qwen2-VL-2B-Instruct. There are two places where you have to make the change.
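A minimal sketch of what that edit could look like, assuming the model is loaded via transformers' Qwen2VLForConditionalGeneration; the actual structure of model_loader.py may differ:

```python
# Sketch only; the real model_loader.py may be organized differently.
import torch
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

# Was: "Qwen/Qwen2-VL-7B-Instruct"
MODEL_ID = "Qwen/Qwen2-VL-2B-Instruct"

model = Qwen2VLForConditionalGeneration.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
# The processor typically references the same model ID, which is likely
# the second place the string needs to change.
processor = AutoProcessor.from_pretrained(MODEL_ID)
```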

zono50 commented 2 weeks ago

I'm also running out of VRAM simply trying to index files. Is there a workaround for that? I've closed other programs and can see I have about 10 GB of free VRAM, but if I try to index more than one file, I run out of VRAM as well. I am currently downloading the 2B Qwen model and will let you know how it goes as soon as it's finished.

I changed it everywhere I could see, but I'm still getting this error in the terminal: 2024-10-08 11:51:21,622 - ERROR - main - RAG model not found for session 375a37bd-f329-4d82-9bcf-b0ccb534c2ca

I changed it in both places, in model_loader.py and settings.html. I even went to the GitHub repo and copied and pasted your model_loader.py file just in case I missed something; same error. It happens in another browser and in incognito mode as well.

EDIT - Reinstalled it and now it seems to be working. Are there any plans for mass indexing, or is the intended workflow to upload a few files at a time and navigate from there?

zono50 commented 2 weeks ago

I also seem to be having session issues: if I create a second session and try to name it, it says "Error naming session, session not found".

gschleusner commented 2 weeks ago

When I monitor my GPU I see no activity at all, but I get this same error.

PromtEngineer commented 1 week ago

@gschleusner can you confirm whether the model is actually being loaded on the GPU? You can add the device to the logs here and then check app.log for the output. It might be that the environment is not seeing your GPU for some reason.
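A hedged example of the kind of device logging that would confirm this; the names are illustrative and `model` stands in for whatever the loader returns, not the repo's exact code:

```python
# Illustrative device-check logging.
import logging
import torch

logger = logging.getLogger(__name__)

logger.info("CUDA available: %s", torch.cuda.is_available())
logger.info("Model device: %s", next(model.parameters()).device)
# If Accelerate dispatched the model, this shows which layers went to cpu/disk.
if hasattr(model, "hf_device_map"):
    logger.info("Device map: %s", model.hf_device_map)
```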

PromtEngineer commented 1 week ago

@zono50 thanks for reporting the bugs. I will look at the renaming issue. We can potentially implement an API for indexing a large number of documents.
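As a rough illustration of what bulk indexing could look like with byaldi, which accepts a folder as input; the checkpoint name and paths below are placeholders, not code from this repo:

```python
# Hypothetical bulk-indexing sketch using byaldi; not localGPT-Vision's code.
from byaldi import RAGMultiModalModel

RAG = RAGMultiModalModel.from_pretrained("vidore/colpali")  # placeholder checkpoint

# Pointing index() at a directory indexes every supported document inside it.
RAG.index(
    input_path="docs/",               # placeholder folder containing many PDFs
    index_name="bulk_index",
    store_collection_with_index=True,
    overwrite=True,
)
```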

zono50 commented 1 week ago

I couldn't seem to get the Qwen 2B model to work. Could you implement it as one of the models that comes with your program?

ioannis-papadimitriou commented 1 week ago

> @zono50 thanks for reporting the bugs. I will look at the renaming issue. We can potentially implement an API for indexing a large number of documents.

First of all, well done. Secondly, in addition to the renaming issue, I ran into a problem with deleting a session: clicking the button doesn't do anything.

metantonio commented 1 week ago

One question about this: should I download the model myself, or will byaldi download it automatically?

PromtEngineer commented 1 week ago

@metantonio byaldi will automatically download that for you.
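In other words, the first call that loads the retrieval model fetches the weights from the Hugging Face Hub and caches them locally; a minimal sketch, with a placeholder checkpoint name:

```python
from byaldi import RAGMultiModalModel

# Downloads the checkpoint from the Hugging Face Hub on first use and caches it
# (typically under ~/.cache/huggingface); later runs reuse the cached copy.
RAG = RAGMultiModalModel.from_pretrained("vidore/colpali")  # placeholder checkpoint
```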