Open zono50 opened 2 weeks ago
It might be related to the VRAM. I would suggest trying the 2B Qwen version. You will just need to make a small change for that to work. In this section, change Qwen/Qwen2-VL-7B-Instruct to Qwen/Qwen2-VL-2B-Instruct. There are two places where you have to make the change.
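For illustration, the swap could look roughly like the sketch below. It assumes the loader uses the Hugging Face transformers classes for Qwen2-VL; the constant QWEN_MODEL_ID and the function load_qwen are hypothetical names, not the project's actual code.

```python
# Hypothetical constant: keeping the model ID in one place means the
# 7B -> 2B swap only has to be made once on the Python side.
QWEN_MODEL_ID = "Qwen/Qwen2-VL-2B-Instruct"  # was "Qwen/Qwen2-VL-7B-Instruct"


def load_qwen(model_id: str = QWEN_MODEL_ID):
    """Load the Qwen2-VL model and its processor (assumed loader shape)."""
    # Imported lazily so the module can be imported without the heavy deps.
    import torch
    from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

    model = Qwen2VLForConditionalGeneration.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # half precision to reduce VRAM use
    )
    processor = AutoProcessor.from_pretrained(model_id)
    return model, processor
```

Calling load_qwen() downloads the weights on first use, so the first call can take a while.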
I'm also running out of VRAM simply trying to index files. Is there a workaround for that? I've closed other programs and see I have about 10 GB of free VRAM, but if I try to index more than one file, I run out of VRAM as well. I am currently downloading the 2B Qwen model and will let you know how it goes as soon as it's finished.
I changed it everywhere I could see, but I'm still getting this error in the terminal - 2024-10-08 11:51:21,622 - ERROR - main - RAG model not found for session 375a37bd-f329-4d82-9bcf-b0ccb534c2ca
I changed it in both places, in model_loader.py and settings.html. I even went to the GitHub repo and copied and pasted your model_loader.py file in case I missed something, and got the same error. It happens in another browser and in incognito mode as well.
EDIT - Reinstalled it and now it seems to be working. Are there any plans for mass indexing? Or is the intended workflow to upload a few files at a time and then navigate from there?
I also seem to be having session issues. If I create a 2nd session and try to name it, it says "Error naming session, session not found".
When I monitor my GPU I'm seeing no activity at all, but get this same error.
@gschleusner can you confirm whether that model is actually being loaded on the GPU? You can add the device to the logs here, then check app.log for the output. It might be that the environment is not seeing your GPU for some reason.
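A minimal sketch of what logging the device could look like, assuming the app uses PyTorch and the standard logging module. The logger name and the helper describe_device are hypothetical and not part of the project.

```python
import logging

logger = logging.getLogger("model_loader")  # hypothetical logger name


def describe_device() -> str:
    """Return a short description of the device PyTorch can see."""
    try:
        import torch
    except ImportError:
        return "torch not installed"
    if torch.cuda.is_available():
        idx = torch.cuda.current_device()
        # mem_get_info returns (free_bytes, total_bytes) for the device.
        free, total = torch.cuda.mem_get_info(idx)
        name = torch.cuda.get_device_name(idx)
        return (f"cuda:{idx} ({name}), "
                f"{free / 2**30:.1f} GiB free of {total / 2**30:.1f} GiB")
    return "cpu (CUDA not visible to PyTorch)"


logger.info("Model device: %s", describe_device())
```

If the log line reports cpu, the environment is likely not exposing the GPU to PyTorch, which would match seeing no GPU activity.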
@zono50 thanks for reporting the bugs. I will look at the renaming issue. We can potentially implement an API for indexing a large number of documents.
I couldn't get the Qwen 2B model to work. Could you include it as one of the models that comes with your program?
First of all, well done; secondly, in addition to the renaming I encountered an issue with the delete session - clicking the button doesn't do anything.
One question about this: should I download the model myself, or will byaldi download it automatically?
@metantonio byaldi will automatically download that for you.
Yeah, I have 12 GB of VRAM and am trying to use the 7B Qwen vision model, and it keeps giving me this error message: "An error occurred while generating the response: You can't move a model that has some modules offloaded to cpu or disk". I never see my VRAM go over 35%, so I'm not sure whether it's doing an estimated calculation and then refusing, or whether something else is going on here.
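For context, that message is raised by the accelerate library when .to() is called on a model that was loaded with part of its weights offloaded to CPU or disk. Whether that is what happens in this app is an assumption. The sketch below shows one way to guard against it; is_offloaded and safe_to are hypothetical helpers, and they rely on the hf_device_map attribute that transformers sets when a device map is used.

```python
def is_offloaded(model) -> bool:
    """True if any module of the model was placed on CPU or disk."""
    # hf_device_map maps module names to devices (e.g. 0, "cpu", "disk").
    # It is absent when the model was loaded without a device map.
    device_map = getattr(model, "hf_device_map", None) or {}
    return any(str(device) in ("cpu", "disk") for device in device_map.values())


def safe_to(model, device):
    """Move the model to `device` unless it has offloaded modules."""
    if is_offloaded(model):
        # Offloaded models are already dispatched; calling .to() on them
        # is what triggers the error, so leave the model where it is.
        return model
    return model.to(device)
```

If is_offloaded returns True for the 7B model, the weights did not fit in the available VRAM at load time, and a smaller model may be the simpler fix.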