PromtEngineer / localGPT

Chat with your documents on your local device using GPT models. No data leaves your device, and it is 100% private.
Apache License 2.0

Blank answer to every question with assistant model #371

Open akupka opened 1 year ago

akupka commented 1 year ago

Hello,

localGPT is working for me with chat GPTQ models, but we have a lot of German PDF files, so we need, for example, this model: https://huggingface.co/TheBloke/Llama-2-13B-German-Assistant-v4-GPTQ. It is an assistant model with this type of prompt:

Prompt template: User-Assistant-Hashes

```
### User: {prompt}
### Assistant:
```

This model works with the example code from its model card. Any idea how to change the prompt template in run_localGPT.py?
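For reference, here is a minimal sketch of how such a template could be wired up with langchain's `PromptTemplate`; the variable names and the German instruction below are illustrative, not the repository's actual code:

```python
# Sketch only: a User-Assistant-Hashes style template for langchain.
# The input variables mirror what a retrieval chain typically fills in.
from langchain.prompts import PromptTemplate

template = (
    # Hypothetical German instruction ("Answer the question using the context below."):
    "Beantworte die Frage anhand des folgenden Kontexts.\n\n"
    "{context}\n\n"
    "### User: {question}\n"
    "### Assistant:"
)

prompt = PromptTemplate(input_variables=["context", "question"], template=template)
```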

PromtEngineer commented 1 year ago

@akupka try to see which source documents are being returned. That will help you debug whether the issue is coming from the embedding-based retrieval or from the LLM.
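Something like the sketch below, assuming `qa` is the RetrievalQA chain that run_localGPT.py builds (illustrative, not the exact repository code):

```python
def show_retrieval(qa, query: str) -> None:
    """Print the answer and the retrieved chunks for one query.

    Assumes `qa` is the RetrievalQA chain from run_localGPT.py, so calling
    it returns a dict with "result" and "source_documents" keys.
    """
    res = qa(query)
    print(repr(res["result"]))  # an empty string here points at the LLM, not retrieval
    for doc in res["source_documents"]:
        print(doc.metadata["source"])    # which file the chunk came from
        print(doc.page_content[:300])    # the retrieved chunk itself
```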

akupka commented 1 year ago

OK, I printed the `res`:

```
Enter a query: was ist eine AFO

res {'query': 'was ist eine AFO', 'result': '', 'source_documents': [Document(page_content='Zur Erläuterung der Afo [A_15208-01]: \n\nDer Anbieter muss die Anzahl seiner Nutzer kennen und sein System mindestens so \ndimensionieren, dass die Lastvorgaben eingehalten werden. \nBeispielrechnung: Für 12,57 Mio. Nutzer (etwa 17,95% Marktanteil) muss für die \nOperation "I_Authentication_Insurant:login" eine Lastvorgabe von mindestens 11 \nAnfragen pro Sekunde eingehalten werden (17,95% von 60 Anfragen pro Sekunde). \n\nTabelle 66: Tab_ePA_Aktensystem - Last- und Bearbeitungszeitvorgaben-01 \n\nSchnittstellenoperationen \n\nLastvorgaben \n\nBearbeitungszeitvorgaben \n\nSpitzenlast \n[1/sec] \n\nMittelwert \n[msec] \n\n99%-Quantil \n[msec] \n\nI_Authentication_Insurant \n\n login \n\nI_Authorization \n\n getAuthorizationKey \n\nI_Authorization_Management \n\n putAuthorizationKey \n\n checkRecordExists \n\nI_Document_Management_Connect \n\n openContext \n\nI_Document_Management \n\n60 \n\n100 \n\n25 \n\n25 \n\n100 \n\n755 \n\n770 \n\n520 \n\n100 \n\n100 \n\n960 \n\n980 \n\n690 \n\n180 \n\n180 \n\ngemSpec_Perf_V2.25.0 \nVersion: 2.25.0', metadata={'source': '/home/andree/localGPT/SOURCE_DOCUMENTS/gemSpec_Perf_V2.25.0.pdf'})]}

Question: was ist eine AFO

Answer:

Enter a query:
```

You can see the result is empty, but the system seems to have found something in the documents, which is OK.

alexfilothodoros commented 1 year ago

Hi,

I have just found this model and I am trying to use it. What did you define as `model_basename`?

akupka commented 1 year ago

> Hi,
>
> I have just found this model and I am trying to use it. What did you define as `model_basename`?

Sorry for the late answer. I use `model.safetensors` as the basename.

Dafterfly commented 1 year ago

I also started to have this issue recently. I haven't solved it yet, but it might be a langchain issue: https://github.com/langchain-ai/langchain/issues/11015

Dafterfly commented 1 year ago

I also had this problem; the issue might be with llama-cpp and langchain. After I ran `pip install --upgrade llama-cpp-python` and then `pip install --upgrade langchain`, in that order, I was able to get an answer, but not consistently. Sometimes I get an empty answer and sometimes I get an answer of just '?'.

Dafterfly commented 1 year ago

I think I might have figured out an answer after trying to get it to work, though with an orca model I had. Although the model was a llama model, it didn't properly support the `<<SYS>>` and `[INST]` tokens. I replaced line 129 (https://github.com/PromtEngineer/localGPT/blob/279dfbb45c09b659c87d7c642699b6ab9f483831/run_localGPT.py#L129) with:

```python
prompt, memory = get_prompt_template(promptTemplate_type="other", history=use_history)
```

Maybe we can make this configurable in constants.py.

Also, the system_prompt in https://github.com/PromtEngineer/localGPT/blob/279dfbb45c09b659c87d7c642699b6ab9f483831/prompt_template_utils.py#L12 is in English. I'm not sure how well it'll work with a German model, but you could also try translating it into German.
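For example, a hypothetical German version might look like this (the wording is an illustrative translation, not the repository's actual string):

```python
# Hypothetical German system prompt for prompt_template_utils.py.
# Rough meaning: "Use the following context to answer the user's question.
# If you don't know the answer, just say that you don't know."
system_prompt = (
    "Nutze den folgenden Kontext, um die Frage des Nutzers zu beantworten. "
    "Wenn du die Antwort nicht kennst, sage einfach, dass du sie nicht weißt."
)
```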

jakiro2017 commented 1 year ago

> I think I might have figured out an answer after trying to get it to work, though with an orca model I had. Although the model was a llama model, it didn't properly support the `<<SYS>>` and `[INST]` tokens. I replaced line 129 (https://github.com/PromtEngineer/localGPT/blob/279dfbb45c09b659c87d7c642699b6ab9f483831/run_localGPT.py#L129) with `prompt, memory = get_prompt_template(promptTemplate_type="other", history=use_history)`.
>
> Maybe we can make this configurable in constants.py.
>
> Also, the system_prompt in https://github.com/PromtEngineer/localGPT/blob/279dfbb45c09b659c87d7c642699b6ab9f483831/prompt_template_utils.py#L12 is in English. I'm not sure how well it'll work with a German model, but you could also try translating it into German.

I needed to edit get_prompt_template to add support for vicuna for it to work.
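Something along these lines, perhaps; a sketch of a "vicuna" branch under the assumption that get_prompt_template builds a langchain `PromptTemplate` per model family (the function below is illustrative, not the repository's code):

```python
# Sketch of a hypothetical "vicuna" prompt builder. Vicuna-style models
# expect plain "USER:" / "ASSISTANT:" turns instead of [INST]/<<SYS>> tokens.
from langchain.prompts import PromptTemplate

def make_vicuna_prompt(system_prompt: str, history: bool = False) -> PromptTemplate:
    if history:
        template = (
            system_prompt
            + "\n\n{history}\n\nContext: {context}\nUSER: {question}\nASSISTANT:"
        )
        variables = ["history", "context", "question"]
    else:
        template = system_prompt + "\n\nContext: {context}\nUSER: {question}\nASSISTANT:"
        variables = ["context", "question"]
    return PromptTemplate(input_variables=variables, template=template)
```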

akup06 commented 1 year ago

Hello, it is not necessary to change the code; there is a start parameter: `python ./run_localGPT.py --model_type non_llama` for a model like TheBloke/Llama-2-13B-German-Assistant-v4-GGUF.