thisserand / llama2_local

How can I use a Llama 2 QA model on CPU? #10

Open vdeeplearn opened 1 year ago

vdeeplearn commented 1 year ago

```python
from langchain.llms import CTransformers

# Load the quantized GGML model for CPU inference via ctransformers
llm = CTransformers(
    model='./llama-2-7b-chat.ggmlv3.q4_K_M.bin',
    model_type='llama',
    config={'max_new_tokens': 4096, 'temperature': 0.0},
)
```

```python
from langchain import PromptTemplate, LLMChain

template = """Answer the question based on the contexts below. If the question cannot be answered using the information provided, answer with "I don't know".

Contexts:
{text}

Question: Which role does the candidate suit: 1. Telecaller, 2. Medical Coding, 3. Software Developer, or 4. Data Entry?

Answer: """
print(template)

prompt = PromptTemplate(template=template, input_variables=["text"])
llm_chain = LLMChain(prompt=prompt, llm=llm)
print("Running...")
# the contexts string is truncated in the original post
text = """
```
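The snippet cuts off before the chain is actually run. A minimal continuation, assuming `text` holds the candidate's profile as a plain string (the real contexts are truncated in the original post), might look like:

```python
# Hypothetical stand-in for the truncated contexts string above
text = """The candidate has two years of experience in outbound calling
and customer support, and is fluent in English and Hindi."""

# Run the chain; CPU inference with a 7B q4 model is slow but workable
answer = llm_chain.run(text)
print(answer)
```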

Could you give some examples of this type of QA?
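For a CPU-only example without LangChain, a minimal sketch using the `ctransformers` package directly might look like the following (the model path, context string, and generation settings are illustrative assumptions, not values from this repo):

```python
from ctransformers import AutoModelForCausalLM

# ctransformers runs GGML models on CPU by default; no GPU is required
llm = AutoModelForCausalLM.from_pretrained(
    './llama-2-7b-chat.ggmlv3.q4_K_M.bin',  # assumed local model path
    model_type='llama',
)

# Illustrative context; replace with your own document text
context = (
    "The candidate has five years of experience in medical billing "
    "and ICD-10 coding."
)

prompt = (
    "Answer the question based on the context below. If the question "
    "cannot be answered using the information provided, answer with "
    '"I don\'t know".\n\n'
    f"Context: {context}\n"
    "Question: Which role does the candidate suit?\n"
    "Answer:"
)

# Low temperature keeps the classification-style answer deterministic
print(llm(prompt, max_new_tokens=64, temperature=0.0))
```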