Went through the retrieval QA demo using multiple LLMs, including:
"HuggingFaceH4/zephyr-7b-beta"
"meta-llama/Llama-2-7b-chat-hf"
"mistralai/Mistral-7B-v0.1"
"databricks/dolly-v2-3b"
Upgraded langchain to overcome this bug:
Inference initialization failed: The model has been loaded with accelerate and therefore cannot be moved to a specific device. Please discard the device argument when creating your pipeline object.
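For reference, the error above comes from the transformers pipeline itself: when a model is loaded with accelerate (e.g. `device_map="auto"`), it is already dispatched across devices, so `pipeline(...)` must not receive a `device=` argument. A minimal sketch of the working pattern (the tiny test model here is illustrative, not one of the demo models; `device_map="auto"` assumes accelerate is installed):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Illustrative small model; in the demo this would be e.g.
# "databricks/dolly-v2-3b" or "HuggingFaceH4/zephyr-7b-beta".
model_id = "hf-internal-testing/tiny-random-gpt2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Loading with device_map="auto" hands device placement to accelerate.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Correct: no device= argument. Passing device=0 here would raise the
# "model has been loaded with accelerate" error quoted above.
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

out = pipe("Hello", max_new_tokens=5)
```

Older langchain releases passed a `device` argument internally when building the pipeline, which is why upgrading resolves it on the langchain side.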