It is ~6x cheaper to use GPT-4 Turbo (the newer model with a 128k context window) than gpt-4-32k, and it is also more performant.
Lowered the model temperature as well, since a high temperature does not make sense here: we do not need extra entropy or "creativity" in the responses for RAG apps.
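A minimal sketch of the change (the parameter names follow the OpenAI chat-completions API; the exact values and call site are assumptions, the real diff is in this PR):

```python
# Hypothetical request kwargs for the app's chat-completions call.
# Old call (assumed): the 32k-context model with the default sampling temperature.
old_kwargs = {"model": "gpt-4-32k", "temperature": 1.0}

# New call: GPT-4 Turbo (128k context, cheaper per token) with temperature 0,
# since RAG answers should stick to the retrieved context deterministically.
new_kwargs = {"model": "gpt-4-turbo", "temperature": 0.0}

# Used roughly as: client.chat.completions.create(messages=..., **new_kwargs)
```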
Description
closes #314