astronomer / ask-astro

An end-to-end LLM reference implementation providing a Q&A interface for Airflow and Astronomer
https://ask.astronomer.io/
Apache License 2.0
196 stars 47 forks

Improve & Address Bugs in `test_retrieval`, the Batch Test Question DAG #298

Closed (davidgxue closed this issue 8 months ago)

davidgxue commented 9 months ago

Bug

Describe the bug

To Reproduce

Steps to reproduce the behavior:

  1. Configure the environment variables required by the `test_retrieval` DAG
  2. Trigger the DAG
  3. Enter a list of question ids (a subset of the full set) in the parameter prompt, such as [1,2,3]
  4. The DAG run fails with an error

Expected behavior

No errors

Improvements

  1. The references saved in the CSV are in a random, incorrect order. This is probably because they are put into a set (using {}) somewhere, and a set does not preserve insertion order.
  2. The multi-query references and the Weaviate search references are not relevant. They provide no useful information, but they delay the pipeline and incur cost.
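
If the ordering problem in improvement 1 does come from a set, one possible fix is to deduplicate with a dict instead, since Python dicts keep insertion order (guaranteed since 3.7). The sketch below is only an illustration: `dedupe_preserving_order` and the example URLs are hypothetical and are not taken from the ask-astro codebase.

```python
def dedupe_preserving_order(references):
    """Remove duplicate references while keeping first-seen order.

    Hypothetical helper; not part of the ask-astro code.
    """
    # dict.fromkeys keeps the first occurrence of each key in insertion
    # order, whereas set() makes no ordering guarantee for strings.
    return list(dict.fromkeys(references))


# Placeholder references, in the order a retriever might return them.
refs = [
    "https://example.com/b",
    "https://example.com/a",
    "https://example.com/b",  # duplicate of the first entry
    "https://example.com/c",
]

ordered = dedupe_preserving_order(refs)
```

With this approach the references written to the CSV would stay in retrieval order, with each duplicate dropped after its first occurrence.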