astronomer / ask-astro

An end-to-end LLM reference implementation providing a Q&A interface for Airflow and Astronomer
https://ask.astronomer.io/
Apache License 2.0
192 stars 47 forks

extract_astro_blogs should check for date before extract #116

Closed (mpgreg closed 9 months ago)

mpgreg commented 10 months ago

https://github.com/astronomer/ask-astro/blob/c45487c7f12a9424dbe885580c687e35e30b7de4/airflow/include/tasks/extract/blogs.py#L37-L53

extract_astro_blogs() currently reads every blog post and only then drops the ones older than blog_cutoff_date, which is inefficient. Instead, it should check each post's date before extracting it, so that posts older than the cutoff are never fetched.
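
One possible shape for the fix, as a minimal sketch rather than the actual ask-astro code: it assumes a cheap index of (URL, publish date) pairs is available before any post body is downloaded, for example from a listing page or sitemap. The names `extract_recent_blogs`, `index`, `fetch`, and the example URLs are hypothetical; only `blog_cutoff_date` comes from the issue.

```python
from __future__ import annotations

from datetime import datetime
from typing import Callable, Iterable


def extract_recent_blogs(
    index: Iterable[tuple[str, datetime]],
    blog_cutoff_date: datetime,
    fetch: Callable[[str], str],
) -> list[dict]:
    """Fetch only the posts published on or after blog_cutoff_date.

    `index` is a hypothetical iterable of (url, published) pairs that is
    assumed to be cheap to obtain; `fetch` is whatever downloads a post body.
    """
    docs = []
    for url, published in index:
        # Check the date first, so old posts never trigger a download.
        if published < blog_cutoff_date:
            continue
        docs.append({"url": url, "published": published, "content": fetch(url)})
    return docs


# Usage with a stand-in fetcher that records which URLs were requested.
fetched_urls: list[str] = []


def fake_fetch(url: str) -> str:
    fetched_urls.append(url)
    return f"body of {url}"


index = [
    ("https://example.com/blog/old-post", datetime(2022, 6, 1)),
    ("https://example.com/blog/new-post", datetime(2023, 3, 15)),
]
recent = extract_recent_blogs(index, datetime(2023, 1, 1), fake_fetch)
```

With this ordering, the cost of a run scales with the number of posts newer than the cutoff rather than with the total number of posts, provided the dates really can be read without fetching each post.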