mediacloud / news-search-api

Internal API server that offers search access to the Media Cloud Online News Archive (in Elasticsearch).
https://mediacloud.org
GNU Affero General Public License v3.0

code cleanup tasks #71

Closed rahulbot closed 1 month ago

rahulbot commented 2 months ago

This code has seen a few different authors at this point, and has been forked and reassessed under changing assumptions. I think we need to revisit this code to clean it up in a few ways:

pgulley commented 1 month ago

Realizing that we compute aggregations for "significant" and "rare" terms as well as for "top". My instinct is that we could excise them, since we never use them on the frontend.

rahulbot commented 1 month ago

Yeah - we included those early to experiment with different results. The "top" terms seem most useful and stable, so perhaps we should just commit to those.
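
For reference, a minimal sketch of what dropping the extra aggregations could look like. It assumes the three kinds map onto Elasticsearch's `terms`, `significant_terms`, and `rare_terms` aggregations; the function name, the `article_title` field, and the default size are hypothetical and not taken from this repo.

```python
from typing import Any


def build_terms_query(
    query: str,
    field: str = "article_title",  # hypothetical field name
    size: int = 100,               # hypothetical default
    kinds: tuple[str, ...] = ("top",),
) -> dict[str, Any]:
    """Build a search body with one aggregation per requested kind."""
    agg_types = {
        "top": "terms",
        "significant": "significant_terms",
        "rare": "rare_terms",
    }
    aggs: dict[str, Any] = {}
    for kind in kinds:
        agg_body: dict[str, Any] = {"field": field}
        if kind != "rare":
            # rare_terms is bounded by max_doc_count, not "size"
            agg_body["size"] = size
        aggs[kind] = {agg_types[kind]: agg_body}
    return {
        "query": {"query_string": {"query": query}},
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "aggs": aggs,
    }


# Default: only the "top" aggregation is requested.
top_only = build_terms_query("climate")

# Roughly the earlier behaviour: all three aggregations in one request.
all_three = build_terms_query("climate", kinds=("top", "significant", "rare"))
```

Defaulting `kinds` to `("top",)` leaves a way to re-enable the experimental aggregations later without carrying them on every request.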

pgulley commented 1 month ago

Also, the api.py file is really bloated. I'm thinking it would be a worthwhile improvement to the developer experience if the FastAPI views and the actual Elasticsearch logic were separated into two files.
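
A rough sketch of the split, kept free of FastAPI and Elasticsearch imports so it runs on its own. All names here (`SearchBackend`, `count_matches`, `search_total_view`, `es_queries.py`, the `news-*` index) are hypothetical. In the real code the view would be a FastAPI route and the backend would be the Elasticsearch client.

```python
from typing import Any, Protocol


class SearchBackend(Protocol):
    """Anything with a search(index, body) method, e.g. an ES client wrapper."""

    def search(self, index: str, body: dict[str, Any]) -> dict[str, Any]: ...


# --- would live in a query module (hypothetical name: es_queries.py) ---
def count_matches(backend: SearchBackend, index: str, query: str) -> int:
    """Build the Elasticsearch body and pull the total out of the response."""
    body = {
        "query": {"query_string": {"query": query}},
        "size": 0,
        "track_total_hits": True,
    }
    response = backend.search(index=index, body=body)
    return response["hits"]["total"]["value"]


# --- would stay in api.py, as a thin view around the query module ---
def search_total_view(backend: SearchBackend, index: str, q: str) -> dict[str, int]:
    """Only request/response shaping lives here; no query DSL."""
    return {"total": count_matches(backend, index, q)}


class FakeBackend:
    """Stand-in backend so the query logic can be exercised without a cluster."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, dict[str, Any]]] = []

    def search(self, index: str, body: dict[str, Any]) -> dict[str, Any]:
        self.calls.append((index, body))
        return {"hits": {"total": {"value": 3, "relation": "eq"}}}


fake = FakeBackend()
result = search_total_view(fake, "news-*", "climate")
```

One benefit of this layout is that the query functions can be tested against a fake backend, as above, without starting the API server.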

pgulley commented 1 month ago

Feeling like this is probably ready to close after the most recent round of polish I did.