Problem
Our front-end work for handling LLM topics/tags is currently blocked on updating the back end's data model to support the new field. This PR updates our Bill model to add these fields and unblock the front-end work. It doesn't cover actually scraping the new tags for all bills; that is currently semi-blocked on getting the Python dev server set up.
Summary
- Adding `topics` to bills to prep for LLM data
  - `topics` is an optional field for now. This is partly because the existing dataset doesn't have the tags yet (which should be remedied relatively soon), and partly to cover ourselves if there's a gap between a bill being scraped and its tags being generated by the LLM code. It's currently unclear how long topic/tag generation will lag behind scraping; depending on an API choice on the LLM side, we may be able to make this field mandatory after the initial backfill.
- Hard-coding the categories and the category -> topic map, because that mapping is frozen as a product decision
  - For context, we have decided on a fixed set of topics that the ML code will assign to bills, and we have bucketed those topics into a manually generated, fixed list of topic categories. We've previously called these tags, but I think `topic` is a more precise term for use in the data models.
- Updating search indexing for bills to hydrate topic categories and format the bill topics so the front end's Instantsearch hierarchical menu widget can use them
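For reviewers who want a concrete picture of the three changes above, here is a minimal sketch rather than the code in this PR. Every name in it (`CATEGORY_TOPICS`, `Bill`, `BillSearchRecord`, `toSearchRecord`, the `topics.lvl0`/`topics.lvl1` attribute names) and the placeholder categories and topics are assumptions made for illustration. It also assumes that a bill stores only topic names and that unknown topics are skipped at index time. The `lvl0`/`lvl1` layout with a `" > "` separator is the default shape the InstantSearch hierarchical menu reads.

```typescript
// Illustrative sketch only: all names and values here are hypothetical
// placeholders, not the real topic list or the real Bill model.

// Fixed category -> topics map, hard-coded because it is a product decision.
const CATEGORY_TOPICS: Record<string, readonly string[]> = {
  "Category A": ["Topic 1", "Topic 2"],
  "Category B": ["Topic 3"],
}

// Reverse lookup so indexing can hydrate a category from a topic name.
const TOPIC_TO_CATEGORY = new Map<string, string>()
for (const category of Object.keys(CATEGORY_TOPICS)) {
  for (const topic of CATEGORY_TOPICS[category]) {
    TOPIC_TO_CATEGORY.set(topic, category)
  }
}

interface Bill {
  id: string
  // Optional: older bills, and bills the LLM has not processed yet, have none.
  topics?: string[]
}

interface BillSearchRecord {
  id: string
  "topics.lvl0": string[] // category names
  "topics.lvl1": string[] // "Category > Topic" paths
}

// Builds the search record for a bill. Topics missing from the fixed map
// are skipped (an assumption of this sketch, not a documented behavior).
function toSearchRecord(bill: Bill): BillSearchRecord {
  const categories = new Set<string>()
  const paths: string[] = []
  for (const topic of bill.topics ?? []) {
    const category = TOPIC_TO_CATEGORY.get(topic)
    if (category === undefined) continue
    categories.add(category)
    paths.push(`${category} > ${topic}`)
  }
  return {
    id: bill.id,
    "topics.lvl0": Array.from(categories),
    "topics.lvl1": paths,
  }
}
```

A bill without `topics` produces empty arrays for both levels, so untagged bills can still be indexed while the backfill is pending.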
Checklist
Screenshots
N/A
Known issues
N/A
Steps to test/reproduce