codeforboston / maple

MAPLE makes it easy for anyone to view and submit testimony to the Massachusetts Legislature about the bills that will shape our future.
https://mapletestimony.org
MIT License

WIP - MAPLE MGL API #1549

Open Mephistic opened 3 weeks ago

Mephistic commented 3 weeks ago

Quick writeup based on our hack night discussion on 4/23. This is mostly a braindump and needs more refinement before we're ready to go.

Problem

Our ML experts want to reference the Massachusetts General Laws (MGL) to hydrate data for our upcoming LLM-driven bill summaries. If a bill references any sections of the MGL, those sections need to be fetched and included in the prompt that generates the summary, so that we aren't missing critical context.
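To make the "bill references sections of the MGL" step concrete, here is a minimal sketch of pulling chapter/section references out of bill text. The citation phrasing it matches ("section 3 of chapter 23A") and the names `MGL_REF` and `find_mgl_references` are assumptions for illustration, not the ML team's actual code; real bills likely use other phrasings that this pattern would miss.

```python
import re

# Hypothetical pattern: matches phrases such as "section 3 of chapter 23A".
# Other citation styles (e.g. "chapter 23A, section 3") are NOT handled here.
MGL_REF = re.compile(
    r"\bsection\s+(?P<section>\d+[A-Z]*)\s+of\s+chapter\s+(?P<chapter>\d+[A-Z]*)\b",
    re.IGNORECASE,
)


def find_mgl_references(bill_text: str) -> list[tuple[str, str]]:
    """Return unique (chapter, section) pairs in order of first appearance."""
    refs: list[tuple[str, str]] = []
    for match in MGL_REF.finditer(bill_text):
        # Normalize case so "23a" and "23A" count as the same chapter.
        ref = (match.group("chapter").upper(), match.group("section").upper())
        if ref not in refs:
            refs.append(ref)
    return refs


if __name__ == "__main__":
    sample = (
        "Section 3 of chapter 23A of the General Laws is hereby amended. "
        "Section 16G of chapter 6A is hereby repealed."
    )
    print(find_mgl_references(sample))
```

Each returned pair can then be used as a lookup key for the section text, whatever source that text ends up coming from.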

They are currently doing so by scraping the HTML of the MGL website, but we want a less fragile solution going forward.

Proposal

At the beginning of a session, let's scrape the MGL, store text on our side, and expose it to our LLM wrapper service via a private API. We'll do this instead of relying on the MA Legislature API at runtime because <???> (missing context: are we looking for versioning on this, or just speed?).

Success Criteria

Open Questions

Quick diagram of my understanding from our hack night discussion on 4/23:

[Image: Screenshot 2024-04-23 at 9.09.31 PM.png]