datasette / datasette-extract

Import unstructured data (text and images) into structured tables
Apache License 2.0
128 stars, 3 forks

Generalize to other models #27

Open cbeauhilton opened 2 months ago

cbeauhilton commented 2 months ago

Any thoughts on expanding this to accept other LLMs (perhaps local ones)? Local models typically wouldn't be able to handle images as well as unstructured text, but this might be a nice approach for, e.g., web scraping at scale and structuring the results locally, without having to reach out to OpenAI (or others) and pay for the service.

simonw commented 2 months ago

I'm absolutely interested in doing this - starting with the Claude family, but I'd like to set it up so anything that works with LLM can be used by this tool as well.

Local models are going to be a bit trickier because not many of them support OpenAI-style function calling. There are a few that do now, I think, and the models that don't directly support it can usually get there with a bit of extra prompting.
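
As a rough illustration of what "a bit of extra prompting" could look like for a model without function calling: describe the desired columns in the prompt, ask for JSON only, then pull the JSON out of whatever the model replies. This is a minimal sketch, not how datasette-extract works. The helper names `build_extraction_prompt` and `parse_rows` are hypothetical, and the model reply is a hard-coded stand-in rather than real model output.

```python
import json
import re


def build_extraction_prompt(text: str, columns: dict[str, str]) -> str:
    """Build a prompt asking for a JSON array with exactly the given keys.

    Hypothetical helper: `columns` maps a column name to a short hint.
    """
    column_lines = "\n".join(
        f'- "{name}": {hint}' for name, hint in columns.items()
    )
    return (
        "Extract every record from the text below.\n"
        "Reply with ONLY a JSON array of objects, no commentary.\n"
        "Each object must have exactly these keys:\n"
        f"{column_lines}\n\n"
        f"Text:\n{text}"
    )


def parse_rows(reply: str, columns: dict[str, str]) -> list[dict]:
    """Pull a JSON array out of a reply and keep only the expected keys.

    Models often wrap JSON in chatty text, so this grabs everything from
    the first "[" to the last "]" and parses that.
    """
    match = re.search(r"\[.*\]", reply, re.DOTALL)
    if match is None:
        raise ValueError("no JSON array found in model reply")
    rows = json.loads(match.group(0))
    return [
        {name: row.get(name) for name in columns}
        for row in rows
        if isinstance(row, dict)
    ]


# Hypothetical column definitions and a stand-in for a model's reply.
columns = {"name": "person's full name", "age": "age in years, as an integer"}
prompt = build_extraction_prompt("Ada Lovelace was 36.", columns)
fake_reply = 'Here is the data: [{"name": "Ada Lovelace", "age": 36, "extra": true}] Hope that helps!'
rows = parse_rows(fake_reply, columns)
```

The `prompt` string would be sent to whichever model is in use. Unexpected keys such as `extra` are dropped and missing keys come back as `None`, so the rows always match the target table's columns.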