ucbepic / docetl

A system for agentic LLM-powered data processing and ETL
https://docetl.org
MIT License
1.26k stars 114 forks

Learned filter; learned compare (for resolve and join) #187

Open shreyashankar opened 2 days ago

shreyashankar commented 2 days ago

In the optimizer, we should be able to fit a logistic regression model to the embeddings to try to learn the binary classification function. If we learn this with good accuracy, we could replace the LLM call with the logistic regression model. Alternatively, we could implement a model cascade: if the logistic regression model is well calibrated, we can use its prediction when the probability is confident and fall back to the LLM call when the probability is uncertain.
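
A rough sketch of the cascade idea, to make the proposal concrete. This is not DocETL code: `fit_logistic_regression`, `cascade_filter`, the `low`/`high` thresholds, the toy 2-D "embeddings", and the stand-in `fake_llm_filter` are all hypothetical names and values picked for illustration. It uses a tiny pure-Python logistic regression so it runs without dependencies; in practice something like scikit-learn's `LogisticRegression` would probably be the natural choice.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic_regression(
    X: Sequence[Vector], y: Sequence[int], lr: float = 0.5, epochs: int = 500
) -> Tuple[List[float], float]:
    """Fit weights and bias with plain batch gradient descent on log loss."""
    n, dim = len(X), len(X[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(dim):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w: Sequence[float], b: float, x: Vector) -> float:
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def cascade_filter(
    x: Vector,
    w: Sequence[float],
    b: float,
    llm_filter: Callable[[Vector], bool],
    low: float = 0.2,   # hypothetical threshold: below this, trust "False"
    high: float = 0.8,  # hypothetical threshold: above this, trust "True"
) -> Tuple[bool, str]:
    """Use the cheap model when it is confident, else fall back to the LLM."""
    p = predict_proba(w, b, x)
    if p >= high:
        return True, "model"
    if p <= low:
        return False, "model"
    return llm_filter(x), "llm"


# Toy labeled "embeddings" (assumed to come from earlier LLM filter decisions).
X_train = [(1, 1), (1.5, 0.5), (0.5, 1.5), (2, 2),
           (-1, -1), (-1.5, -0.5), (-0.5, -1.5), (-2, -2)]
y_train = [1, 1, 1, 1, 0, 0, 0, 0]
weights, bias = fit_logistic_regression(X_train, y_train)

llm_calls: List[Vector] = []


def fake_llm_filter(x: Vector) -> bool:
    # Stand-in for the real (expensive) LLM call; records each invocation.
    llm_calls.append(x)
    return x[0] + x[1] > 0


for point in [(3.0, 3.0), (-3.0, -3.0), (0.0, 0.0)]:
    print(point, cascade_filter(point, weights, bias, fake_llm_filter))
```

The thresholds only make sense if the predicted probabilities are reasonably calibrated, so they would likely need to be tuned (or the model recalibrated) on held-out LLM-labeled examples before trusting the cheap path.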