ucbepic / docetl

A system for agentic LLM-powered data processing and ETL
https://docetl.org
MIT License
1.26k stars, 114 forks

Fix: Default max comparison pairs in resolve.py #147

Open staru09 opened 1 week ago

staru09 commented 1 week ago

This PR closes #130. I have added custom per-model rate limits that the user can edit. The OpenAI limits still need to be updated, though.

Sample usage is as follows:

pipeline = Pipeline(
    name="resolution_pipeline",
    datasets={...},
    operations=[...],
    steps=[...],
    output=...,
    rate_limits={
        "claude-3-sonnet": 300,  # Custom lower limit
        "gpt-4o": 800,
        "my-custom-model": 100
    }
)

# Check different models
models = ["claude-3-sonnet", "gpt-4o", "my-custom-model", "unknown-model"]
for model in models:
    limit_info = pipeline.get_rate_limits(model)
    print(f"{model}: {limit_info}")

# Output:
# claude-3-sonnet: {'requests_per_minute': 300, 'source': 'custom'}
# gpt-4o: {'requests_per_minute': 800, 'source': 'custom'}
# my-custom-model: {'requests_per_minute': 100, 'source': 'custom'}
# unknown-model: {'requests_per_minute': 200, 'source': 'fallback'}

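
For readers who want to see the lookup order implied by the sample output, here is a minimal standalone sketch. It is not the code in this PR: the function name `resolve_rate_limit`, the `DEFAULT_RATE_LIMITS` table and its placeholder entry, and the `'default'` source label are all assumptions made for illustration. Only the `'custom'` and `'fallback'` labels and the fallback value of 200 come from the output above.

```python
from typing import Dict, Optional

# Hypothetical built-in defaults. The entry below is a placeholder,
# not a real provider limit; the PR does not list the actual values.
DEFAULT_RATE_LIMITS: Dict[str, int] = {"example-default-model": 500}

# Fallback value taken from the sample output (200 requests per minute).
FALLBACK_RPM = 200


def resolve_rate_limit(
    model: str, custom: Optional[Dict[str, int]] = None
) -> Dict[str, object]:
    """Return the requests-per-minute limit for `model` and its source."""
    custom = custom or {}
    # 1. A user-supplied override wins.
    if model in custom:
        return {"requests_per_minute": custom[model], "source": "custom"}
    # 2. Otherwise use a built-in default, if one exists (assumed tier).
    if model in DEFAULT_RATE_LIMITS:
        return {
            "requests_per_minute": DEFAULT_RATE_LIMITS[model],
            "source": "default",
        }
    # 3. Unknown models get the global fallback.
    return {"requests_per_minute": FALLBACK_RPM, "source": "fallback"}


custom_limits = {"claude-3-sonnet": 300, "gpt-4o": 800, "my-custom-model": 100}
for model in ["claude-3-sonnet", "gpt-4o", "my-custom-model", "unknown-model"]:
    print(f"{model}: {resolve_rate_limit(model, custom_limits)}")
```

Returning the source alongside the number makes it easy to tell whether a model is throttled by a user override or by the fallback, which is what the `source` field in the sample output appears to be for.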
shreyashankar commented 1 week ago

Discussing offline in Discord