fecgov / openFEC

The first RESTful API for the Federal Election Commission. We're aiming to make campaign finance more accessible for journalists, academics, developers, and other transparency seekers.
https://api.open.fec.gov/developers

Feature request: Remove or increase download cap, restrict pagination on large datasets #5884

Open lbeaufort opened 4 months ago

lbeaufort commented 4 months ago

Issue

When paginating through millions of records, a single request for one page of 100 records can take several minutes. As a result, users cannot get the data they need promptly, and the same expensive queries are run repeatedly.
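To put that cost in rough numbers, here is a minimal back-of-the-envelope sketch. The page size of 100 comes from this issue. The 120 seconds per request is only an assumed stand-in for "several minutes", and the function names are illustrative rather than anything in openFEC.

```python
def pages_needed(total_records: int, per_page: int = 100) -> int:
    """Number of page requests needed to read every record (ceiling division)."""
    return -(-total_records // per_page)


def total_hours(total_records: int, seconds_per_request: float, per_page: int = 100) -> float:
    """Rough wall-clock hours to page through the whole result set sequentially."""
    return pages_needed(total_records, per_page) * seconds_per_request / 3600


if __name__ == "__main__":
    records = 1_000_000          # example dataset size from the proposal
    assumed_seconds = 120        # ASSUMPTION: "several minutes" taken as 2 minutes
    print(pages_needed(records), "requests")
    print(round(total_hours(records, assumed_seconds), 1), "hours")
```

Under those assumptions, one million records means 10,000 sequential requests, which is why a single queued download looks attractive for datasets of this size.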

Proposed solution

We propose either removing or increasing the download cap and restricting pagination for datasets larger than a threshold of 500k or 1 million records. Users could then queue up a single download for a large dataset instead of paginating through all of the data.
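The routing rule in this proposal could be sketched roughly as follows. This is not openFEC code: `Access`, `access_mode`, and `PAGINATION_LIMIT` are hypothetical names, and the 500k default is just one of the two thresholds floated above.

```python
from enum import Enum


class Access(Enum):
    PAGINATE = "paginate"            # normal page-by-page API access
    DOWNLOAD_ONLY = "download_only"  # pagination restricted; user queues a download


# ASSUMPTION: the issue suggests 500k or 1 million; 500k is used here for illustration.
PAGINATION_LIMIT = 500_000


def access_mode(estimated_count: int, pagination_limit: int = PAGINATION_LIMIT) -> Access:
    """Decide how a result set may be retrieved based on its estimated size."""
    if estimated_count > pagination_limit:
        return Access.DOWNLOAD_ONLY
    return Access.PAGINATE


if __name__ == "__main__":
    for count in (10_000, 500_000, 2_000_000):
        print(count, access_mode(count).value)
```

A result set exactly at the limit still paginates here, matching the "larger than" wording. Whether the count should be exact or estimated is left open, since counting can itself be an expensive query.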

Action item(s)

Completion criteria


References/resources/technical considerations


cnlucas commented 3 months ago

Background: https://github.com/fecgov/openFEC/issues/1378 (the download cap was set to 100k and was later raised in https://github.com/fecgov/openFEC/pull/2584)

patphongs commented 3 months ago

"increasing the download cap and restricting pagination for datasets larger than 500k or 1 million records"

Notes from 7/23/2024 discussion

What are meaningful indicators of expensive queries?

Questions?