meltano / sdk

Write 70% less code by using the SDK to build custom extractors and loaders that adhere to the Singer standard: https://sdk.meltano.com
Apache License 2.0
97 stars 69 forks

Cache requests with `requests-cache` to avoid hitting rate limits in dev and CI #236

Open MeltyBot opened 3 years ago

MeltyBot commented 3 years ago

Migrated from GitLab: https://gitlab.com/meltano/sdk/-/issues/237

Originally created by @edgarrmondragon on 2021-10-12 19:27:52


Summary

For HTTP (RESTful or otherwise) taps, it would be good to allow users to cache responses. A nice option to implement this behavior could be https://github.com/reclosedev/requests-cache.

Proposed benefits

There are at least three use cases in which users and developers of Singer taps would want to keep a cache of requests made to the source API.

Tap Development

During initial development of an HTTP tap, it would be good to have responses cached when testing and iterating on state, pagination, and child streams. Developers would not have to wait for a live response on every run, and fewer calls would count against API request limits.
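As a rough illustration of the development workflow, here is a minimal sketch of an opt-in cached session. The helper name `make_dev_session`, the cache name `tap_dev_cache`, and the one-hour expiry are assumptions for illustration and not part of the SDK; `CachedSession` and its `backend` and `expire_after` arguments come from `requests-cache`.

```python
from datetime import timedelta


def make_dev_session(use_cache: bool = False, cache_name: str = "tap_dev_cache"):
    """Return an HTTP session, cached only when explicitly requested.

    Hypothetical helper: the SDK does not expose this function.
    """
    if use_cache:
        # Imported lazily so requests-cache stays an optional dependency.
        from requests_cache import CachedSession

        return CachedSession(
            cache_name,                       # SQLite file name (assumed value)
            backend="sqlite",
            expire_after=timedelta(hours=1),  # assumed expiry for dev iterations
        )

    import requests

    return requests.Session()
```

With caching on, repeated runs that issue the same GET requests would be served from the local SQLite file, and `requests-cache` marks such responses with `response.from_cache`.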

CI

Arguably, CI pipelines should be lean and should not overwhelm external services, given that rate limits cause noisy CI failures. A tap repo could include a filesystem requests cache (https://requests-cache.readthedocs.io/en/stable/modules/requests_cache.backends.filesystem.html), and PRs that update or add request parameters should include the corresponding cache file update.
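A sketch of what the CI setup could look like with the filesystem backend. The directory `tests/fixtures/http_cache`, the helper name `make_ci_session`, and the choice to ignore the `Authorization` header are assumptions; the `backend`, `serializer`, `expire_after`, and `ignored_parameters` arguments are `requests-cache` options.

```python
from pathlib import Path

# Hypothetical location for cached responses committed alongside the tests.
CACHE_DIR = Path("tests") / "fixtures" / "http_cache"


def make_ci_session(cache_dir: Path = CACHE_DIR):
    """Build a session that stores one file per response under cache_dir."""
    # Imported lazily so requests-cache is only needed where the cache is used.
    from requests_cache import CachedSession

    return CachedSession(
        str(cache_dir),
        backend="filesystem",
        serializer="json",   # text files, easier to review in a PR diff
        expire_after=-1,     # never expire: files change only when a PR updates them
        ignored_parameters=["Authorization"],  # keep credentials out of cache keys and files
    )
```

One response per file means that a PR changing request parameters shows up as added or modified files in the cache directory, which reviewers can inspect.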

Integration with Meltano

Similar to tap development, users might be interested in not overwhelming external services or consuming request quotas during initial integration with Meltano and a target.

Proposal details

Best reasons not to build

I can imagine a scenario where users are unknowingly caching requests in a production setting. If we can't provide safeguards against that, we shouldn't build this feature.
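One possible shape for such a safeguard, sketched with the standard library only. The setting name `requests_cache_enabled`, the helper `cache_enabled`, and the set of blocked environment names are hypothetical; `MELTANO_ENVIRONMENT` is the variable Meltano uses to select the active environment, and treating `prod`/`production` as off-limits is an assumption about naming conventions.

```python
import logging
import os
from typing import Mapping, Optional

logger = logging.getLogger(__name__)

# Assumed names for production-like environments; real projects may differ.
BLOCKED_ENVIRONMENTS = {"prod", "production"}


def cache_enabled(
    config: Mapping[str, object],
    environ: Optional[Mapping[str, str]] = None,
) -> bool:
    """Decide whether response caching may be turned on.

    Caching is off by default, must be requested explicitly in the tap
    config, and is refused in environments that look like production.
    """
    environ = os.environ if environ is None else environ

    if not config.get("requests_cache_enabled", False):
        return False

    env_name = environ.get("MELTANO_ENVIRONMENT", "").lower()
    if env_name in BLOCKED_ENVIRONMENTS:
        logger.warning("Ignoring requests cache setting in environment %r.", env_name)
        return False

    # Make the cached mode loud so nobody enables it unknowingly.
    logger.warning("HTTP response caching is ON; extracted data may be stale.")
    return True
```

This does not cover every way a cache could leak into production (for example, an unset or differently named environment), so it is only a starting point for the discussion.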

MeltyBot commented 2 years ago

View 7 previous comments from the original issue on GitLab

stale[bot] commented 1 year ago

This has been marked as stale because it is unassigned, and has not had recent activity. It will be closed after 21 days if no further activity occurs. If this should never go stale, please add the evergreen label, or request that it be added.