DataDog / datadogpy

The Datadog Python library
https://datadoghq.com/
Other
612 stars 304 forks source link

datadog.api.Downtime.get_all() throws ValueError on 429 Response #733

Open aaronsewall opened 2 years ago

aaronsewall commented 2 years ago

Describe the bug When calling datadog.api.Downtime.get_all() repeatedly, after some time library will throw 429 errors but represent them as ValueErrors when it can't parse the html that is returned instead of json.

ValueError is caught and re-thrown around datadog/api/api_client.py:200

e.g. ValueError: Invalid JSON response: b'\n\n\n<!DOCTYPE html>\n<!-- ...

To Reproduce Steps to reproduce the behavior:

  1. Call datadog.api.Downtime.get_all() repeatedly.

Expected behavior Something changed server side since this did not used to happen. This should use the backoff mechanics that are already present in the library.

Environment and Versions: A clear and precise description of your setup:

thundergolfer commented 2 years ago

As extra context, the HTML returned is for a 429

image
github-actions[bot] commented 2 years ago

Thanks for your contribution!

This issue has been automatically marked as stale because it has not had activity in the last 30 days. Note that the issue will not be automatically closed, but this notification will remind us to investigate why there's been inactivity. Thank you for participating in the Datadog open source community.

If you would like this issue to remain open:

  1. Verify that you can still reproduce the issue in the latest version of this project.

  2. Comment that the issue is still reproducible and include updated details requested in the issue template.

thundergolfer commented 2 years ago

Still an issue.

Jaakkonen commented 1 year ago

For v2 endpoint https://api.datadoghq.com/api/v2/logs/events/search the ratelimit seems to be 300 reqs/hour. More information about the ratelimiting status is sent in the headers documented at https://docs.datadoghq.com/api/latest/rate-limits/.

So a back-off is definitely possible to implement for the get_all function but it'd possibly take hours to complete. Before resorting to that one should ensure that the amount of calls is minimized by for example having the limit be the maximum 5000 events.

shaneikennedy commented 4 months ago

Still an issue, seeing it with api.Metric.query(...)