Open brysn opened 5 months ago
Hi @brysn, sorry to hear about your issues. This issue is hard to reproduce since it seems to happen sporadically. I tried to reproduce it locally, but everything works fine on my end. Can you please confirm whether there is a proxy in the middle, or any other network limitation, for example one imposed by a firewall configuration?
Thanks!
This issue has not received a response in 1 week. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.
Hi @yenfryherrerafeliz, thanks for looking into it. The application is on Heroku; there is no firewall, no proxy in the middle, and no other network limitation.
It seems like it has to do with the endpoint discovery caching. Could the endpoint be changing on AWS before the cache expires for the endpoint?
Hi @brysn,
AFAIK the SDK does not cache any endpoints. The SDK uses Guzzle, and PHP's standard curl client to make network calls. DNS resolution happens at the OS level and is not something that the SDK has visibility / control over.
If you want to root cause this, you might want to use a network diagnostics tool like Wireshark to inspect what kind of networking events are happening on your machine. As a temporary measure, you might want to lower or completely disable the TTL on your DNS cache to see if it solves your issue.
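A lighter-weight check than a packet capture is to resolve the hostname repeatedly from the affected host and count how often resolution fails. The following is a minimal sketch, not part of the SDK: the function name `countDnsFailures` and its parameters are hypothetical, and it relies only on PHP's built-in `gethostbyname()`, which returns its input unchanged when the lookup fails. Pass a bare hostname (no `https://` scheme and not an IP literal), otherwise every attempt is counted as a failure.

```php
<?php
declare(strict_types=1);

/**
 * Resolve $host $attempts times and count how many lookups fail.
 *
 * Hypothetical helper for diagnostics only. $resolver can be swapped out
 * (for example in tests); it must return true when resolution succeeded.
 */
function countDnsFailures(
    string $host,
    int $attempts,
    ?callable $resolver = null,
    int $pauseMicros = 0
): array {
    // Default resolver: gethostbyname() returns the unmodified hostname
    // on failure, so an unchanged return value is treated as a failure.
    $resolver ??= static function (string $h): bool {
        return gethostbyname($h) !== $h;
    };

    $failures = 0;
    for ($i = 0; $i < $attempts; $i++) {
        if (!$resolver($host)) {
            $failures++;
        }
        if ($pauseMicros > 0) {
            // Optional pause between lookups to spread them over time.
            usleep($pauseMicros);
        }
    }

    return ['attempts' => $attempts, 'failures' => $failures];
}
```

For example, calling `countDnsFailures('ingest.timestream.us-west-2.amazonaws.com', 100, null, 100000)` on the affected host during a busy period and again during a quiet one would show whether lookups fail outside the SDK as well. A non-zero failure count would point at the resolver or the network, while zero failures alongside continued SDK errors would point back at the hostname the SDK is trying to reach.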
This doesn't seem directly related to the SDK, but we will keep the issue open to see if this helps and maybe provide some other possible guidance.
Thanks, Ran~
@RanVaknin
Thanks for looking into this. The issue is with endpoint discovery and the SDK does indeed cache endpoints: Example
It's unlikely that AWS itself has an endpoint discovery bug. It's more likely that it's an issue with the SDK's endpoint discovery caching.
@RanVaknin since this is a bug with the SDK, can this be looked into further?
@RanVaknin @yenfryherrerafeliz Is there more information I need to provide? Why was this marked as "guidance" when it's a bug in the SDK's endpoint caching?
Describe the bug
I have a webhook endpoint that consumes messages from a 3rd party and writes records into AWS Timestream. It does so using the TimestreamWrite/TimestreamWriteClient::writeRecords method. There are long periods of time in which it works, but intermittently it will start throwing exceptions. When this starts to happen, it seems that ~1/3 of requests fail with this exception, while ~2/3 of the requests continue to work.
Here are the exception details:
Expected Behavior
According to https://curl.se/libcurl/c/libcurl-errors.html the curl error means:
Could not resolve host. The given remote host was not resolved.
I expected the records to be written to Timestream.
Current Behavior
It appears that, intermittently, the endpoint discovery call doesn't work and the hostname https://ingest.timestream.us-west-2.amazonaws.com cannot be resolved.
Reproduction Steps
Possible Solution
I'm guessing there must be some issue with the curl setup here, as I doubt that the endpoint on AWS is having that many issues.
Additional Information/Context
This happens hundreds, if not thousands, of times per day. It seems to occur when more requests are being sent at the same time. For example, overnight there are far fewer requests and everything seems to work fine. But during the day, when it is flooded with requests, about 1/3 of them fail with this exception.
SDK version used
3.311.0
Environment details (Version of PHP (php -v)? OS name and version, etc.)
PHP 8.3.7, Ubuntu 22.04