serverless / dashboard

[Python Lambda SDK] Do not sample out infrequent usage and sample at 20% #801

Closed selcukcihan closed 1 year ago

selcukcihan commented 1 year ago

Description

Testing done

Unit/integration tested

selcukcihan commented 1 year ago

@medikoo ready for review 🙏

medikoo commented 1 year ago

@selcukcihan it appears that the integration tests failed after this update: https://github.com/serverless/console/actions/runs/5264703180

selcukcihan commented 1 year ago

Thanks for letting me know. It's a timeout issue: the tests have a 120-second total timeout, and that was exceeded. I've looked through the action logs but couldn't find an explanation. I'm now trying to reproduce it on my AWS account to see what is causing the timeout, and I'll post updates. I suspect it might be a glitch while setting up the infrastructure, such as waiting for the API to be published (it's the HTTP API test).

medikoo commented 1 year ago

@selcukcihan thanks for looking into that. Indeed, it could have been a glitch on the AWS side.

selcukcihan commented 1 year ago

I couldn't reproduce it on my account. However, looking more closely at the logs from the failed action, I can see that the test started at

2023-06-14T08:17:29.2327821Z Python: integration

and the function within the test that took too long was first invoked about 37 seconds later

2023-06-14T08:18:06.277Z i test Invoke function #1 api_endpoint-http-api-v1

There are no other invocation logs (i.e. there is no "test Invoke function #2 api_endpoint-http-api-v1" line).

That implies one of two cases, with the first being the stronger candidate:
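As a quick check of the timing, the gap between the two timestamps quoted above can be computed directly. This is a small sketch rather than part of the test suite; the helper name `parse_log_timestamp` is hypothetical, and the 120-second budget is the total test timeout mentioned earlier in the thread.

```python
from datetime import datetime, timezone


def parse_log_timestamp(stamp: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp, keeping at most microsecond precision."""
    body = stamp.rstrip("Z")
    whole, _, fraction = body.partition(".")
    # strptime's %f accepts up to 6 digits, so pad or truncate the fraction.
    micros = (fraction + "000000")[:6]
    parsed = datetime.strptime(f"{whole}.{micros}", "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


# Timestamps copied from the failed action's log lines quoted above.
test_start = parse_log_timestamp("2023-06-14T08:17:29.2327821Z")
first_invoke = parse_log_timestamp("2023-06-14T08:18:06.277Z")

elapsed = (first_invoke - test_start).total_seconds()
remaining = 120 - elapsed  # 120 s total test timeout, per the thread

print(f"elapsed before first invocation: {elapsed:.1f}s")
print(f"left of the 120s budget: {remaining:.1f}s")
```

By this calculation the first invocation was logged roughly 37 seconds into the run, which leaves about 83 seconds of the budget during which no second invocation was logged.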

  1. API Gateway returned 404 and the test kept retrying (perhaps the default stage deployment had not finished at that point): https://github.com/serverless/console/blob/516f32d610d029e270d0563ac74196adc299ab2e/node/test/python/aws-lambda-sdk/integration.test.js#L803
  2. The request to API Gateway took too long. API Gateway has a 30-second timeout, so a longer wait would point to a client-side issue, such as a network problem while reading back the response.

I don't think there is much we can do here, except add a log line in the 404 retry case, so that when this happens again we know for certain whether the first case described above is what's causing us trouble. What do you think @medikoo?

medikoo commented 1 year ago

Adding debug lines that give us more certainty about what happened is always a good idea.
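A minimal sketch of what a log line in the 404 retry case could look like. This is not the repository's code: the actual retry lives in the Node-based integration.test.js linked above, and every name here (`invoke_with_404_retry`, `send_request`, the attempt count and the delay) is a hypothetical stand-in chosen for illustration.

```python
import logging
import time
from typing import Callable, Tuple

log = logging.getLogger("integration-test")  # hypothetical logger name


def invoke_with_404_retry(
    send_request: Callable[[], int],
    max_attempts: int = 5,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, int]:
    """Call send_request until it returns a status other than 404.

    Returns (status, attempts). Raises TimeoutError if every attempt got 404.
    """
    for attempt in range(1, max_attempts + 1):
        status = send_request()
        if status != 404:
            return status, attempt
        # The debug line: makes a "stage not deployed yet" retry loop visible in CI logs.
        log.warning(
            "API Gateway returned 404 (attempt %d/%d), retrying in %.1fs",
            attempt, max_attempts, delay_seconds,
        )
        if attempt < max_attempts:
            sleep(delay_seconds)
    raise TimeoutError(f"still receiving 404 after {max_attempts} attempts")


# Usage with a fake endpoint that answers 404 twice and then 200.
fake_responses = iter([404, 404, 200])
status, attempts = invoke_with_404_retry(
    lambda: next(fake_responses),
    sleep=lambda _seconds: None,  # skip real waiting in this demo
)
print(status, attempts)
```

With a line like this in place, a run that times out after repeated 404 responses would show one warning per retry, which distinguishes the first case from a slow or stalled request.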