mhausenblas / kboom

The Kubernetes scale & soak load tester
Apache License 2.0

Decode json errors #5

Closed lachie83 closed 5 years ago

lachie83 commented 5 years ago

I'm occasionally getting spurious JSON decode errors when trying to run larger numbers of pods.

kubectl kboom generate --mode=scale:60 --load=pods:1000
<snip>
2019/04/27 04:10:35 Can't create pod scale-sleeper-936: decode error status 429: decode json: invalid character 'T' looking for beginning of value
2019/04/27 04:10:35 Can't create pod scale-sleeper-1176: decode error status 429: decode json: invalid character 'T' looking for beginning of value
2019/04/27 04:10:35 Can't create pod scale-sleeper-767: decode error status 429: decode json: invalid character 'T' looking for beginning of value
2019/04/27 04:10:35 Can't create pod scale-sleeper-275: decode error status 429: decode json: invalid character 'T' looking for beginning of value
2019/04/27 04:10:35 Can't create pod scale-sleeper-896: decode error status 429: decode json: invalid character 'T' looking for beginning of value
2019/04/27 04:10:35 Can't create pod scale-sleeper-160: decode error status 429: decode json: invalid character 'T' looking for beginning of value
2019/04/27 04:10:35 Can't create pod scale-sleeper-376: decode error status 429: decode json: invalid character 'T' looking for beginning of value
2019/04/27 04:10:35 Can't create pod scale-sleeper-130: decode error status 429: decode json: invalid character 'T' looking for beginning of value
2019/04/27 04:10:35 Can't create pod scale-sleeper-921: decode error status 429: decode json: invalid character 'T' looking for beginning of value
2019/04/27 04:10:35 Can't create pod scale-sleeper-403: decode error status 429: decode json: invalid character 'T' looking for beginning of value
<snip>
mhausenblas commented 5 years ago

Thanks a lot for reporting this @lachie83, much appreciated!

A quick look at the responsible code segment suggests that I'm overloading the API server with requests, hence the HTTP 429 response code. Because of that, the API server sends back an error message in plain text rather than JSON, and the library tries to decode it anyway because it assumes the body is JSON.

This is not super surprising since I don't have any rate limiting in the pod creation part of the code. My hunch (or working hypothesis, if you prefer): adding some short, random delays in the ms range after go pr.launch() should address this issue.

lachie83 commented 5 years ago

This sounds like a reasonable approach. Just out of interest, would it be better to make the goroutine handle a 429 from the API server?

mhausenblas commented 5 years ago

Yeah, handling 429 from API server, in addition, wouldn't hurt. Nevertheless, not hammering the API server like a berserker would prolly not be a bad thing to do, in the first place ;)

mhausenblas commented 5 years ago

Or, even better, three scale testing strategies: berserker (the current implementation), good-citizen (rate limiting), and graceful (handling 429s). WDYT?
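One way to picture the three strategies is as a small configuration type, sketched below. The strategy names come from the comment above; the `Strategy` and `Plan` types, the `planFor` function, and the concrete values (5 ms pacing, 3 retries) are hypothetical and chosen only for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// Strategy is a hypothetical type naming the three proposed modes.
type Strategy string

const (
	Berserker   Strategy = "berserker"    // no pacing, no retries
	GoodCitizen Strategy = "good-citizen" // pace requests on the client side
	Graceful    Strategy = "graceful"     // retry when the server answers 429
)

// Plan describes what a strategy switches on.
type Plan struct {
	Pace    time.Duration // delay between launches; 0 means none
	Retries int           // extra attempts after a 429; 0 means none
}

// planFor maps a strategy to its plan, with illustrative values.
func planFor(s Strategy) (Plan, error) {
	switch s {
	case Berserker:
		return Plan{}, nil
	case GoodCitizen:
		return Plan{Pace: 5 * time.Millisecond}, nil
	case Graceful:
		return Plan{Retries: 3}, nil
	default:
		return Plan{}, fmt.Errorf("unknown strategy %q", s)
	}
}

func main() {
	for _, s := range []Strategy{Berserker, GoodCitizen, Graceful} {
		p, err := planFor(s)
		fmt.Println(s, p.Pace, p.Retries, err)
	}
}
```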

lachie83 commented 5 years ago

I agree but it's probably quickest to implement the good-citizen based on your original proposal and ship it.

johscheuer commented 5 years ago

@lachie83 are you working on this? We also see this issue in our test setup (and would like to fix it).

mhausenblas commented 5 years ago

Thanks for the reminder, I'll try to put a fix together for this over the weekend.

mhausenblas commented 5 years ago

@johscheuer given your overall contributions, I mean, if you wanna fix that as well … :)

johscheuer commented 5 years ago

I can take a look next week :)

mhausenblas commented 5 years ago

OK, super @johscheuer … let's see who's faster, LOL ;)