jonny-rimek / wowmate

combatlog analysis for world of warcraft players
http://wowmate.io
MIT License

Another round of loadtesting, upload 15k m+ logs with 1 dungeon, last infra cost was ~230€ #198

Closed jonny-rimek closed 3 years ago

jonny-rimek commented 3 years ago

Completed another round of load testing.

The only problem was query throttling in Timestream; everything else worked perfectly. The cost is not yet updated.

Screenshot_2021-03-21 CloudWatch Dashboard Sharing Screenshot_2021-03-21 CloudWatch Dashboard Sharing(1) Screenshot_2021-03-21 AWS X-Ray(2) Screenshot_2021-03-21 AWS X-Ray(1)

jonny-rimek commented 3 years ago

Cost was 90€, down from 230€, mainly because of the decreased Lambda size. Somehow I had 32GB more logs, must've left debug on somewhere ^^

jonny-rimek commented 3 years ago

Screenshot_2021-03-22 CloudWatch Management Console

It was only 1 function that had debug on, and according to the log group it was nowhere near the ~40GB. Should keep an eye on that.

jonny-rimek commented 3 years ago

Further reduced max concurrent invocations to 10 for the convert lambdas and 2x 5 for the get summary lambdas; no more complete failures. All throttles are retried and resolved eventually.

Screenshot_2021-03-22 CloudWatch Dashboard Sharing
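To illustrate what "throttles are retried and resolved eventually" can look like on the caller side, here is a minimal sketch of a retry loop with exponential backoff and jitter. The thread doesn't say where the retries actually happen, so this is a generic illustration rather than wowmate's code; `ThrottledError` and `call_with_backoff` are hypothetical names.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttling error from a downstream service."""


def call_with_backoff(fn, max_attempts=5, base_delay=0.1, max_delay=5.0,
                      sleep=time.sleep):
    """Call fn(), retrying on ThrottledError with exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except ThrottledError:
            if attempt == max_attempts - 1:
                # Out of retries: re-raise so the caller (or a DLQ) can handle it.
                raise
            # Full jitter: sleep a random time up to the capped exponential delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
```

With a low concurrency limit, a throttled call simply waits and tries again instead of failing the whole run; only calls that exhaust `max_attempts` surface as errors.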

jonny-rimek commented 3 years ago

Support increased the Timestream query limit, and I did another round of load tests at 200 concurrency for convert and 2x 100 for the query lambdas. The results are very promising: I got a few throttles, but overall no messages in the DLQ.

I only sent ~7.5k events instead of the 15k I usually do.

As a side effect of the increased limit, the query lambdas are also faster.

Screenshot_2021-04-02 CloudWatch Management Console Screenshot_2021-04-02 CloudWatch Management Console(1) Screenshot_2021-04-02 AWS X-Ray(1) Screenshot_2021-04-02 AWS X-Ray

jonny-rimek commented 3 years ago

The write cost of the old load test with 7.5k events was 22€, with 44.6GB written.

jonny-rimek commented 3 years ago

7.5k events, with common attributes

jonny-rimek commented 3 years ago

Atm it looks like the write cost of the load test was ~7€, which would be around a 70% saving compared to before the common attributes. For some reason I have 550GB of data scanned by queries, but each query is only 10MB, so that would be around 55k queries, which seems too much. Need to keep an eye on it.

The rest of the bill is reasonable; Lambda is +4€, but part of it was in the free tier iirc.
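A quick sanity check of the numbers above, as a small sketch. It assumes decimal units (1GB = 1000MB) and takes the rounded figures from the comments at face value.

```python
# Write cost before and after moving to common attributes (EUR, from the comments).
old_write_cost_eur = 22.0
new_write_cost_eur = 7.0

# Relative saving on write cost.
saving = 1 - new_write_cost_eur / old_write_cost_eur
print(f"write cost saving: {saving:.0%}")

# How many queries would it take to scan 550GB at 10MB per query?
scanned_gb = 550
mb_per_query = 10
implied_queries = scanned_gb * 1000 // mb_per_query  # assumes 1GB = 1000MB
print(f"implied number of queries: {implied_queries}")
```

So 22€ down to 7€ is a saving of roughly 68%, and 550GB at 10MB per query does imply about 55k queries.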

jonny-rimek commented 3 years ago

Screenshot_2021-04-12 Billing Management Console

jonny-rimek commented 3 years ago

Another round of load testing. I'm now saving all combat logs and uploading a 430MB file with 4 logs (1 deplete). The file is processed 1300 times. Important note: I doubled the convert lambda memory, and the writes are now concurrent, so it's way faster. At the same time the file is 10x bigger, which results in the same duration as before.
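As a rough illustration of what "the writes are now concurrent" means, here is a minimal sketch that splits records into batches and writes them in parallel with a thread pool. This is not the convert lambda's actual code; `write_batch`, the batch size and the worker count are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Yield consecutive slices of items with at most size elements each."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def write_all(records, write_batch, batch_size=100, max_workers=8):
    """Write records in batches, running up to max_workers writes at once.

    write_batch is a hypothetical callable that sends one batch to the database.
    """
    batches = list(chunked(records, batch_size))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() consumes the iterator so an exception from any batch is raised here.
        return list(pool.map(write_batch, batches))
```

Because the writes are network-bound, overlapping them hides most of the per-request latency, which is how a 10x bigger file can end up with about the same duration.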

This is the cost distribution before. Screenshot_2021-04-15 Billing Management Console-PRE-LOAD-TEST

metrics:

Screenshot_2021-04-15 CloudWatch Dashboard Sharing Screenshot_2021-04-15 CloudWatch Dashboard Sharing(1)

xray Screenshot_2021-04-15 AWS X-Ray

result: Screenshot_2021-04-16 Billing Management Console(1)

Unfortunately my first screenshot didn't capture the Timestream cost, the most important item :s Lambda cost was around 2€.

jonny-rimek commented 3 years ago

Moved spell ID and caster ID both to common attributes to reduce write cost in Timestream.

result:

Screenshot_2021-04-18 Billing Management Console Screenshot_2021-04-19 Billing Management Console
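For readers unfamiliar with common attributes: the Timestream `WriteRecords` API accepts a `CommonAttributes` object whose fields apply to every record in the request, so shared values are sent once instead of per record. Below is a minimal sketch of building such requests, grouped by caster and spell. The event fields (`caster_id`, `spell_id`, `damage`, `time_ms`), the measure name and the function name are hypothetical, and the limit of 100 records per request is my understanding of the service quota, not something stated in this thread.

```python
from collections import defaultdict

MAX_RECORDS_PER_REQUEST = 100  # assumed Timestream WriteRecords quota


def build_write_requests(events, database, table):
    """Group events by (caster_id, spell_id) and build WriteRecords kwargs."""
    groups = defaultdict(list)
    for event in events:
        groups[(event["caster_id"], event["spell_id"])].append(event)

    requests = []
    for (caster_id, spell_id), group in groups.items():
        # Everything shared by the whole group is sent once per request.
        common = {
            "Dimensions": [
                {"Name": "caster_id", "Value": str(caster_id)},
                {"Name": "spell_id", "Value": str(spell_id)},
            ],
            "MeasureName": "damage",  # hypothetical measure
            "MeasureValueType": "BIGINT",
            "TimeUnit": "MILLISECONDS",
        }
        for i in range(0, len(group), MAX_RECORDS_PER_REQUEST):
            chunk = group[i:i + MAX_RECORDS_PER_REQUEST]
            # Each record only carries what actually differs.
            records = [
                {"MeasureValue": str(e["damage"]), "Time": str(e["time_ms"])}
                for e in chunk
            ]
            requests.append({
                "DatabaseName": database,
                "TableName": table,
                "CommonAttributes": common,
                "Records": records,
            })
    return requests
```

Each returned dict could then be passed to boto3 as `boto3.client("timestream-write").write_records(**request)`. The trade-off is that records only share a request when they share the same caster and spell, so very small groups produce more, smaller requests.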

jonny-rimek commented 3 years ago

I'm still waiting on my limit increase for Timestream in the new accounts, and I'll do the last load test once I've refactored the upload process to a step function.

jonny-rimek commented 3 years ago

The limit was increased in the prod account. Another test will follow after the sfn refactor.