srcfl / srcful-gateway

Sourceful Energy Gateway: Connect your solar inverter or Utility meters to earn tokens.
https://sourceful.energy/
MIT License

Stress testing #88

Open h0bb3 opened 10 months ago

h0bb3 commented 10 months ago

We could do some automated stress/performance testing of the REST API and possibly also the BLE service.

This could be developed further to check whether harvests are missed or delayed while the API is under stress.

h0bb3 commented 10 months ago

Did some initial tests here, and the results are actually quite interesting.

100 sequential calls to an API endpoint take approximately 100 seconds, since we process one API call message per second. The measured result was 99.14073038101196 seconds. I think coming in slightly under 100 is down to luck, i.e. where in the one-second cycle the first message happens to be picked up.

This was tested with the hello endpoint on a local network - so really minimal overhead.
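The timing above can be reproduced with a small arithmetic model. This is only a sketch under an assumption: the server picks up one request per fixed tick, so the first call waits for the remainder of the current tick and every later call waits a full tick. The function name and the `phase` parameter are hypothetical and not part of the gateway code.

```python
def sequential_total_seconds(n_calls: int, tick: float = 1.0, phase: float = 0.0) -> float:
    """Estimate the wall time for n_calls sequential requests.

    tick:  assumed interval between processed API messages (seconds).
    phase: how far into the current tick the first request arrives,
           0 <= phase < tick. This is the "luck" part.
    """
    if n_calls < 1:
        raise ValueError("n_calls must be at least 1")
    if not 0 <= phase < tick:
        raise ValueError("phase must satisfy 0 <= phase < tick")
    first_wait = tick - phase            # the first call only waits out the rest of the tick
    return first_wait + (n_calls - 1) * tick  # every later call waits a full tick


if __name__ == "__main__":
    # Arriving about 0.86 s into a tick gives a total close to the measured 99.14 s.
    print(sequential_total_seconds(100, tick=1.0, phase=0.86))
```

Under this model the total always lands between 99 and 100 seconds for 100 calls, which matches the observed 99.14.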

However, the story is different for parallel requests. Then basically all requests are sent at the same time, and each thread blocks once its request has been sent, waiting for its one-second slot in the queue. This means many sockets are open at the same time and there is a big queue of requests. Threads start to time out and are also abruptly disconnected. I guess there is some logic for this in the underlying layers.
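The parallel case can be illustrated with a toy simulation. It is not the gateway code: the server here is a hypothetical single worker that answers one queued request per tick, and the tick is shortened so the example runs quickly. It shows why clients with a fixed timeout start failing once the queue is longer than `client_timeout / tick`.

```python
import queue
import threading
import time


def run_parallel(n_clients: int, tick: float, client_timeout: float) -> tuple:
    """Return (served, timed_out) for n_clients hitting a one-request-per-tick server."""
    requests = queue.Queue()   # each item is the Event a client is waiting on
    stop = threading.Event()
    results = []
    lock = threading.Lock()

    def server() -> None:
        # Answer at most one queued request per tick (assumed server behavior).
        while not stop.is_set():
            try:
                done = requests.get(timeout=tick)
            except queue.Empty:
                continue
            done.set()
            time.sleep(tick)

    def client() -> None:
        done = threading.Event()
        requests.put(done)
        ok = done.wait(timeout=client_timeout)  # False means the client gave up
        with lock:
            results.append(ok)

    srv = threading.Thread(target=server, daemon=True)
    srv.start()
    clients = [threading.Thread(target=client) for _ in range(n_clients)]
    for c in clients:
        c.start()
    for c in clients:
        c.join()
    stop.set()
    srv.join()
    served = sum(results)
    return served, n_clients - served


if __name__ == "__main__":
    served, timed_out = run_parallel(n_clients=20, tick=0.02, client_timeout=0.1)
    print(f"served={served} timed_out={timed_out}")
```

With 20 clients, a 0.02 s tick and a 0.1 s timeout, only the first handful of clients get an answer and the rest time out, while the server loop itself keeps running at its own pace.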

The server logs look pretty fine and the server itself ticks along nicely - this was basically the desired behavior. API requests should not disturb harvesting and definitely not future control signals.

The problem is then of course on the client side, e.g. the ble-service and others: they cannot expect timely responses, and mechanisms to remedy this should be put in place, i.e. allowing for ample timeouts and retrying after sleeping.
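A minimal sketch of the "ample timeout, then sleep and retry" idea for a client follows. The helper name, its parameters and the default values are assumptions for illustration. The exception types to retry on depend on the HTTP or BLE client actually used, so they are passed in rather than hard-coded.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    request: Callable[[], T],
    attempts: int = 5,
    pause_s: float = 2.0,
    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request() until it succeeds, sleeping pause_s between failed attempts.

    request should already use a generous timeout of its own, since the
    server only answers one API message per tick.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return request()
        except retry_on as err:
            last_error = err
            if attempt < attempts:
                sleep(pause_s)  # give the server queue time to drain
    raise TimeoutError(f"no response after {attempts} attempts") from last_error
```

Usage would be something like `call_with_retry(lambda: fetch_hello(timeout=30))`, where `fetch_hello` is a placeholder for whatever call the client makes. Injecting `sleep` keeps the helper easy to unit test without real delays.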

It would also be interesting to test more intricate scenarios, like reading from the inverter, and see whether this hinders harvesting, etc.

Threading could be used to a greater extent, but then we lose control over when things are executed. Maybe that is worth it, though.

h0bb3 commented 10 months ago

It is also not easy to know what is really happening inside the server except for looking at the logs (while the test runs) to spot anomalies.

Maybe when we get the modbus logging API up and running we can get more insights.

h0bb3 commented 10 months ago

pytest can be used for benchmarking but I'm not sure this is useful in our case: https://pytest-with-eric.com/pytest-best-practices/pytest-benchmark/#Test-Example-Code-Basic-Benchmark-Ex

h0bb3 commented 8 months ago

Fixed and ran the stress tests again. They are a bit messy to run, but they actually uncovered a potential bug: harvest.backoff time could become greater than max_backoff_time if the elapsed time is large. Fixed this and added a specific unit test for it.
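The kind of fix and unit test described here might look roughly like the sketch below. The real update rule in the gateway is not shown in this thread, so the growth formula, the function name and the numbers are all hypothetical. The only point is that the result is clamped to `max_backoff_time` however large the elapsed time gets.

```python
def next_backoff_time(
    backoff_time: float,
    elapsed_time: float,
    max_backoff_time: float,
    growth: float = 2.0,
) -> float:
    """Return the next backoff time, never exceeding max_backoff_time.

    Hypothetical rule: the backoff grows with how long the last
    operation took. The gateway's actual rule may differ.
    """
    proposed = backoff_time + elapsed_time * growth
    # Clamp so that a very large elapsed_time cannot push us past the cap.
    return min(proposed, max_backoff_time)


def test_backoff_never_exceeds_max() -> None:
    # A huge elapsed time must still respect the cap.
    assert next_backoff_time(1000, 10_000_000, 256_000) == 256_000
    # A small elapsed time grows the backoff without hitting the cap.
    assert next_backoff_time(1000, 100, 256_000) == 1200
```

The test function follows the pytest naming convention, so it would be collected automatically if placed in a test module.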