betrusted-io / xous-core

The Xous microkernel
Apache License 2.0

wf200 fails under vegeta stress testing #155

Closed: bunnie closed this 2 years ago

bunnie commented 2 years ago

Use Vegeta:

echo "GET http://192.168.1.157/" | vegeta attack -duration=1m | tee results.bin | vegeta report with net server running

Requests      [total, rate, throughput]         3000, 50.02, 0.00
Duration      [total, attack, wait]             1m30s, 59.979s, 30s
Latencies     [min, mean, 50, 90, 95, 99, max]  19.946ms, 14.599s, 782.706ms, 30.002s, 30.002s, 30.006s, 30.017s
Bytes In      [total, mean]                     0, 0.00
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           0.00%
Status Codes  [code:count]                      0:3000  
Error Set:
Get "http://192.168.1.157/": dial tcp 0.0.0.0:0->192.168.1.157:80: connect: connection refused
Get "http://192.168.1.157/": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

This should result in a WfxErr from the WF200, plus an EC reboot.
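As a sketch (not the exact procedure used here), the load could also be swept up in steps rather than going straight to 50 req/s, which makes it easier to see where the stack starts refusing connections versus timing out. The rate, duration, and timeout flags below are standard vegeta options; the loop and filenames are just illustrative:

# sweep the request rate upward to see where connection refusals start
for rate in 5 10 25 50; do
  echo "GET http://192.168.1.157/" | vegeta attack -rate=$rate -duration=30s -timeout=10s | tee results-$rate.bin | vegeta report
done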

bunnie commented 2 years ago

May want to try running this against a WF200 dev board attached to a raspi, to see if the WF200 itself can even hold up against this.
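One minimal way to set that up, assuming the raspi can serve a trivial page over the wf200's interface (the port and address below are placeholders, not values from this issue):

# on the raspi, serve something trivial
python3 -m http.server 8000

# from another machine, point the same kind of attack at the dev board
echo "GET http://<wf200-dev-board-ip>:8000/" | vegeta attack -duration=1m | tee wf200-dev-results.bin | vegeta report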

bunnie commented 2 years ago

Well, it holds up for a couple of seconds at least now:

Requests      [total, rate, throughput]         100, 50.52, 3.23
Duration      [total, attack, wait]             12.085s, 1.979s, 10.106s
Latencies     [min, mean, 50, 90, 95, 99, max]  128.927ms, 6.069s, 5.505s, 9.612s, 10.17s, 10.729s, 10.776s
Bytes In      [total, mean]                     2028, 20.28
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           39.00%
Status Codes  [code:count]                      0:61  200:39
Error Set:
Get "http://10.0.245.184": dial tcp 0.0.0.0:0->10.0.245.184:80: connect: connection refused

Longer runs actually take out the wifi router, and I have to reboot it, so there are multiple weak spots in this chain.

More importantly, the log messages seem to show that, by and large, the backoff and failsafe mechanisms are working. Eventually things topple over because enough server queues get backed up that something breaks, but you have to get a run to hammer it stably for about a minute before that happens. I think it's fine?
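If it's worth digging further, the capture from the first run can be sliced into a latency histogram and a plot to separate slow-but-served requests from outright refusals. These are stock vegeta report types; the bucket boundaries are just a guess at useful ranges:

# latency buckets and a browsable plot from the existing capture
vegeta report -type='hist[0,100ms,500ms,1s,5s,10s,30s]' results.bin
vegeta plot results.bin > plot.html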