hoeken / PsychicHttp

Simple + Robust HTTP/S server with websockets for ESP32 based on ESP-IDF http server.
MIT License

Performance comparison (plus Brotli vs Gzip compression) #37

Closed · proddy closed this 3 months ago

proddy commented 11 months ago

This is more of a discussion than a project issue.

One thing I'm planning to play with is using Brotli instead of Gzip over HTTPS. It's apparently faster, and it would be great to use this library's performance tests to benchmark it. I'm hoping it's as simple as adding response.addHeader("Content-Encoding", "br") and compressing the files with zlib.brotliCompressSync() instead of zlib.gzipSync().
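In PsychicHttp terms the idea would look roughly like this (a sketch only: index_html_br[] / index_html_br_len are placeholders for a build step that pre-compresses the asset with Brotli, and the PsychicResponse calls are assumed to behave as in the library's examples):

server.on("/", HTTP_GET, [](PsychicRequest *request) {
  PsychicResponse response(request);
  response.setContentType("text/html");
  // Tell the browser the body is Brotli-compressed; it decodes it itself.
  response.addHeader("Content-Encoding", "br");
  // Bytes were compressed at build time (e.g. with zlib.brotliCompressSync()).
  response.setContent(index_html_br, index_html_br_len);
  return response.send();
});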

I'll report back with some test results.

zekageri commented 11 months ago

Well, my files are already gzipped by webpack, but it is interesting for sure!

proddy commented 11 months ago

I have a huge amount of web pages (220KB), all stored in flash memory, with 60+ endpoints. That's mainly because my app is translated to 8 languages. Brotli's compression is better, so anything that saves precious ESP32 memory is wonderful.

zekageri commented 11 months ago

But do you compress it on the fly?

I also have a ridiculous number of endpoints and my pages are also huge. I did not store separate pages per language; I chose a different approach. I have language JSON files for the different pages, stored in a folder per page. This lets me store a single page plus multiple language files, which I believe is a lot more resource-friendly. I can also edit any language file without modifying the HTML.

zekageri commented 11 months ago

Btw my app is translated to 4 languages.

proddy commented 11 months ago

I use dynamic translations on the web side to 10 languages using typesafe-i18n. The backend code is also translated using a custom library, all in real-time. The web code is all reactJS/typescript with yarn/vite bundling. You can see a live demo at https://ems-esp.derbyshire.nl, albeit an older version.

I've been using AsyncTCP and AsyncWebServer for years and always wanted to move away from Arduino and go to native IDF. Using PsychicHttp is the first step, and I'm super excited. The MQTT part has already been migrated to espMqttClient.

zekageri commented 11 months ago

Wow. So I assume it is an SPA and that's why it is so fast. Does this demo run on an ESP?

proddy commented 11 months ago

not on an ESP! I'm cheating and the demo is hosted on Cloudflare...using CF Pages for the web and CF Workers for the API backend which mimics the data from an ESP32. That's why it's so quick. Cloudflare is excellent, and free too.

If you're interested, the code is at https://github.com/emsesp/EMS-ESP32. The PsychicHttp port is in the https_36 branch, which I'll upload in the next few days. Almost finished the port.

proddy commented 11 months ago

@hoeken I'd like to run some benchmarking too. Do you have an automated way I can steal that converts all the log data from the autocannon scripts into a single /benchmark/comparison.ods file?

hoeken commented 11 months ago

No I don't unfortunately. I did the last ones manually. A script for that would be great. Writing to a .csv would be enough.


proddy commented 11 months ago

I'm doing some A/B testing with two identical ESP32s, one running the same code with AsyncWebServer and the other with PsychicHttp. Then I used autocannon-ui (which has autocannon and a compare function integrated) to benchmark the two.

With autocannon I used a single connection, 1 worker/pipeline, and a 10-second duration (the defaults), against a single URI endpoint /api/system/command which returns a 260-byte JSON object.

[screenshot]

Results:

1 = AsyncWebServer

2 = PsychicHttp

[screenshot]

Initial observations:

[screenshot]

there is no connection close with PsychicHttp.

I'll keep digging, but any ideas/thoughts welcome.....

hoeken commented 11 months ago

@proddy great to see some more benchmarking. I have done pretty much zero work on optimization, so hopefully we have plenty of room to speed things up. That being said, I don't really know the toolchains, etc for doing code profiling so I would love some help. I would gladly accept PRs as well :)

One thing I noticed is that the Async test basically just makes one request and quits. The HTTP headers plus the 260-byte response match up with the total of 387 bytes transferred. That is probably skewing things a bit. Is there a setting in autocannon to have it reconnect?

As for potential optimizations, it might be the URL matching? You could set server.config.uri_match_fn = NULL, which would switch back to basic strcmp() URL matching. It's set to wildcard matching right now.
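For reference, a minimal sketch of that change (an assumption: the config tweak has to happen before listen() so the underlying ESP-IDF server starts with it; the port and route here are arbitrary):

#include <PsychicHttp.h>

PsychicHttpServer server;

void setup() {
  // NULL uri_match_fn = plain strcmp()-style matching instead of wildcards,
  // so "/api/*"-style patterns would no longer match.
  server.config.uri_match_fn = NULL;
  server.listen(80);

  server.on("/api", HTTP_GET, [](PsychicRequest *request) {
    return request->reply("exact match only");
  });
}

void loop() {}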

There's also probably room for optimization around the endpoint / client / handler "arrays". Right now it's using std::list as it made things very easy to implement, but it seems it's not the fastest option out there.

proddy commented 11 months ago

Thanks @hoeken for the response (no pun intended!). I know these are busy days.

I played with uri_match_fn, and with forcefully closing each connection as AsyncWS does, but it makes no difference. The response is still twice as slow, also for static HTML. The handler list code is fine and fast enough. I've done a lot of work with queues and home-built linked lists and always go back to std::list as it's just very solid and quick.

To rule out any quirks in my code I'll use your examples and run some benchmarking to see if the results are similar.

Also curious if others are seeing anything similar?

hoeken commented 11 months ago

Puns are definitely allowed, especially since the project name is a play on ESP. :)

If you didn't notice, there is code in the /benchmark directory for both psychichttp and espasync (and arduinomongoose). If you want to dig into the benchmark stuff or test your own code feel free to add to that, just try to keep code identical between the different sketches.

hoeken commented 11 months ago

I have this code in the WiFi setup in the benchmarks; maybe see if that has any effect?

  WiFi.setSleep(false);         // disable WiFi modem sleep, reducing latency
  WiFi.useStaticBuffers(true);  // use statically allocated WiFi buffers

proddy commented 11 months ago

LOL, I had to google that. https://en.wikipedia.org/wiki/Psychic

I have seen your benchmark code for the various libs and will extend that and run some more stress tests. I really want PsychicHttp to knock AsyncWebServer and others out of the park.

zekageri commented 11 months ago

I am more of an observer right now because of the holidays, but my webserver definitely feels slower and sometimes I don't get all the files loaded on the front-end side. I played with the config object without success.

Chris--A commented 11 months ago

With regard to PsychicJsonResponse, the latency might be attributed to the pre-calculation of the output size: https://github.com/hoeken/PsychicHttp/blob/26b49f73dca48fa333361fb4a4cfeebca802eeda/src/PsychicJson.cpp#L39

This is to determine whether the output can be sent in a single go or needs to be chunked. It might be worth removing the length check and testing with a chunked send only. If the output ends up being smaller than the internal buffer, there is only one network send done anyway.
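Roughly, the pattern being described is this (a paraphrase, not the library's exact code; bufferSize stands in for the internal chunk buffer size):

#include <ArduinoJson.h>

// measureJson() makes an extra serialization pass purely to learn the output
// size; that size is then used to choose a single send or a chunked send.
bool fitsInSingleSend(const JsonDocument &doc, size_t bufferSize) {
  size_t length = measureJson(doc);  // first pass: count bytes only
  return length < bufferSize;        // small enough for one write, else chunk
}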

I'm testing other changes that may improve the PsychicStreamResponse I submitted a PR for (#45). The changes could potentially speed up PsychicJsonResponse also.

Additionally, the stream response will allow you to do a chunked-only send, with no length check, and a static JSON buffer:

server.on("/api/*", [](PsychicRequest *request) {

  StaticJsonDocument<512> doc;
  doc["success"] = true;

  if(request->url().endsWith("system")){
    doc["FreeHeap"] = ESP.getFreeHeap();
    doc["MinFreeHeap"] = ESP.getMinFreeHeap();
    doc["MaxAllocHeap"] = ESP.getMaxAllocHeap();
    doc["HeapSize"] = ESP.getHeapSize();
  }else{
    doc["success"] = false;
  }

  PsychicStreamResponse response(request, "application/json");
  response.beginSend();
  serializeJson(doc, response);
  return response.endSend();
});
proddy commented 11 months ago

Thanks @Chris--A - I'll look into optimizing Json using your PR.

I used the benchmark code to performance-test PsychicHttp against my heavily tweaked versions of AsyncWebServer and AsyncTCP, and both the test JsonResponse call to /api?foo=bar and the static alien.png are 50-70% slower with PsychicHttp. Maybe I'm not comparing apples with apples here. I need to dig deeper into the IDF code.

--edit--

Forcing chunking didn't make a difference. I'm starting to think the performance hit is not in PsychicHttp but in the async TCP / lwIP stuff.