nkolban / esp32-snippets

Sample ESP32 snippets and code fragments
https://leanpub.com/kolban-ESP32
Apache License 2.0

HttpServer very slow compared to Mongoose? #286

Open squonk11 opened 6 years ago

squonk11 commented 6 years ago

I implemented the same web server functionality twice, once using HttpServer and once using Mongoose. With this web server I read some data from an industrial device (a frequency inverter) via a RESTful API (fetch). Mongoose seems to be much faster than HttpServer: the average access time for one piece of data is approx. 5 - 20 ms for Mongoose and approx. 50 - 200 ms for HttpServer. I cannot believe that HttpServer is in general so slow compared to Mongoose. I imagine this is related to something else, but I have no idea what. The basic setup is the same in both cases: same device, same web page, same drivers, and so on. Do you have any hints on how to accelerate HttpServer? I am really interested in using HttpServer rather than Mongoose. Kind regards

nkolban commented 6 years ago

My first question is whether or not you have debug trace enabled? When debug trace is enabled, there is a lot of logging and logging is slow and very serial.

squonk11 commented 6 years ago

In both cases I am compiling with debugging enabled (Optimization Level (Debug(-Og))). But I have logging disabled in software using: esp_log_level_set("*", ESP_LOG_NONE); There are no messages on the serial console during the measurement. I was wondering whether there might be a problem with the granularity of the time slices, e.g. in FreeRTOS, but the FreeRTOS tick rate is 1000 Hz in both cases. Does HttpServer use some timer or delay which could be reduced?

nkolban commented 6 years ago

I haven't actually looked at tuning the code at all ... my first goal was to get it working. For logging, there are two stories ... the log level selected at runtime and the maximum log level defined at compilation time. In the ESP32 make menuconfig settings we can define a maximum logging level. What this means is that when a logging statement is encountered at compilation time, the build can choose to include it or exclude it. If included, then at runtime the log will still be executed but may then be "ignored" by the run-time logging level. This still has cost.
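The two levels described above can be illustrated with a small host-side sketch. It is not the ESP-IDF implementation: `LOG_AT`, `MAX_COMPILED_LEVEL`, `logWrite` and `describeRequest` are hypothetical names made up for this example. It only shows why a statement that is compiled in but filtered at run time still costs something: its arguments are evaluated and the run-time check is executed.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical names for illustration only; this is not the ESP-IDF logging API.
enum Level { LEVEL_NONE = 0, LEVEL_ERROR = 1, LEVEL_INFO = 3, LEVEL_DEBUG = 4 };

// Compile-time ceiling: statements above this level are not compiled in at all.
#ifndef MAX_COMPILED_LEVEL
#define MAX_COMPILED_LEVEL LEVEL_DEBUG
#endif

static Level g_runtimeLevel = LEVEL_NONE;  // run-time filter, can change at any time
static int g_runtimeChecks = 0;            // how often the run-time filter was consulted
static int g_linesPrinted = 0;             // how many lines were actually written

void logWrite(Level level, const std::string& msg) {
    assert(level != LEVEL_NONE);           // a message needs a real level
    ++g_runtimeChecks;                     // this cost is paid even if nothing is printed
    if (level > g_runtimeLevel) return;    // "ignored" by the run-time level
    std::printf("%s\n", msg.c_str());
    ++g_linesPrinted;
}

// The level comparison is a constant expression, so the compiler can drop the
// whole statement (including argument evaluation) when it is false.
#define LOG_AT(level, msg) \
    do { if ((level) <= MAX_COMPILED_LEVEL) logWrite((level), (msg)); } while (0)

static int g_expensiveCalls = 0;

// Stands in for any non-trivial work done just to build a log message.
std::string describeRequest() {
    ++g_expensiveCalls;
    return "POST /api/v1/pr";
}

void handleOneRequest() {
    LOG_AT(LEVEL_DEBUG, describeRequest());
}
```

With `MAX_COMPILED_LEVEL` at debug and the run-time level at none, one call to `handleOneRequest()` prints nothing but still builds the message and performs the run-time check. Compiling with `-DMAX_COMPILED_LEVEL=LEVEL_ERROR` lets the compiler remove the statement entirely.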

Beyond that, there are some significant architectural differences. For example, in the current HTTP Server, we have an internal task that blocks waiting on the server socket. When a new connection arrives, that will then be handled. However the internal task uses FreeRTOS tasks with a time slice of 10msecs ... so it could be as long as 10msecs before an incoming connection is noticed.

Ideally what we want to do is perform measurements on the library and see where time is being spent. This is called profiling. Unfortunately, it may be quite some time before this makes it to the top of my to-do list. I'll be delighted to review profiling traces and work with anyone who has specific changes that need made.

squonk11 commented 6 years ago

I configured the FreeRTOS Tick rate to 1000Hz. So I think the time slice should be 1ms in my cases. Also the log level in menuconfig is set to "Debug" in both cases.
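For reference, the tick period follows directly from the tick rate. FreeRTOS defines `portTICK_PERIOD_MS` as 1000 divided by `configTICK_RATE_HZ`; the helper below (hypothetical name `tickPeriodMs`) does the same arithmetic on the host. A rate of 100 Hz, which I believe is the ESP-IDF default, would give the 10 ms figure mentioned earlier, while 1000 Hz gives 1 ms.

```cpp
#include <cassert>
#include <cstdint>

// Length of one FreeRTOS tick in milliseconds for a given tick rate.
// Mirrors the arithmetic behind portTICK_PERIOD_MS (integer division).
inline std::uint32_t tickPeriodMs(std::uint32_t tickRateHz) {
    assert(tickRateHz > 0 && tickRateHz <= 1000);  // outside this range the result is undefined or 0
    return 1000u / tickRateHz;
}
```

`tickPeriodMs(100)` yields 10 and `tickPeriodMs(1000)` yields 1.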

squonk11 commented 6 years ago

Now I tried to do some profiling using the logging function. In order to have more realistic data I increased the baudrate to 921600Bd. Using this I see the log output:

D (20331) HttpServerTask: Waiting for new peer client
D (20331) Socket: >> accept: Accepting on 0.0.0.0 [8000]; sockFd: 4096, using SSL: 0
D (20334) Socket: - accept: Received new client!: sockFd: 4099
D (20334) Socket: << accept: sockFd: 4099
D (20335) HttpServerTask: HttpServer listening on port 8000 received a new client connection; sockFd=4099
D (20336) HttpParser: >> parse: socket: fd: 4099
D (20340) HttpParser: >> parseRequestLine: "POST /api/v1/pr HTTP/1.1" [24]
D (20341) HttpParser: << parseRequestLine: method: POST, url: /api/v1/pr, version: HTTP/1.1
D (20367) HttpParser: << parse: Size of body: 23
D (20368) HttpRequest: Method: POST, URL: "/api/v1/pr", Version: HTTP/1.1
D (20369) HttpRequest: name="accept", value="application/json, text/javascript, application/xml, text/plain, text/html, ."
D (20370) HttpRequest: name="accept-encoding", value="gzip, deflate"
D (20370) HttpRequest: name="accept-language", value="de,en-US;q=0.7,en;q=0.3"
D (20371) HttpRequest: name="connection", value="keep-alive"
D (20372) HttpRequest: name="content-length", value="23"
D (20373) HttpRequest: name="content-type", value="application/x-www-form-urlencoded; charset=utf-8"
D (20374) HttpRequest: name="host", value="192.168.1.8:8000"
D (20374) HttpRequest: name="origin", value="null"
D (20375) HttpRequest: name="user-agent", value="Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:57.0) Gecko/20100101 Firefox/57.0"
D (20377) HttpRequest: Body: "p=991&ds=0&ba=1&type=xx"
D (20377) HttpServerTask: >> processRequest: Method: POST, Path: /api/v1/pr
D (20378) PathHandler: plain matching: /api/v1/pr with /api/v1/pr
D (20379) HttpServerTask: Found a path handler match!!
D (20380) paraReadHandler: Request-Path: /api/v1/pr
D (20380) HttpRequest: >> parseForm
D (20381) HttpRequest: Processing: p=991
D (20381) HttpRequest: p = "991"
D (20382) HttpRequest: Processing: ds=0
D (20382) HttpRequest: ds = "0"
D (20382) HttpRequest: Processing: ba=1
D (20383) HttpRequest: ba = "1"
D (20383) HttpRequest: Processing: type=xx
D (20384) HttpRequest: type = "xx"
D (20384) HttpRequest: << parseForm
D (20385) paraReadHandler: count=4;P=991; DS=0; BA=1
D (20391) paraReadHandler: Read result: 0:022000000011FF81FF81E39000850400000000090.100.40;;;
D (20391) HttpResponse: >> sendData
D (20392) Socket: send: Binary of length: 76
D (20392) Socket: send: Raw binary of length: 76
D (20395) Socket: send: Binary of length: 64
D (20396) Socket: send: Raw binary of length: 64
D (20397) HttpResponse: << sendData
D (20397) Socket: close: m_sock=4099, ssl: 0
D (20397) Socket: Calling lwip_close on 4099
D (20399) Socket: close: m_sock=-1, ssl: 0
D (20400) HttpServerTask: Waiting for new peer client

The total duration of this cycle is approx. 70ms. From this I see that there are mainly three code sections which take longer:

  1. HttpParser::parse() takes approx. 31ms
  2. paraReadHandler() takes approx. 6ms
  3. HttpParser::dump() takes approx. 9ms

The remaining code sections mostly take just 1 ms, which is probably just the time for logging. In my opinion this tells me two important things: a) the parser could probably use some optimization; b) since my handler function (paraReadHandler()) alone takes about as long as the whole cycle with the Mongoose library (5 - 10 ms), I think the speed of Mongoose can only be achieved with some parallelism in the code, e.g. using one µC core for processing the HTTP protocol and one µC core for the handler. What is your opinion? Do you see any chance for optimization?
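The durations listed above were read off the log timestamps by hand. The same arithmetic can be sketched in a few lines of host-side C++, assuming every entry has the shape `D (time) Tag: message` seen in the log. `LogEntry`, `parseLogLine` and `elapsedBetween` are hypothetical helpers written for this note, not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

struct LogEntry {
    char level;           // 'D', 'I', 'E', ...
    long timeMs;          // timestamp in milliseconds since boot
    std::string tag;      // e.g. "HttpParser"
    std::string message;  // everything after "Tag: "
};

// Parses one line such as: D (20336) HttpParser: >> parse: socket: fd: 4099
// Returns false if the line does not have that shape.
bool parseLogLine(const std::string& line, LogEntry& out) {
    char level = 0;
    long timeMs = 0;
    char tag[64] = {0};
    int consumed = 0;  // set by %n only if the ':' after the tag was matched
    int fields = std::sscanf(line.c_str(), " %c (%ld) %63[^:]:%n",
                             &level, &timeMs, tag, &consumed);
    if (fields != 3 || consumed == 0) return false;
    assert(static_cast<std::size_t>(consumed) <= line.size());
    std::size_t msgStart = line.find_first_not_of(' ', consumed);
    out.level = level;
    out.timeMs = timeMs;
    out.tag = tag;
    out.message = (msgStart == std::string::npos) ? std::string() : line.substr(msgStart);
    return true;
}

static bool startsWith(const std::string& text, const std::string& prefix) {
    return text.compare(0, prefix.size(), prefix) == 0;
}

// Milliseconds between the first entry of `tag` whose message starts with
// `startPrefix` and the next entry of `tag` starting with `endPrefix`.
// Returns -1 if no such pair exists.
long elapsedBetween(const std::vector<LogEntry>& log, const std::string& tag,
                    const std::string& startPrefix, const std::string& endPrefix) {
    long start = -1;
    for (const LogEntry& e : log) {
        if (e.tag != tag) continue;
        if (start < 0 && startsWith(e.message, startPrefix)) {
            start = e.timeMs;
        } else if (start >= 0 && startsWith(e.message, endPrefix)) {
            return e.timeMs - start;
        }
    }
    return -1;
}
```

Feeding it the `HttpParser` lines from the log and asking for the span between `>> parse:` and `<< parse:` gives 20367 - 20336 = 31 ms, matching the first item in the list. Note that the timestamps include the time spent writing the log itself.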

Best regards

squonk11 commented 6 years ago

Analysing the HttpServer code I see that one reason for the slower speed compared to Mongoose is related to the HttpServer run() task. Here a new connection is only accepted if the previous request is handled and closed. I think that Mongoose continuously accepts new requests, puts them in a queue and another task (maybe on another core) processes the request. Would it be difficult to implement this behaviour also in your HttpServer class?

nkolban commented 6 years ago

Looking at the design now. I'll be delighted to work with you on improving this. Let's discuss our options. Right now the HTTP server is indeed single threaded. What this means is that when a request comes in, it has to be completed before the next one is processed. I'm not sure that Mongoose does much differently. When a request arrives, it signals an event and their event processing loop handles the work.

What we could do in our solution is spin up a new FreeRTOS task, but my fear is that doing so would also be very expensive.

squonk11 commented 6 years ago

I had one issue in my menuconfig settings: the clock frequency of the serial flash was 40MHz only. After setting it to 80MHz the execution times for previously listed code sections are now:

  1. HttpParser::parse() takes approx. 20ms
  2. paraReadHandler() takes approx. 4ms
  3. HttpParser::dump() takes approx. 6ms

But even after that, Mongoose is much faster than HttpServer. I am testing this with a benchmark that reads 700 parameters via REST from my device. Using Mongoose this takes approx. 5 seconds; using HttpServer it takes approx. 35 seconds. I also observe that Firefox pushes the REST requests much faster to the Mongoose web server than to HttpServer. From this I conclude that Mongoose continuously accepts new requests and calculates the responses "in the background". So I think the solution needs to look like this: the run() task of HttpServer should just queue the incoming request (maybe after some preprocessing), and a background task (maybe on the second core of the ESP32) should take requests from the queue and calculate the responses. The difficult part could be finding a proper load balance between the cores. Sure, I would like to help you with this, but unfortunately my free time for it is very limited. But if you also have time, why not do it together?
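The proposed split between an accepting task and a background worker can be sketched on the host with standard C++ primitives. This is only an illustration of the queueing idea, not code from HttpServer or Mongoose: `PendingRequest`, `RequestQueue` and `workerLoop` are hypothetical names, and the queue is bounded on the assumption that memory on the ESP32 is tight.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <string>
#include <utility>

// Stand-in for an accepted connection plus its parsed request (hypothetical type).
struct PendingRequest {
    int sockFd;
    std::string path;
};

class RequestQueue {
public:
    explicit RequestQueue(std::size_t capacity) : m_capacity(capacity) {
        assert(capacity > 0);
    }

    // Called by the accepting task. Returns false when the queue is full or
    // closed, so the caller can reject the connection instead of blocking.
    bool tryPush(PendingRequest req) {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_closed || m_items.size() >= m_capacity) return false;
        m_items.push_back(std::move(req));
        m_notEmpty.notify_one();
        return true;
    }

    // Called by the worker. Blocks until a request is available.
    // Returns false once the queue has been closed and drained.
    bool pop(PendingRequest& out) {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_notEmpty.wait(lock, [this] { return m_closed || !m_items.empty(); });
        if (m_items.empty()) return false;
        out = std::move(m_items.front());
        m_items.pop_front();
        return true;
    }

    // Stops accepting new requests and wakes any waiting worker.
    void close() {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_closed = true;
        m_notEmpty.notify_all();
    }

private:
    std::size_t m_capacity;
    bool m_closed = false;
    std::deque<PendingRequest> m_items;
    std::mutex m_mutex;
    std::condition_variable m_notEmpty;
};

// Body of the background task: handle queued requests until the queue closes.
// Returns the number of requests handled.
std::size_t workerLoop(RequestQueue& queue,
                       const std::function<void(const PendingRequest&)>& handler) {
    std::size_t handled = 0;
    PendingRequest req{-1, ""};
    while (queue.pop(req)) {
        handler(req);
        ++handled;
    }
    return handled;
}
```

On the host, `workerLoop` would typically run in its own `std::thread` while the accept loop calls `tryPush`. On the ESP32 the worker would more likely be a FreeRTOS task, for example one created with `xTaskCreatePinnedToCore` on the second core, and a FreeRTOS queue could take the place of `RequestQueue`. Whether this closes the gap to Mongoose would still need to be measured.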

squonk11 commented 6 years ago

I would like to be a bit more precise about the above performance measurements:

  1. Using Mongoose: time from first request to last request: 5.809 s; time from first request until last response from Mongoose: 5.842 s
  2. Using HttpServer: time from first request to last request: 28.692 s; time from first request until last response from HttpServer: 28.907 s

With HttpServer there is (almost) always one telegram arriving at the ESP32 and then one telegram leaving the ESP32. With Mongoose there are mostly bunches of telegrams arriving and then leaving. So I think this could be an indication that Mongoose collects requests and processes them in parallel in a separate task. These measurements were done using Wireshark.

squonk11 commented 6 years ago

In order to increase the speed of HttpServer I tried to analyze your parser code (since this part already takes 20 ms). I see that you are reading from the socket byte by byte. Isn't there a faster way, e.g. reading bigger chunks of data? Wouldn't it be easier to read all received data into a single std::string and analyze it there? Unfortunately I am skilled in neither TCP/IP nor HTTP, so I don't know whether it is possible to read the full request in one shot, or whether the request is just part of a data stream with hypothetically no end.
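On the question of where a request ends: in HTTP/1.1 the header block is terminated by an empty line (`\r\n\r\n`), and for a request like the POST in the log the body length is given by the `Content-Length` header. That allows reading in chunks rather than byte by byte. The sketch below is a host-side illustration under simplifying assumptions: it ignores chunked transfer encoding and pipelined requests, and `ReadFn`, `RawRequest`, `contentLength` and `readRequest` are hypothetical names, not the HttpParser API. `ReadFn` stands in for a socket read such as `recv()`.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <string>

// Reads up to `len` bytes into `buf` and returns the number of bytes read,
// or 0 when the peer has closed the connection.
using ReadFn = std::function<std::size_t(char* buf, std::size_t len)>;

struct RawRequest {
    std::string head;  // request line and headers, without the final blank line
    std::string body;
    bool complete;     // false if the connection closed before the request ended
};

// Extracts the Content-Length value from the header block; 0 if absent.
std::size_t contentLength(const std::string& head) {
    std::string lower(head);
    std::transform(lower.begin(), lower.end(), lower.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    const std::string key = "\r\ncontent-length:";
    std::size_t pos = lower.find(key);
    if (pos == std::string::npos) return 0;
    return std::strtoul(lower.c_str() + pos + key.size(), nullptr, 10);
}

RawRequest readRequest(const ReadFn& readSome) {
    RawRequest req{"", "", false};
    std::string data;
    char chunk[512];
    std::size_t headEnd = std::string::npos;

    // Phase 1: read chunks until the blank line that ends the headers.
    while (headEnd == std::string::npos) {
        std::size_t n = readSome(chunk, sizeof chunk);
        assert(n <= sizeof chunk);
        if (n == 0) return req;  // closed before the headers were complete
        // The terminator may straddle two chunks, so back up three bytes.
        std::size_t from = data.size() > 3 ? data.size() - 3 : 0;
        data.append(chunk, n);
        headEnd = data.find("\r\n\r\n", from);
    }
    req.head = data.substr(0, headEnd);
    req.body = data.substr(headEnd + 4);

    // Phase 2: read the rest of the body, as announced by Content-Length.
    std::size_t want = contentLength(req.head);
    while (req.body.size() < want) {
        std::size_t n = readSome(chunk, std::min(sizeof chunk, want - req.body.size()));
        assert(n <= sizeof chunk);
        if (n == 0) return req;  // closed before the body was complete
        req.body.append(chunk, n);
    }
    req.complete = true;
    return req;
}
```

For a request as small as the one in the log, a single read call can return the whole request, where a byte-by-byte loop would make one call per byte.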

snahmad commented 6 years ago

Has this HttpServer slowness been sorted out? Have you used the Mongoose WebServer?

squonk11 commented 6 years ago

I tested mongoose. It is really much faster than HttpServer. I think this is because they work with a very low level API of lwip. But unfortunately, mongoose is not free for commercial use.

snahmad commented 6 years ago

oh ok. mongoose is not free for commercial use

snahmad commented 6 years ago

Have you used the Mongoose C++ wrapper class WebServer? Have you used it with SSL?

squonk11 commented 6 years ago

No, I downloaded the sources from the Mongoose website and played a little bit with them. I did not test with SSL.

snahmad commented 6 years ago

OK, as Mongoose is not free for commercial use, the only option is to use HttpServer. But HttpServer is not mature enough to be used with SSL; it works without SSL. Now we have the FTPServer class as well, also without SSL. I will try to use it to upload web files.

I am surprised that these low-level classes are not mature enough to be used commercially. I am very disappointed with the SDK and the C++ support. I thought the ESP32 was a very popular platform for IoT development, with a working web server. I also find that serving web files from the SD card is slow and sometimes fails for big files. I am using Angular CLI, and the files take a few MB in total. I am using a new ESP32 board with 4 MB of external RAM. I enabled it, but that caused the Mongoose web server to crash. I will test with the HttpServer class.

I will try mounting a FAT FS on the external RAM and serving web files from HttpServer.

We are thinking raspberry pi with embedded linux as well for IOT dev. It should be more mature.

chegewara commented 6 years ago

This is a private project, made for free in private time. If you don't like @nkolban's work, feel free to look somewhere else or change platform. You can also fork it and change the code to fit your needs. To me this code is more mature than you and your posts here.

nkolban commented 6 years ago

@snahmad I think it is safe to say that for many of us, tinkering with ESP32 is a passion and hobby. The source of these projects on this repository are free for all to use and represent many hours of donated time by many good folks. In 2016 and 2017, I had a world of free time to work on these and tried to keep up with any and all questions that came up. Mr @chegewara and others have done the same and it is a community effort. As folks find problems, changes are made and merged quickly. Everyone and anyone is free to make a change and encouraged to commit those back for the benefit of all.

At the start of 2017, a 3rd party contacted me and asked if I would assist them with a major ESP32 project. Sadly I only have so many hours in the day (beyond my 9-5 career). They offered to compensate me for my time and I gladly accepted. What that means is that I haven't recently spent nearly as much time on this repository as I had in the past or plan to in the future.

Now ... this all said ... there are a number of folks here that have excellent skills on these projects and would be delighted to assist in polishing them up and making them more robust, better or even customized. However, if you are looking for extra value and can't afford the time to understand the source code and make changes yourself (which is not a dig on you in any way ... that is a normal course of events) then consider outsourcing any needs you may have with financial compensation for time. This would give you dedicated expertise, a higher probability of project success and allow you to focus on your core project by leaving the specialty work to those who have skills in this area.

As an example, I own a house. My kitchen has worn away in the last 20 years. I need a new kitchen. My kitchen is usable as it is today but it isn't "great" and I want more. I have two choices ... I can invest time and study on kitchen remodeling and do it myself. However that would be a lot of time I would spend for a one-off project ... also I doubt I could do it anywhere nearly as well as someone who has done it many times before. The plus is that it is free and the materials for doing it are also free (There are many videos on doing such). The alternative, is that I pay a contractor to come in, listen to my vision of the finished kitchen and he will plan, design, and implement my new kitchen for me. However ... he won't do this for free.

snahmad commented 6 years ago

Hi Kolban,

I am not blaming anyone. Sorry if my comments hurt someone. I do appreciate the effort you people have made over the years to provide nice C++ classes which work in most use cases. I made some improvements to HttpServer, which I posted as a zip file in a separate thread and which partially fixed SSL. I was only surprised that not many people are using the SSL feature and testing these classes with SSL enabled.

OK, I will discuss with my manager whether you could assist in polishing some code. I understand it would not be for free.

I just need to discuss your experience with you people. I am new to the ESP32, only a few months in. I am now starting to understand the code base. There are a few other options I need to explore as well, based on our project requirements.

It would be better to discuss this with one of you offline over Skype, if you are available for 30 minutes. I can elaborate on my project requirements and you can suggest the best platform for it.

Thanks, Naeem

chegewara commented 6 years ago

https://github.com/espressif/esp-idf/tree/master/examples/protocols/openssl_server

chegewara commented 6 years ago

I'm guessing HttpServer does not work with SSL because it's not possible in the form it is in now. To work with SSL you need SSL certificates, and that library does not use any. Ergo, it won't work.

snahmad commented 6 years ago

Sure, the openssl_server example works; I tested it. I was trying to use the HttpServer class. I guess we could write a new secured HttpServer as a C++ class based on the openssl_server code, one which serves web pages read from files and handles POST method calls. I will try it myself.

chegewara commented 6 years ago

I don't know what kind of project you are trying to build, but I'm guessing that the ESP32 will be connected to the internet through some sort of router. I have a home server based on a Synology NAS, and this server can act as a reverse proxy. It also lets me proxy HTTPS requests to an HTTP server. All security with certificates is handled by my home server, so the ESP32 does not need to run a secure server, because all security is handled by faster and stronger hardware. It's just a thought on how to simplify this.

nkolban commented 6 years ago

I'm not sure the capability isn't already present ... looking at the constructor I find the following:

https://github.com/nkolban/esp32-snippets/blob/master/cpp_utils/HttpServer.h#L86

which seems to offer up:

void start(uint16_t portNumber, bool useSSL=false);

Checking further, it appears that the function appears implemented but VERY poorly documented. It seems that we have to leverage SSLUtils class to tell the environment about our local certificates before starting the HTTP Server.

snahmad commented 6 years ago

I tried out SSL with HttpServer, found some issues and fixed one of them. See this thread for the updated Http classes: https://github.com/nkolban/esp32-snippets/issues/527

It is still not serving web pages over an https link. I did not investigate further. You can try it at your end.

I may not need HTTP with SSL in my first version. Our device may sit behind a firewall with a private IP address.

I need to look at other issues, such as why the SD card is slow to serve web pages, external RAM usage, etc.

chegewara commented 6 years ago

Yes, i see it now. Certificates are used here: https://github.com/nkolban/esp32-snippets/blob/master/cpp_utils/Socket.cpp#L531-L532

snahmad commented 6 years ago

Can you try out the HTTPS server and let me know if it works for you? You can use any web browser to test it easily. Use any self-signed certificate to start HttpServer. I would appreciate it.

squonk11 commented 6 years ago

@snahmad: you say that the SD card is very slow. Are you using one or four data lines?

snahmad commented 6 years ago

not sure. I have attached my SD card code FATFS_SD_CARD.zip

squonk11 commented 6 years ago

It seems as if you are using the SPI peripheral - this is using 1-Bit mode only. I also started in this way but I also had the impression that the SD card is slow. So I changed to using the SDMMC peripheral which uses 4-Bit mode. This seems to be faster. But I did not do any measurements for a real comparison. But I also remember that it was not so easy to activate the SDMMC mode: I had to activate another way to do the SW download and burn one of the internal fuses due to a parallel usage of GPIO12. You can read about this here: https://github.com/espressif/esp-idf/tree/master/examples/storage/sd_card I also remember that there was the possibility to select the clock speed for the SD card interface - maybe in menuconfig?

snahmad commented 6 years ago

ok. I will try to use SDMMC peripheral. Thanks for your input.

snahmad commented 6 years ago

Have you successfully used an SD card with the Mongoose web server to serve web files? Have you used external RAM?

squonk11 commented 6 years ago

yes, in my simple tests with mongoose I used the SD card to serve files from there and it worked. Until now I did not use external RAM because my ESP32 does not have it. I am currently thinking about buying an ESP32 with external RAM because SSL takes a lot of memory (>32kB per connection). So, in my program I can serve maximum 3 or 4 connections simultaneously. I hope that the additional external memory (4MB) will solve that issue.