EspHttpServer: function in ws_handler gets called on any other http request

esp-rs / esp-idf-svc

Type-Safe Rust Wrappers for various ESP-IDF services (WiFi, Network, Httpd, Logging, etc.)

https://docs.esp-rs.org/esp-idf-svc/

Apache License 2.0

325 stars 179 forks source link

EspHttpServer: function in ws_handler gets called on any other http request #348

Open IvanSanchez opened 9 months ago

IvanSanchez commented 9 months ago

I have a HTTP server set up like this:

let mut server = EspHttpServer::new(Configuration::default())?;

server.fn_handler("/index.html", Method::Get, move |request| {
    info!("GET /index.html");
    /* snip */
})?;

server.ws_handler("/ws", |ws| {
    info!("Websocket /ws");
    if ws.is_closed() {
        info!("Closed WebSocket session");
        return Ok::<(), anyhow::Error>(());
    }
    /* snip */
})?;

When I do a just GET request (via e.g. curl http://esp32_ip_addr:80/index.html), the serial log looks like:

I (5006) esp_idf_svc::http::server: Registered Httpd server handler Get for URI "/index.html"
I (5016) esp_idf_svc::http::server::ws: Registered Httpd server WS handler for URI "/ws"
 (snip)
I (81335) myproject::webserver: GET /index.html
I (81925) myproject::webserver: Websocket /ws
I (82245) myproject::webserver: Closed WebSocket session

Therefore I must conclude that functions registered with ws_handler are called when they shouldn't, even though the ws_handler calls are sending the URI to the underlying esp-idf C lib.

For reference, my Cargo.toml file contains the following crate versions:

esp-idf-sys = "0.33"
esp-idf-hal = "0.42"
esp-idf-svc = "0.47"
embedded-svc = "0.26"

And .cargo/config.toml contains:

[build]
target = "xtensa-esp32-espidf"
[env]
ESP_IDF_VERSION = "release/v4.4"

Couple of remarks:

This triggers just after flashing & rebooting the board (via cargo run)
I'm making sure that I'm not making any requests to /ws at all, and that I don't have any lingering web browser windows.
This triggers on at least two different dev boards: a ESP32 Devkit-C v4 (with a ESP32-WROOM-EU) and one of those white Heltec devkits (with a ESP32-D0WDQ6). Both are xtensa-esp32-espidf targets.
This also triggers after changing .cargo/config.toml to use ESP_IDF_VERSION="v5.1.2" and rebuilding.

IvanSanchez commented 9 months ago

Possibly related upstream issues: https://github.com/espressif/esp-idf/issues/11097

ivmarkov commented 9 months ago

Possibly related upstream issues: espressif/esp-idf#11097

~~I don't think this link is related at all to the issue you are describing, and what you are describing sounds like a gross error... almost an impossible one!~~

~~Would you mind uploading your test project somewhere? I would like to verify that separately.~~

IvanSanchez commented 9 months ago

Would you mind uploading your test project somewhere?

Sure, it's at https://gitlab.com/IvanSanchez/solanera . But beware that this is my "learning rust" project, so I'm not really proud of code quality.

ivmarkov commented 9 months ago

Actually, I checked the bug twice and I'm changing my mind. There's a chance that it might be the root cause, I just can't believe it happens on the first hit by curl. Maybe you are oversimplifying the story? Anyway and regardless. Thinking loud now:

Looking at how long it took the Espressif folks to suggest a (non-working) workaround, I don't think this issue would be fixed soon. Maybe the question is - rather - how severe the problem is actually? What I mean is, if the ws session which is passed to the ws handler - in the problematic case - is reported as closed, then would you be actually having a real problem at all? Perhaps not, as in the case of a closed session, you anyway cannot do much, if anything?

Maybe this is the reason I did not catch this problem earlier e.g. here. If you look how I'm processing the websocket connections, if the WS connection is reported as closed by the server, yet it is not in my list, I just discard it.

So why don't you try just that and see if it gets you somewhere?

ivmarkov commented 9 months ago

BTW I'm digressing, but... this loop will block the HTTP server from processing anything else besides your just-created web-socket.

Reason: the ESP IDF HTTP server is single threaded.

IvanSanchez commented 9 months ago

Maybe you are oversimplifying the story?

...not really. I've simplified the code because of course I'm raising the wifi interface before starting the webserver and registering half a dozen handlers for GETs and not just one, but I cannot see how that might impact the situation.

The WS connection is handled at the first curl request, though.

how severe the problem is actually

After realizing my fumble with the WS loop, I'm not fully sure. But having a WS handle do work on connections that are not WS connection adds quite a lot of undefined behaviour - I have no way of knowing if a EspHttpWsConnection is really a websockets connection, or an artifact from a GET/POST request, and therefore I cannot really trust its "is closed" state.

the ESP IDF HTTP server is single threaded.

On a side note, that sounds like something that should be in the docstrings ;-)

ivmarkov commented 9 months ago

how severe the problem is actually

After realizing my fumble with the WS loop, I'm not fully sure. But having a WS handle do work on connections that are not WS connection adds quite a lot of undefined behaviour - I have no way of knowing if a EspHttpWsConnection is really a websockets connection, or an artifact from a GET/POST request, and therefore I cannot really trust its "is closed" state.

You are mixing up things. The fact that the server is single threaded has nothing to do with the problem at hand. As for you not really trusting its "closed" state - looking at the ESP IDF code I think you can trust that actually, which is good news.

the ESP IDF HTTP server is single threaded.

On a side note, that sounds like something that should be in the docstrings ;-)

Maybe. Or maybe not. Not sure how knowing that the server is single-threaded would help you. What I mean is that even if the server was multithreaded, the loop {} you have in there is not ideal, as WS connections are generally duplex. Meaning, the client can send you stuff at any time and therefore ideally you need to process the incoming WS frames separately from your code that would be sending frames to the client; while your code looks like if you want to do some sort of http-like request-response communication on top of WS. Now if you are consciously doing that (to overlay your own simplex protocol on top of WS), that's another story. Not sure why you are going WS then though, as a simple(r) REST API might do as well. Still not holding my breath a simplex approach would be easily possible - at least if your client is a an async web UI based on React and/or replicas. Of course you might be targetting a different client than a browser... ;)

IvanSanchez commented 9 months ago

After realizing my fumble with the WS loop, [...]

You are mixing up things.

What I meant to say is: I was experiencing HTTP requests hanging indefinitely (HTTP request triggers a WS handler that loops infinitely waiting for a frame to be sent). After being aware of the single-threaded nature of EspHttpServer, I can work around the issue.

[...] something that should be in the docstrings ;-)

Maybe. Or maybe not. Not sure how knowing that the server is single-threaded would help you.

My brain is poisoned by too much Javascript. Which means that my first instinct is to think that everything is automagically async, and that a request handler spawns its own thread. (In other words: I wrongly believed that somehow, if a request handler blocks, the server will handle other requests).

Ideally, documentation would allow other people like me not make the wrong assumptions.

OTOH: I'm aware that WS offers duplex capabilities. I'm not planning on using them (for now).

ivmarkov commented 9 months ago

After realizing my fumble with the WS loop, [...]

You are mixing up things.

What I meant to say is: I was experiencing HTTP requests hanging indefinitely (HTTP request triggers a WS handler that loops infinitely waiting for a frame to be sent). After being aware of the single-threaded nature of EspHttpServer, I can work around the issue.

[...] something that should be in the docstrings ;-)

Maybe. Or maybe not. Not sure how knowing that the server is single-threaded would help you.

My brain is poisoned by too much Javascript. Which means that my first instinct is to think that everything is automagically async, and that a request handler spawns its own thread. (In other words: I wrongly believed that somehow, if a request handler blocks, the server will handle other requests).

Not sure JS is the best example of a multi-threaded execution at which you are alluding, as JavaScript is single threaded actually and the way it gets around with it is by using promises, which are processed off the single-threaded message loop. And the async/await syntax makes that mostly bearable. But it is not automagic because - you know - it introduces coloring in the language. :p

Rust also has async/await, but you probably don't want to get there, if you are just starting with it. As you have to learn lifetimes and trait metaprogramming first. And gazillion of other things, if you haven't yet...

Anyway, I'm digressing but the reason why I'm mentioning async/await is because having a single-threaded (or little-threaded) embedded web server... ( using less or only one thread is important in the embedded world as each thread consumes its own, fixed, upper-bounded stack of precious ram, and Rust tends to consume stack memory like mad due to its move semantics ) ... yet still having IO-bound multi-tasking usually implies async/await in Rust. But because the ESP IDF HTTP server is not non-blocking (how could it be, given that it is written in C, and nonblocking IO in C is not ergonomic), we are approaching its limits. It is OK for a static web page, perhaps some slim REST, and very carefully designed WS.

For more complex use cases, I hope my edge-http will pick up steam soon.

Ideally, documentation would allow other people like me not make the wrong assumptions.

Yeah yeah. I think you are overemphasizing it. I might be getting cynical with time, but I think reading code (including ESP IDF C code) is unavoidable.