WebSocket cleanup on stopping crow::App

kang-sw commented 2 years ago

Firsly, I really appreciate your amazing work. It literally changed my life :D.

And apologies for my short English skills as I don't natively use it.

The text below is logging from my application which is under development currently.

---> Normal cycle: accept - open - close
[2022-08-28 08:45:44.399] [__PERFKIT:BACKEND] [info] [web_terminal.cpp:185] * Accepting terminal WebSocket from: 127.0.0.1
[2022-08-28 08:45:44.399] [__PERFKIT:BACKEND] [info] [web_terminal.cpp:192] * Opening WebSocket: 127.0.0.1
[2022-08-28 08:45:47.139] [__PERFKIT:BACKEND] [info] [web_terminal.cpp:163] * (127.0.0.1) Closing websocket: �
[2022-08-28 08:45:47.139] [__PERFKIT:BACKEND] [info] [web_terminal.cpp:180] * WebSocket closed. (0.000 seconds)
---> 

---> Stopping application while there are any active websocket connection:
---> Cleanup logic is not called!
[2022-08-28 08:45:47.562] [__PERFKIT:BACKEND] [info] [web_terminal.cpp:185] * Accepting terminal WebSocket from: 127.0.0.1
[2022-08-28 08:45:47.562] [__PERFKIT:BACKEND] [info] [web_terminal.cpp:192] * Opening WebSocket: 127.0.0.1
[2022-08-28 08:45:51.855] [info] CMD: quit
[2022-08-28 08:45:51.855] [__PERFKIT] [trace] invoking as tokens
[2022-08-28 08:45:51.855] [info] Disposing application
[2022-08-28 08:45:51.855] [info] Destroying application instance
[2022-08-28 08:45:51.855] [info] Disposing terminal.
(2022-08-27 23:45:51) [INFO    ] Closing IO service 0000018D4EEB7190
(2022-08-27 23:45:51) [INFO    ] Closing main IO service (0000018D4EF028A0)
(2022-08-27 23:45:51) [INFO    ] Exiting.

It seems active websockets does not receive onclose() callback call on application disposal.

It causes problem as I usually dynamically allocate userdata object on onaccept() routine, to implement notification without explicit client request. (Holding connection references outside of on*() callback routines will make calling send_text|send_binary() at random timing available)

Currently I'm releasing allocated memory from onclose() callback routine. However, if it's not properly called on crow::App's stop sequence, it results in memory leak.

I just found your TODO message inside of stop() method after writing this ... I believe you have a plan to resolve this. Please just take below content as an idea ...

It seems there's already logic to handle it, however, it seems there are a few problems with it.

Newly connected websocket instances are not registered within Crow::::websockets_ field automatically. Thus, user have to make call to add_websocket() and remove_websocket() on their own websocket handler routine.
However, any access to websocket_ field is not properly protected with any mutex or critical section. Link to add_websocket, remove_websocket
And, even if websocket_ access routines are made atomic, as they are invoked prior to actual http_server::stop() call, new websocket connection can be made during existing websocket cleanup routines, which will make add_websocket call, after copying websocket_to_close content.

kang-sw commented 2 years ago

Additionally, what do you think about replacing websocket::connection::userdata_ field as shared_ptr<void> with custom destructor?

https://github.com/CrowCpp/Crow/blob/c0bfaa7709fe9698c0989788ddee6c2cdf254043/include/crow/websocket.h#L33-L37

I think userdata assignment logic can be changed into shared_ptr<void> implementation without breaking current user code. This lets user to notice connection instance expiration immediately, and make memory management much more simple. I believe there are a lot more advantages with it.

    void userdata(void* u) { userdata_ = shared_ptr<void>(u, [](auto) {}); }
    void userdata(shared_ptr<void> u) { userdata_ = move(u); }
    void* userdata() { return userdata_.get(); }
    std::shared_ptr<void> get_userdata() { return userdata_; }

private:
    std::shared_ptr<void> userdata_;

Accept registration logic can be changed into:

https://github.com/CrowCpp/Crow/blob/c0bfaa7709fe9698c0989788ddee6c2cdf254043/include/crow/routing.h#L512

        std::function<bool(const crow::request&, shared_ptr<void>)> accept_handler_;

        template<typename Func, class = std::enable_if_t<std::is_invocable_v<Func, crow::request, shared_ptr<void>*>>>
        self_t& onaccept(Func f)
        {
            accept_handler_ = f;
            return *this;
        }

        template<typename Func, class = std::enable_if_t<std::is_invocable_v<Func, crow::request, void**>>>
        self_t& onaccept(Func f)
        {
            accept_handler_ = [f_org = std::move(f)](auto&& req, shared_ptr<void>* o) {
               void* p = nullptr;
               f_org(req, &o);
               *o = shared_ptr<void>(p, [](auto){});
            };
            return *this;
        }

MichaelSB commented 2 years ago

I do basically the same. In my app I use a collection of std::unique_ptrs holding my allocated user_data. This way my instances are destroyed when the application shuts down without relying on onclose being called. In onclose I remove the instances from the collection, freeing associated memory in the process.

dranikpg commented 2 years ago

@kang-sw @MichaelSB Sorry for the wait. The updates somehow got lost in my inbox 😰

There are indeed numerous bugs in the websocket code. First of all, as you pointed out, the connections don't get registered. Second, websocket::connection->close() doesn't really run the close handler when called manually, because has_recv_close_ is not set. Furthermore, App->stop() is not called when the application is terminated via a signal.

I haven't been working with the websockets code at all during my short time with Crow. However, I can imagine those things are not that difficult to fix. I'll tackle this in the coming week or two.

If you want, you can open a PR yourself (I'll promise to look at it in a timely manner). Make sure to cover the points above and notify beforehand if you start working.

As to the userdata replacement. I'm not sure, this seems convenient if some cases, but a little more difficult to use in others. If I'm actually storing a pointer to some static data, I would have to use some nonetheless.

MichaelSB commented 2 years ago

I agree on the shared_ptr. For me it would be worse. Also, when using it with some C-Api a shared_ptr would most likely be hard to implement.

kang-sw commented 2 years ago

@dranikpg Thanks to reply. Yet I'm not fully aware of all HTTP upgrade process, it would take some time to find appropriate fix myself.

@MichaelSB

And actually above userdata replacement won't affect existing user code who uses C-data API.

The code above I wrote exists for compatibility with plain pointer code. void userdata(void* u) { userdata_ = shared_ptr<void>(u, [](auto) {}); }

This creates new shared pointer using custom desturctor which does nothing on destruction. You can freely use raw pointers without any extra work. (Even APIs will be compatible)

Of course it may introduce a bit overhead by creating redundant shared counter object internally, however, it would be trivial as much as keeping single std::string instance.

MichaelSB commented 2 years ago

I see... still I'm not convinced. shared_ptr gives the notion of something that is shared. In many cases nothing is shared. My userdata is unique, that's why I use unique_ptrs and give them as raw ptrs to userdata. shared_ptr have their use, but to me this looks like poor-mans-gc. And of course shared_ptr costs more than storing a string. It always allocates. std::string does not. And though it may not bother you, I try to avoid allocations and useless ref-counts whenever possible. I'm writing high-performance apps, that's why I use C++ and not some platform better suited for webdev. However, I have no voice here, just a guy giving his opinion on the matter :D

kang-sw commented 2 years ago

@MichaelSB I totally agree with your view, as this introduces unnecessary overhead in every use cases currently. However, the reason I prefer using shared_ptr everywhere is not only for implement poor-mans-gc, but also tracking lifetime of the object using weak_ptr.

In current model of websocket implementation, on every callback, websocket instance is delivered as raw c++ reference, and userdata is raw-c-pointer either. And if you hope to implement any server-side notification, you have to store connection's reference somewhere, and must manage its lifetime not to call send_data() on already expired websocket instance. Of course it can be implemented in many clever ways using current userdata() raw pointer, however, it is easy to become error-prone.

Using shared_ptr inherently, we can pretty easily track lifetime of that instance using weak_ptr.

// Main thread: onclose()
  auto wp = std::weak_ptr<void>{conn.userdata_shared()};
  conn.userdata_shared(nullptr); // Try to expire user data object field

  // Wait until all concurrent access to userdata object to be finished.
  while(not wp.expired()) { std::this_thread::yield(); } 

// User thread: who utilizes userdata
   weak_ptr<my_userdata_t> w_userdata;
   ...
   if(auto p_userdata = w_userdata.lock()) {
     // Okay, connection seems valid at this point.
     p_userdata->get_conn().send_data("hello, world!");

     // As above onclose() logic, it waits for all userdata usage finishes, thus it's always safe to access
     // websocket connection instance under this scope. (as long as onclose() callback called in correct manner)
     // (or, above userdata expiration join logic may be embedded inside websocket connection instance destructor)
   } else {
     // We can detect if connection is already expired.
   }

CrowCpp / Crow

WebSocket cleanup on stopping crow::App #529