nicolasff / webdis

A Redis HTTP interface with JSON output
https://webd.is
BSD 2-Clause "Simplified" License
2.82k stars 307 forks source link

core dumped when subscrib to redis #209

Closed CUEB-wyx closed 2 years ago

CUEB-wyx commented 2 years ago

Thanks for your help.

When I use websocket to subscribe to redis, as long as I close the browser, the server will report an error and stop.

    tickSocket.onopen = function() {
      console.log("tick socket connected!");
      tickSocket.send(JSON.stringify(["SUBSCRIBE","data_stream"]));
    };

root@VM-0-6-ubuntu:/usr/local/webdis# ./webdis webdis.json [err] buffer.c:565: Assertion buffer->refcnt > 0 failed in evbuffer_decref_andunlock Aborted (core dumped)

nicolasff commented 2 years ago

Hello @CUEB-wyx, thanks for reporting this crash! Sorry I couldn't answer this earlier.

I have a few questions for you before I can start investigating this issue:

  1. Which version of Webdis does this happen with? You can find it in the log file, it looks like this: [3590] 02 Dec 08:18:54 I Webdis 0.1.18 up and running. If you're not sure where the log file is located, you can find its path in webdis.json.
  2. Can you make sure it says "websockets": true in webdis.json?
  3. It looks like you're running this in Ubuntu, maybe in a VM. Which version of Ubuntu and on which architecture? (e.g. i386, x86_64, arm64…)
  4. Where did the Webdis binary come from? Did you built it from source? Did you install it with a package manager? Something else?
  5. When you say "as long as I close the browser, the server will report an error and stop", do you mean this happens when you close the browser? Like as soon as you close it?

If you could answer those this would be super helpful, since there are many versions of Webdis being used on various platforms so if I can reproduce the issue myself this would be a great first step before fixing it or suggesting a solution.

Thanks!

CUEB-wyx commented 2 years ago

@nicolasff

Thanks for your reply! I'm sorry I didn't describe my problem clearly. 1.The version of Webdis is 0.1.19-dev. This is the log file: [32557] 29 Nov 09:00:08 I Webdis 0.1.19-dev up and running.

  1. I make sure "websockets": true in webdis.json.
  2. The version of Ubuntu is : Linux version 4.15.0-142-generic (buildd@lgw01-amd64-036) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #146-Ubuntu SMP Tue Apr 13 01:11:19 UTC 2021
  3. I use this installation method: git clone git://github.com/nicolasff/webdis.git cd webdis make
  4. I'm very sorry for my poor English. I mean to say , this happens as soon as when I close the browser. Thank you very much for your time!
nicolasff commented 2 years ago

Thanks, this is perfect! And don't worry, there's no problem with your English :-)

I'll look into this shortly.

nicolasff commented 2 years ago

Ok, I was able to reproduce this easily. The work done in #199 should have made the Websocket code more reliable, but it turns out to be related to this change. This report also confirms that the feature is not quite ready yet to be enabled by default.

I used Docker to run Wbedis under Ubuntu 18.04 and got the same error:

[err] buffer.c:565: Assertion buffer->refcnt > 0 failed in evbuffer_decref_and_unlock_
Aborted

Running in GDB gave a backtrace:

Thread 2 "webdis" received signal SIGABRT, Aborted.
[Switching to Thread 0x7f093af85700 (LWP 184)]
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51  ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007f093afc6921 in __GI_abort () at abort.c:79
#2  0x00007f093b5bd82a in ?? () from /usr/lib/x86_64-linux-gnu/libevent-2.1.so.6
#3  0x00007f093b5bdda8 in event_errx () from /usr/lib/x86_64-linux-gnu/libevent-2.1.so.6
#4  0x00007f093b5a41ed in evbuffer_decref_and_unlock_ () from /usr/lib/x86_64-linux-gnu/libevent-2.1.so.6
#5  0x00005560bc20a845 in ws_client_free (ws=0x7f0934003640) at src/websocket.c:128
#6  0x00005560bc20bead in ws_can_write (fd=36, event=4, p=0x7f0934003640) at src/websocket.c:610
#7  0x00007f093b5b6234 in ?? () from /usr/lib/x86_64-linux-gnu/libevent-2.1.so.6
#8  0x00007f093b5b4a11 in ?? () from /usr/lib/x86_64-linux-gnu/libevent-2.1.so.6
#9  0x00007f093b5b533f in event_base_loop () from /usr/lib/x86_64-linux-gnu/libevent-2.1.so.6
#10 0x00005560bc204604 in worker_main (p=0x5560bd80d160) at src/worker.c:179
#11 0x00007f093b37e6db in start_thread (arg=0x7f093af85700) at pthread_create.c:463
#12 0x00007f093b0a771f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

@jessie-murray this seems related to your WS overhaul. I traced this to the following scenario:

  1. WS client connects, Webdis registers both EV_READ and EV_WRITE events and attempts to send the handshake response.
  2. ws_can_write is called, sends the full handshake, sets scheduled_write to zero but leaves scheduled_read to 1 as it waits for WS data from the client.
  3. The browser closes the socket, and this is where things start going wrong. I see ws_frame_and_send_response being called (is this the WS_CONNECTION_CLOSE frame? it seems like Webdis shouldn't attempt to send it when the client disconnects).
  4. To send this frame, ws_schedule_write is called, which sets scheduled_write to 1. ws_monitor_input is also called, but this is ignored since the read is already scheduled.
  5. ws_can_read triggers, evbuffer_read returns 0, so ws_client_free is called.
  6. Then ws_can_write is called for the scheduled frame, and evbuffer_write_atmost returns -1 since we've closed the connection. As a consequence ws_close_if_able is called and ws_client_free is called AGAIN… leading to a double free.

It seems to me that the WS_CONNECTION_CLOSE frame shouldn't be sent if the client is gone, but also it feels like all calls to ws_client_free could be gated by ws_close_if_able since it already checks for other scheduled events.

@jessie-murray Thoughts? Here's what I used to reproduce the issue:

Dockerfile & repro steps In `Dockerfile.ubuntu`: ```dockerfile FROM ubuntu:18.04 RUN apt-get -y update && apt-get -y upgrade && apt-get -y --force-yes install git wget \ make gcc libevent-dev libmsgpack-dev python3 redis-server vim gdb valgrind RUN mkdir /tmp/webdis COPY . /tmp/webdis/ RUN cd /tmp/webdis && find src -name '*.d' -delete && find src -name '*.o' -delete && sed -i 's/-O3/-g -O0/g' Makefile && make -j8 clean all install RUN echo "daemonize yes" >> /etc/redis/redis.conf && \ sed -i 's/"daemonize":.*true/"daemonize": false, "websockets": true/g' /etc/webdis.prod.json && \ sed -i 's#"logfile":.*#"logfile": "/dev/stderr"#g' /etc/webdis.prod.json && \ sed -i 's/bind 127.0.0.1 ::1/bind 127.0.0.1/g' /etc/redis/redis.conf && \ rm -f /var/log/redis/redis-server.log CMD /usr/bin/redis-server /etc/redis/redis.conf >/dev/null 2>/dev/null && cat /etc/webdis.prod.json && /usr/local/bin/webdis /etc/webdis.prod.json EXPOSE 7379 ``` Built with: ```shell docker build -t webdis:ubuntu-18-04 -f Dockerfile.ubuntu . ``` Ran with: ```shell docker run --rm -ti -p127.0.0.1:7379:7379 -v $(pwd):/tmp/webdis webdis:ubuntu-18-04 /bin/bash service redis-server start cd /tmp/webdis make clean make -j8 DEBUG=1 gdb ./webdis ``` I opened `tests/websocket.html` in a browser, clicked Connect, and reloaded the page. This is 100% reproducible.
nicolasff commented 2 years ago

@CUEB-wyx this was fixed by #210; please try again with the latest master.

I'll release this in 0.19.0 pretty soon, I'm still experimenting with Docker manifests to see if I can bundle both an amd64 and an arm64 image under the same name, so it might be a few days until the fix is included in a release. The main challenge is finding a way to use manifests and still keep the same signatures as have been included since 0.12.0, and I think I'm close to having a final release script that uses both.

cilex-ft commented 2 years ago

This fixes an issue we also had @nicolasff ! We couldn't run webdis in websocket mode (with authenticated redis) - now runs smoothly.