shadowsocks / shadowsocks-libev

Bug-fix-only libev port of shadowsocks. Future development moved to shadowsocks-rust
https://github.com/shadowsocks/shadowsocks-rust
GNU General Public License v3.0

Connection reset by peer bug #1038

Closed abigchopstick closed 7 years ago

abigchopstick commented 7 years ago

Please answer these questions before submitting your issue. Thanks!

What version of shadowsocks-libev are you using?

the latest 2.6.0

What operating system are you using?

Debian. It is actually running on a UBNT ER-X (EdgeMAX) system, which is essentially a Debian distro.

What did you do?

I'm running ss-redir. I tried 4 different servers from 4 different companies (services), some with ss and some with ssr (I tried both ss-redir and ssr-redir); all of them have the same problem.

What did you expect to see?

I expect it to run without error.

What did you see instead?

I see "connection reset by peer" errors, and the CPU usage is at 70%, as shown below. image

The detailed error message is this: image

The server works fine for a while, but then, with the CPU at 70%, ss-redir becomes unresponsive and stops working.

I know that you might say it is a network-related problem, but in my defense, when I use clients on Windows or Mac with the same ISP, this error doesn't show up at all: image

So I would say even if it is a network related problem, it is only doing it for ss-redir and not the others. Please help me resolve this issue.

To my knowledge this happens to openwrt version of libev ss-redir too because I've used the same

What is your config in detail (with all sensitive info masked)?

{
    "server":"106.186.117.xxx",
    "server_port":xxxxx,
    "local_address":"0.0.0.0",
    "local_port":1080,
    "password":"xxxxxxx",
    "timeout":600,
    "method":"aes-128-cfb",
}

Note that I tried at least four other SS service providers with different encryption, protocol, and even obfs settings (SSR), but they all end up doing the same thing.

Also I've tried changing timeout to 900 and 1000 but it didn't help.

abigchopstick commented 7 years ago

Please note that the xxx.0bad.com in the Windows screenshot has the same IP address as the server in the config posted above.

madeye commented 7 years ago

Resets in your log are expected and normal. And according to your log and CPU time, everything works well.

madeye commented 7 years ago

Also, please try 2.5.6, as 2.6.0 is an unstable version.

abigchopstick commented 7 years ago

Except it didn't work: I cannot access blocked IPs, only IPs within China, because those are not routed through ss-redir.

Also, this reset is occupying 70% of CPU time at all times; you can't call this "normal".

abigchopstick commented 7 years ago

thanks, how do I check my version please? I'm not so sure at this point.

madeye commented 7 years ago

Run 'ss-redir -v' to check the version number.

In addition, edge routers have never been tested. I suggest trying a router running OpenWRT instead, which is officially supported.

abigchopstick commented 7 years ago

it is shadowsocks-libev 2.5.6 with OpenSSL 1.0.2h 3 May 2016

Sorry I was mistaken.

I will try to get an openwrt to test it.

abigchopstick commented 7 years ago

Would it help if I provide my router's IP and ssh login to you, so you can take a look at it?

madeye commented 7 years ago

I don't think that will help. Actually, I have an R7000 running 'ss-redir' 24/7; if there were any issue, I couldn't even send this comment to you.

abigchopstick commented 7 years ago

I'm pretty sure this bug is related to my ISP, but it is only occurring with my ISP and ss-redir because it doesn't happen on WINDOWS/MAC clients under the same network.

Are there any other Linux clients with ss-redir support? I was told that the Go and Python versions don't support ss-redir, only ss-local.

I'm trying to make this bug occur again on an openwrt router so then you might want to look into it more.

I am also trying to get more people to report similar bugs. I don't think I am alone here.

Also, one of my friends has the same setup on an ER-X router, but he has no problem because he is in a different province. However, if I use the same SS server he uses, I still have the same problem.

On Fri, Jan 6, 2017 at 4:13 PM, Max Lv notifications@github.com wrote:

I don't think that will help. Actually, I have R7000 running 'ss-redir' 7x24 hours, if there is any issue, I even cannot send this comment to you.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/shadowsocks/shadowsocks-libev/issues/1038#issuecomment-270853415, or mute the thread https://github.com/notifications/unsubscribe-auth/AXx4mv0JbBo-59J3wJXlbf-eVXmUxDWdks5rPfeegaJpZM4LcbNp .

madeye commented 7 years ago

IMO, your issue looks related to your router's toolchain or kernel. That's why I suggest using OpenWRT instead.

abigchopstick commented 7 years ago

As I mentioned above, my friend is using the same ER-X router and same OS version 1.9.1. In fact, he compiled ss-redir for me, and he is not having any issues.

Regards,

黄甦 Huang, Su

madeye commented 7 years ago

Have you enabled UDP? What's your solution for handling DNS queries?

abigchopstick commented 7 years ago

No, I have not. I'm using ChinaDNS for DNS queries.

Actually here's how it is done: http://allenn.cn/articles/2016-10/2016-10-20-edgemax-ss-tutorial/

madeye commented 7 years ago

Could you double check it's not an issue of ChinaDNS? To verify this, when your connection is lost, try to visit https://216.58.200.4
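
The check suggested above separates two failure modes: if a site is reachable by raw IP while hostnames fail to resolve, DNS (here ChinaDNS) is the likely culprit; if the raw IP is unreachable too, the problem is on the proxy path. A minimal Python sketch of that reasoning, meant to be run from a LAN client behind the router. The helper names (`can_connect`, `can_resolve`, `classify`) are illustrative assumptions and are not part of shadowsocks-libev.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def can_resolve(hostname):
    """Return True if the system resolver returns any address for hostname."""
    try:
        socket.getaddrinfo(hostname, 443)
        return True
    except socket.gaierror:
        return False


def classify(ip_reachable, name_resolves):
    """Map the two observations onto the likely fault location."""
    if not ip_reachable:
        # Connecting by raw IP bypasses DNS entirely, so DNS cannot be blamed.
        return "proxy path problem"
    if not name_resolves:
        return "DNS problem"
    return "both working"
```

For example, `classify(can_connect("216.58.200.4", 443), can_resolve("www.google.com"))` uses the IP from the comment above; the hostname is an arbitrary choice for illustration.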

madeye commented 7 years ago

After reading the link, I found it's actually using the UDP forwarding of ss-tunnel for DNS resolution. It may not work with some ISPs, as UDP traffic could be dropped for QoS purposes.

My suggestion is to use dnsmasq + dnscrypt (TCP mode) for DNS queries instead. You can ask your friend to build dnscrypt for you.

abigchopstick commented 7 years ago

Will do, I will check it out first, thanks.

abigchopstick commented 7 years ago

image

Before trying dnscrypt, I stopped ss-tunnel and chinadns and ran ss-redir alone, but even when it is the only process running, it does the same thing. image

madeye commented 7 years ago

The log shows that the connection resets happen when receiving data from the devices on your LAN. It looks like a misconfiguration of the iptables rules. In other words, the connection is being reset by your client, not by your ISP.

Add -v to your ss-redir command line for verbose mode and post your logs here.

abigchopstick commented 7 years ago

image

Instead of a million "connection reset by peer" errors, it seems that 1/1000 of the connections went through, as you can see from the log below, but the other 999/1000 are still "connection reset by peer".

2017-01-06 17:19:23 ERROR: server recv: Connection reset by peer
2017-01-06 17:19:23 ERROR: server recv: Connection reset by peer
2017-01-06 17:19:23 ERROR: server recv: Connection reset by peer
2017-01-06 17:19:23 INFO: redir to 172.217.25.67:443, len=517, recv=517
2017-01-06 17:19:23 ERROR: server recv: Connection reset by peer
2017-01-06 17:19:23 INFO: redir to 64.202.125.16:443, len=517, recv=517
2017-01-06 17:19:23 ERROR: server recv: Connection reset by peer

2017-01-06 17:19:26 ERROR: server recv: Connection reset by peer
2017-01-06 17:19:26 ERROR: server recv: Connection reset by peer
2017-01-06 17:19:26 INFO: redir to 172.217.25.67:443, len=216, recv=216
2017-01-06 17:19:26 ERROR: server recv: Connection reset by peer
2017-01-06 17:19:26 INFO: redir to 172.217.25.67:443, len=494, recv=494
2017-01-06 17:19:26 ERROR: server recv: Connection reset by peer

madeye commented 7 years ago

I think you have some iptables issues in your configuration. A remote recv entry never appears in your log, which means the problem is not related to your ISP or WAN connection; it's an issue on your router or in your LAN.
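
The diagnosis above rests on which side each error line names. A small Python sketch that tallies the entries of an ss-redir verbose log by type, assuming the `timestamp LEVEL: message` layout seen in the log posted in this thread. The `summarize` function and the `SAMPLE` constant are hypothetical helpers written for this illustration; the sample lines are copied from that log.

```python
import re
from collections import Counter

# Matches lines such as:
# 2017-01-06 17:19:23 ERROR: server recv: Connection reset by peer
LINE_RE = re.compile(
    r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} (?P<level>[A-Z]+): (?P<msg>.*)"
)

SAMPLE = """\
2017-01-06 17:19:23 ERROR: server recv: Connection reset by peer
2017-01-06 17:19:23 ERROR: server recv: Connection reset by peer
2017-01-06 17:19:23 ERROR: server recv: Connection reset by peer
2017-01-06 17:19:23 INFO: redir to 172.217.25.67:443, len=517, recv=517
2017-01-06 17:19:23 ERROR: server recv: Connection reset by peer
2017-01-06 17:19:23 INFO: redir to 64.202.125.16:443, len=517, recv=517
2017-01-06 17:19:23 ERROR: server recv: Connection reset by peer
"""


def summarize(log_text):
    """Count error lines by their prefix and count successful redirects."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip blank or unrecognized lines
        level, msg = match.group("level"), match.group("msg")
        if level == "ERROR":
            # The text before the first colon names the side,
            # e.g. "server recv" (LAN side) or "remote recv" (WAN side).
            counts[msg.split(":", 1)[0]] += 1
        elif msg.startswith("redir to "):
            counts["redir"] += 1
    return counts
```

On `SAMPLE`, `summarize` reports only `server recv` errors and `redir` entries; the count for `remote recv` is zero, which is the pattern the comment above points to.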

madeye commented 7 years ago

BTW, "server recv: Connection reset by peer" means that the client connected to your ss-redir is actively closing the connection.
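
To see what this error means at the socket level, the following Python sketch reproduces it on the loopback interface: a client aborts its connection and the accepting side's `recv` fails with "Connection reset by peer". This is only an illustration, not ss-redir's code. Forcing the reset with `SO_LINGER` is one convenient way to trigger it and is not a claim about what the LAN client in this thread was doing; the `struct.pack("ii", ...)` layout assumes Linux or macOS, and `provoke_reset` is a hypothetical name.

```python
import socket
import struct


def provoke_reset():
    """Return the exception the accepting side sees when the client aborts."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(listener.getsockname())
    conn, _ = listener.accept()
    conn.settimeout(5.0)  # avoid blocking forever if no reset arrives

    # SO_LINGER with l_onoff=1 and l_linger=0 makes close() abort the
    # connection with a TCP RST instead of the normal FIN handshake.
    client.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    client.close()

    try:
        conn.recv(1024)
        return None  # orderly close or data: no reset was observed
    except ConnectionResetError as exc:
        return exc  # errno ECONNRESET, "Connection reset by peer"
    except socket.timeout:
        return None
    finally:
        conn.close()
        listener.close()
```

The accepting socket plays the role of ss-redir here and the aborting client plays the role of the LAN device.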

abigchopstick commented 7 years ago

Are you saying that some device on my LAN is repeatedly resetting its connections to ss-redir, and that this has nothing to do with the WAN?

madeye commented 7 years ago

From the log you provided, the answer is yes. None of your log entries relate to the connection to the remote shadowsocks server. So double check your iptables rules.

abigchopstick commented 7 years ago

Turns out it was one of my Windows 7 VMs constantly harassing ss-redir. I am unsure whether it was Parallels Desktop or a virus on my VM. At any rate, it is unrelated to ss-redir.

Sorry for the mistake.

Thanks a lot for your help!

0neday commented 4 years ago

Got the same error on ss-libev 3.3.4-1! It could be caused by the ISP resetting connections that send featureless packets at regular intervals, but it can be improved by using haproxy to proxy your 2 ss servers. image