mikeperry-tor / vanguards

Vanguards help guard you from getting vanned...

Vanguards not effective - distress call #91

Open rr-girl opened 2 years ago

rr-girl commented 2 years ago

Thank you for taking the time to read this! We are no experts in Tor networking or with Vanguards, but know our way around.

If anything described here makes sense to you and/or you have a suspicion on what is going on, any kind of feedback is extremely appreciated!

  1. What's happening?

Our service had been running fine and stable for a long time. Then it suddenly experienced light performance issues, followed by severe ones, and finally became completely unreachable.

This does not seem to be a "regular" DDoS attack. It seems to be a DoS attack, but not like anything known, as it seems to be an attack on the Tor daemon alone. All countermeasures we tried so far were not successful. There is no attack noticeable on the actual services running (http, ssh, ftp, mail, etc.) but only on the Tor connection itself.

Even though we researched and investigated thoroughly, the event seems to be beyond our comprehension. Documentation about prior attacks on Tor Hidden Services is hard to find and does not match what we are experiencing.

We refuse to believe that there is an attack scenario that is unknown and cannot be counteracted, as it would render every single Tor Hidden Service vulnerable, with adversaries being able to take any Tor Hidden Service offline at will within minutes for an indefinite time. The impact on the Tor community would be enormous.

We hope somebody with the necessary insight and expertise can at least drop us a hint on what exactly is going on and point us in the right direction towards ending this attack or reducing its impact on our systems, and towards preventing such attacks in the future. This would help protect Tor Hidden Services around the world from attacks like the one we are experiencing.

  2. What is the setup?

torrc configuration:

CookieAuthentication 1
HashedControlPassword 16:<hash>
VirtualAddrNetworkIPv4 10.192.0.0/10
AutomapHostsOnResolve 1
TransPort 127.0.0.1:9040
DNSPort 127.0.0.1:53
HiddenServiceDir /var/lib/tor/hidden_service
HiddenServicePort 80 127.0.0.1:80
HiddenServiceVersion 3
HiddenServiceAllowUnknownPorts 0

  3. What are the symptoms?

Aug 24 19:42:18.000 [notice] Bootstrapped 100% (done): Done
Aug 24 19:42:34.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 388 buildtimes.
Aug 24 19:42:39.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 240 buildtimes.
Aug 24 19:42:53.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 489 buildtimes.
Aug 24 19:43:11.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 659 buildtimes.
...
Aug 24 19:46:09.000 [notice] Extremely large value for circuit build timeout: 122s. Assuming clock jump. Purpose 14 (Measuring circuit timeout)
Aug 24 19:46:09.000 [notice] Extremely large value for circuit build timeout: 122s. Assuming clock jump. Purpose 14 (Measuring circuit timeout)
Aug 24 19:46:09.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 114 buildtimes.
Aug 24 19:46:15.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 125 buildtimes.
...
Aug 24 19:47:08.000 [notice] Extremely large value for circuit build timeout: 123s. Assuming clock jump. Purpose 14 (Measuring circuit timeout)
Aug 24 19:47:10.000 [notice] Extremely large value for circuit build timeout: 122s. Assuming clock jump. Purpose 14 (Measuring circuit timeout)
Aug 24 19:47:10.000 [notice] Extremely large value for circuit build timeout: 123s. Assuming clock jump. Purpose 14 (Measuring circuit timeout)
Aug 24 19:47:13.000 [notice] Extremely large value for circuit build timeout: 123s. Assuming clock jump. Purpose 14 (Measuring circuit timeout)
Aug 24 19:47:14.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 495 buildtimes.
Aug 24 19:47:18.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 124 buildtimes.
Aug 24 19:47:21.000 [notice] Extremely large value for circuit build timeout: 122s. Assuming clock jump. Purpose 14 (Measuring circuit timeout)
Aug 24 19:47:23.000 [notice] Extremely large value for circuit build timeout: 123s. Assuming clock jump. Purpose 14 (Measuring circuit timeout)
...
Aug 24 19:47:55.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 1000 buildtimes.
Aug 24 19:47:59.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 117 buildtimes.
...
Aug 24 19:52:43.000 [notice] Strange value for circuit build time: 121581msec. Assuming clock jump. Purpose 14 (Measuring circuit timeout)
Aug 24 19:52:43.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 120000ms after 18 timeouts and 57 buildtimes.
Aug 24 19:52:53.000 [notice] Interrupt: exiting cleanly.
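To put numbers on how often these notices occur, a small script along the following lines could tally them per minute. This is only a sketch: it assumes the tor log has one entry per line, the function name `summarize` and the `SAMPLE` excerpt are illustrative, and the regular expressions are derived solely from the messages quoted above.

```python
import re
from collections import Counter

# Patterns derived from the notice-level messages quoted in this issue.
RESET_RE = re.compile(
    r"Resetting timeout to (\d+)ms after (\d+) timeouts and (\d+) buildtimes"
)
LARGE_RE = re.compile(
    r"Extremely large value for circuit build timeout: \d+s"
    r"|Strange value for circuit build time: \d+msec"
)
# Captures "Mon DD HH:MM" so that events can be grouped by minute.
STAMP_RE = re.compile(r"^(\w{3} \d{2} \d{2}:\d{2}):\d{2}\.\d{3} ")


def summarize(lines):
    """Count timeout resets and oversized build times per minute."""
    resets = Counter()
    large = Counter()
    for line in lines:
        line = line.strip()
        stamp = STAMP_RE.match(line)
        if not stamp:
            continue  # skip "..." elisions and anything without a timestamp
        minute = stamp.group(1)
        if RESET_RE.search(line):
            resets[minute] += 1
        if LARGE_RE.search(line):
            large[minute] += 1
    return resets, large


# A short excerpt of the log above, used only to exercise the function.
SAMPLE = """\
Aug 24 19:46:09.000 [notice] Extremely large value for circuit build timeout: 122s. Assuming clock jump. Purpose 14 (Measuring circuit timeout)
Aug 24 19:46:09.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 114 buildtimes.
Aug 24 19:46:15.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 125 buildtimes.
Aug 24 19:52:43.000 [notice] Strange value for circuit build time: 121581msec. Assuming clock jump. Purpose 14 (Measuring circuit timeout)
"""

resets, large = summarize(SAMPLE.splitlines())
for minute in sorted(set(resets) | set(large)):
    print(minute, "resets:", resets[minute], "large build times:", large[minute])
```

Running it over the full log file instead of `SAMPLE` (for example with `summarize(open(path))`, where `path` is wherever tor writes its notices) would show whether the resets cluster in bursts or arrive at a steady rate.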

  4. What was tried to resolve the issue?

19:45:27.839934 IP (tos 0x0, ttl 64, id 35746, offset 0, flags [DF], proto TCP (6), length 4100) 127.0.0.1.9051 > 127.0.0.1.46712: Flags [P.], cksum 0x0df9 (incorrect -> 0xe713), seq 1543428574:1543432622, ack 1711981309, win 512, options [nop,nop,TS val 2971851406 ecr 2971851369], length 4048
E.....@.@..O........#[.x[...f ............. ."...".i
650 CIRC 9802 EXTENDED $2BCA0A8B5759DBD764BF9FA5D1B3AEE9D74D2B68~Waeswynn,$CED577F091DCB15AD8C87FBD452A51EA9E60BFC2~strayWires,$CC8B218ED3615827A5DCF008FC62598DEF533B4F~mikrogravitation02,$7B46F20449D6F25150E189428B62E1E3BA5848A9~galtlandeu,$BF93594384A02DE7689C4FD821E2638DA2CD4792~labaliseridicule BUILD_FLAGS=IS_INTERNAL,NEED_CAPACITY PURPOSE=HS_SERVICE_REND HS_STATE=HSSR_CONNECTING REND_QUERY=|---------(attacked hidden service descriptor)---------| TIME_CREATED=2022-08-24T19:45:22.060324
650 CIRC 9802 BUILT $2BCA0A8B5759DBD764BF9FA5D1B3AEE9D74D2B68~Waeswynn,$CED577F091DCB15AD8C87FBD452A51EA9E60BFC2~strayWires,$CC8B218ED3615827A5DCF008FC62598DEF533B4F~mikrogravitation02,$7B46F20449D6F25150E189428B62E1E3BA5848A9~galtlandeu,$BF93594384A02DE7689C4FD821E2638DA2CD4792~labaliseridicule BUILD_FLAGS=IS_INTERNAL,NEED_CAPACITY PURPOSE=HS_SERVICE_REND HS_STATE=HSSR_CONNECTING REND_QUERY=|---------(attacked hidden service descriptor)---------| TIME_CREATED=2022-08-24T19:45:22.060324
650 CIRC_MINOR 9802 PURPOSE_CHANGED $2BCA0A8B5759DBD764BF9FA5D1B3AEE9D74D2B68~Waeswynn,$CED577F091DCB15AD8C87FBD452A51EA9E60BFC2~strayWires,$CC8B218ED3615827A5DCF008FC62598DEF533B4F~mikrogravitation02,$7B46F20449D6F25150E189428B62E1E3BA5848A9~galtlandeu,$BF93594384A02DE7689C4FD821E2638DA2CD4792~labaliseridicule BUILD_FLAGS=IS_INTERNAL,NEED_CAPACITY PURPOSE=HS_SERVICE_REND HS_STATE=HSSR_JOINED REND_QUERY=|---------(attacked hidden service descriptor)---------| TIME_CREATED=2022-08-24T19:45:22.060324 OLD_PURPOSE=HS_SERVICE_REND OLD_HS_STATE=HSSR_CONNECTING
650 CIRC 9818 EXTENDED $2BCA0A8B5759DBD764BF9FA5D1B3AEE9D74D2B68~Waeswynn,$CED577F091DCB15AD8C87FBD452A51EA9E60BFC2~strayWires,$CC8B218ED3615827A5DCF008FC62598DEF533B4F~mikrogravitation02,$7A319C431F38CB30A0BC0C49144369A611920725~BahnhufPowah2,$8587A1B4CCD0700F164CCD588F79743C74FE8700~mev4PLicebeer16b BUILD_FLAGS=IS_INTERNAL,NEED_CAPACITY PURPOSE=HS_SERVICE_REND HS_STATE=HSSR_CONNECTING REND_QUERY=|---------(attacked hidden service descriptor)---------| TIME_CREATED=2022-08-24T19:45:22.493699
650 CIRC 9818 BUILT $2BCA0A8B5759DBD764BF9FA5D1B3AEE9D74D2B68~Waeswynn,$CED577F091DCB15AD8C87FBD452A51EA9E60BFC2~strayWires,$CC8B218ED3615827A5DCF008FC62598DEF533B4F~mikrogravitation02,$7A319C431F38CB30A0BC0C49144369A611920725~BahnhufPowah2,$8587A1B4CCD0700F164CCD588F79743C74FE8700~mev4PLicebeer16b BUILD_FLAGS=IS_INTERNAL,NEED_CAPACITY PURPOSE=HS_SERVICE_REND HS_STATE=HSSR_CONNECTING REND_QUERY=|---------(attacked hidden service descriptor)---------| TIME_CREATED=2022-08-24T19:45:22.493699
650 CIRC_MINOR 9818 PURPOSE_CHANGED $2BCA0A8B5759DBD764BF9FA5D1B3AEE9D74D2B68~Waeswynn,$CED577F091DCB15AD8C87FBD452A51EA9E60BFC2~strayWires,$CC8B218ED3615827A5DCF008FC62598DEF533B4F~mikrogravitation02,$7A319C431F38CB30A0BC0C49144369A611920725~BahnhufPowah2,$8587A1B4CCD0700F164CCD588F79743C74FE8700~mev4PLicebeer16b BUILD_FLAGS=IS_INTERNAL,NEED_CAPACITY PURPOSE=HS_SERVICE_REND HS_STATE=HSSR_JOINED REND_QUERY=|---------(attacked hidden service descriptor)---------| TIME_CREATED=2022-08-24T19:45:22.493699 OLD_PURPOSE=HS_SERVICE_REND OLD_HS_STATE=HSSR_CONNECTING
650 CIRC 9997 EXTENDED $2BCA0A8B5759DBD764BF9FA5D1B3AEE9D74D2B68~Waeswynn,$8D896C8B367813030591A00DB7E7722EF6C4C23C~Luxembourg,$FF353F5D011E69ECDA10A57B46D06BC7B3FEB196~fuego,$347253D1D5246CB1C4CF8088C6982FE77CF7AB9C~ph3x,$E84F41FA1D1FA303FD7A99A35E50ACEF4269868C~Quetzalcoatl BUILD_FLAGS=IS_INTERNAL,NEED_CAPACITY PURPOSE=HS_SERVICE_REND HS_STATE=HSSR_CONNECTING REND_QUERY=|---------(attacked hidden service descriptor)---------| TIME_CREATED=2022-08-24T19:45:25.100429
650 CIRC 9997 BUILT $2BCA0A8B5759DBD764BF9FA5D1B3AEE9D74D2B68~Waeswynn,$8D896C8B367813030591A00DB7E7722EF6C4C23C~Luxembourg,$FF353F5D011E69ECDA10A57B46D06BC7B3FEB196~fuego,$347253D1D5246CB1C4CF8088C6982FE77CF7AB9C~ph3x,$E84F41FA1D1FA303FD7A99A35E50ACEF4269868C~Quetzalcoatl BUILD_FLAGS=IS_INTERNAL,NEED_CAPACITY PURPOSE=HS_SERVICE_REND HS_STATE=HSSR_CONNECTING REND_QUERY=|---------(attacked hidden service descriptor)---------| TIME_CREATED=2022-08-24T19:45:25.10

This is with Tor running on a single server. When load-balanced, the |---------(attacked hidden service descriptor)---------| is replaced by the backend service descriptors of servers 2/3/4.
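The control-port events in the capture can also be counted rather than read by eye. The sketch below pulls `650 CIRC ... BUILT` events with `PURPOSE=HS_SERVICE_REND` out of a text dump and counts how many such circuits were created in each second. The function names and the `SAMPLE` text are illustrative; the fingerprints in `SAMPLE` are placeholders, not real relays; and the field layout is assumed to be exactly what the capture above shows.

```python
import re
from collections import Counter

# Field layout assumed from the events quoted above:
#   650 CIRC <id> <status> <path> KEY=VALUE ...
HEAD_RE = re.compile(r"650 CIRC (\d+) (\w+) (\S+)")
PURPOSE_RE = re.compile(r" PURPOSE=(\w+)")
CREATED_RE = re.compile(r" TIME_CREATED=(\S+)")


def built_rend_circuits(text):
    """Map circuit id -> (hop count, TIME_CREATED) for BUILT rendezvous circuits."""
    built = {}
    # Split in front of every event so several events on one line are handled.
    for chunk in re.split(r"(?=650 CIRC)", text):
        head = HEAD_RE.match(chunk)  # does not match "650 CIRC_MINOR ..."
        if not head or head.group(2) != "BUILT":
            continue
        purpose = PURPOSE_RE.search(chunk)
        created = CREATED_RE.search(chunk)
        if not purpose or purpose.group(1) != "HS_SERVICE_REND" or not created:
            continue
        hops = head.group(3).split(",")
        built[head.group(1)] = (len(hops), created.group(1))
    return built


def built_per_second(built):
    """Count built rendezvous circuits by the second in which they were created."""
    return Counter(created[:19] for _hops, created in built.values())


FP = "$" + "A" * 40  # placeholder fingerprint, not a real relay
PATH = ",".join(f"{FP}~hop{i}" for i in range(5))
FLAGS = "BUILD_FLAGS=IS_INTERNAL,NEED_CAPACITY PURPOSE=HS_SERVICE_REND"
SAMPLE = (
    f"650 CIRC 9802 EXTENDED {PATH} {FLAGS} HS_STATE=HSSR_CONNECTING "
    "TIME_CREATED=2022-08-24T19:45:22.060324 "
    f"650 CIRC 9802 BUILT {PATH} {FLAGS} HS_STATE=HSSR_CONNECTING "
    "TIME_CREATED=2022-08-24T19:45:22.060324 "
    f"650 CIRC_MINOR 9802 PURPOSE_CHANGED {PATH} {FLAGS} HS_STATE=HSSR_JOINED "
    "TIME_CREATED=2022-08-24T19:45:22.060324 OLD_PURPOSE=HS_SERVICE_REND "
    f"650 CIRC 9818 BUILT {PATH} {FLAGS} HS_STATE=HSSR_CONNECTING "
    "TIME_CREATED=2022-08-24T19:45:22.493699"
)

print(built_per_second(built_rend_circuits(SAMPLE)))
```

A sustained high count per second would support the reading that the load comes from rendezvous circuit construction, although it would not by itself show where the requests originate.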

We can make a larger tcpdump snippet available if needed.

  5. Conclusions

We think that an adversary has the ability to "interrupt" a Hidden Service Descriptor (and therefore the Hidden Service itself) by flooding the Tor daemon with a very large number of TCP packets requesting to build circuits. This is what causes the CPU load and ultimately renders the Hidden Service Descriptor unusable.

Can anyone confirm this?

Since directives like HiddenServiceEnableIntroDoSDefense, HiddenServiceEnableIntroDoSBurstPerSec and HiddenServiceEnableIntroDoSRatePerSec seem to be meant to defend against this sort of attack, just as vanguards should, we cannot explain why they remain ineffective. Maybe some very specific settings of those values are necessary to make them effective. Unfortunately, these directives (as well as the settings in vanguards.config) are only described in a vague way.

Does anyone know how these have to be set correctly to be effective?
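As a rough mental model only: if the rate and burst options behave like a conventional token bucket (an assumption, not something confirmed in this thread), then the burst value bounds how many introductions pass at once and the rate value bounds the sustained flow afterwards. The sketch below is a generic token bucket, not Tor's implementation. The class and function names are illustrative, and the values 25 and 200 are used on the assumption that they match the documented defaults.

```python
class TokenBucket:
    """Generic token bucket: refills at `rate` tokens/sec, holds at most `burst`."""

    def __init__(self, rate_per_sec, burst):
        self.rate = rate_per_sec
        self.burst = burst
        self.tokens = float(burst)  # start full
        self.last = 0.0

    def allow(self, now):
        """Return True if a request arriving at time `now` (seconds) may pass."""
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def forwarded(rate, burst, arrivals_per_sec, seconds):
    """Count how many evenly spaced arrivals the bucket lets through."""
    bucket = TokenBucket(rate, burst)
    passed = 0
    for i in range(arrivals_per_sec * seconds):
        if bucket.allow(i / arrivals_per_sec):
            passed += 1
    return passed


# Assumed values: rate 25/s, burst 200, under a flood of 1000 requests/s for 10 s.
print("flood:", forwarded(25, 200, 1000, 10), "of", 1000 * 10)
# The same limits with light traffic of 10 requests/s for 10 s.
print("light:", forwarded(25, 200, 10, 10), "of", 10 * 10)
```

Under this model a limiter caps how much reaches the service, but it cannot tell attacker requests from legitimate ones, so lowering the values also drops real clients during a flood. Whether that explains the behaviour seen here is an open question.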

At this point we have exhausted all references on tor and vanguards configuration that we could find online.

Again, any help or information on this issue would be greatly appreciated! We do not believe that there is no solution to this.

mikeperry-tor commented 2 years ago

We have been seeing sustained attacks from multiple adversaries against onion services, and against the Tor network itself, for the past 3 months. These adversaries are continually adapting to our defenses.

For the past 3 years, we have attempted to get funding for DoS defenses for onion services from various funding sources. Every single one of these funding sources declined to fund onion service development or defenses. Because of the way our funding model works, this proposal-reject churn also delayed development on other tasks. Effectively a DoS attack in and of itself.

During this process, one of the two primary developers on onion services resigned. The other primary onion service developer is currently on leave. In fact, we will only have 2 developers directly devoted to all of C-Tor for the next 6-8 weeks or so, and about half of that developer time will be spent writing and revising other funding proposals during this period. The remainder will be spent performing other already funded work, and related administration tasks.

We do have pending funding for general DoS defenses that should be finalized soon, but that also does not cover onion service development. We will need to show other progress for that funder; any DoS defenses specifically for onion services will still have to be done "on the side".

Normally, I would not air our dirty laundry like this, but since it will have a direct impact on people like you, it seemed important to explain why we're going to be especially unresponsive to issues with onion services for the next while.

I am also not sure exactly what kind of DoS attack you are seeing here. However, one thing that may help is to set the torrc MiddleNodes directive to a restricted set of specific high-speed Tor relays, by relay fingerprint. This will reduce the cost of circuit creation for your service, because fewer relays will need to be searched through when building circuits. It will also impact anonymity though, especially if one of those relays is compromised. However, in combination with OnionBalance, this may help you retain some kind of reachability.
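For anyone scripting this, a minimal sketch of turning a hand-picked list of relay fingerprints into a MiddleNodes line might look like the following. The helper name `middle_nodes_line` is hypothetical, the fingerprints in the example are placeholders rather than recommended relays, and the output assumes the comma-separated node list syntax that the tor manual uses for its other node-list options.

```python
import re

# A relay identity fingerprint is 40 hexadecimal digits.
FINGERPRINT_RE = re.compile(r"^[0-9A-Fa-f]{40}$")


def middle_nodes_line(fingerprints):
    """Validate fingerprints and return a single torrc MiddleNodes line."""
    cleaned = []
    for fp in fingerprints:
        fp = fp.strip().lstrip("$")  # accept the "$FINGERPRINT" form seen in logs
        if not FINGERPRINT_RE.match(fp):
            raise ValueError(f"not a 40-hex-digit relay fingerprint: {fp!r}")
        cleaned.append(fp.upper())
    if not cleaned:
        raise ValueError("need at least one fingerprint")
    return "MiddleNodes " + ",".join(cleaned)


# Placeholder fingerprints only; choose real high-speed relays yourself.
print(middle_nodes_line(["$" + "a" * 40, "B" * 40]))
```

Validating the list before writing it to torrc avoids silently shipping a typo in a fingerprint. Which relays to pick, and how many, remains the anonymity tradeoff described above.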

MiddleNodes can be used in combination with the vanguards addon, so that the MiddleNodes are chosen after the Layer2 and Layer3 Guards. But, while under DoS, you may want to consider either using only MiddleNodes, or using vanguards with the --one_shot_vanguards option, to reduce CPU load. Note that Tor 0.4.7 has a "vanguards-lite" defense that uses only Layer2 guards (but not Layer3 guards), so that will still provide some protection without the vanguards addon. The vanguards-lite rotation times are shorter than those of the vanguards addon, and it does not use Layer3 guards, so the tradeoff there is less protection.

The other answer is https://gitlab.torproject.org/tpo/core/tor/-/issues/40634, which we have prototyped, but again, staff shortages mean that we will struggle with getting that defense in shippable state for the next couple months.

Sorry for the bad news. Hope this helps.

cypherbits commented 2 years ago

Sad to read about the poor state of the Tor Project. I would contribute if I knew advanced C. But hey! I wanna learn Rust so I can do it in the future!

artemsiberiangit commented 2 years ago

I think I have a similar problem. I launch Tor, connect vanguards to it, and then issue ADD_ONION over control port 9051 to add 2 domains. After that, the first request to the site succeeds, but the second request returns an error, and Tor does not recover afterwards...

vanguards_1  | WARNING[Thu Sep 01 13:30:25 2022]: Possible Tor bug, or possible attack if very frequent: Got 1 dropped cell on circ 79 (in state HS_SERVICE_REND HSSR_JOINED; old state HS_SERVICE_REND HSSR_CONNECTING)
tor_1        | Sep 01 13:30:25.000 [warn] Unable to find any hidden service associated identity key 0IOlo0uQJyerxOCkYrVD6h8Nj23zdFsRpRlhjWquO3M on rendezvous circuit 2698458892.