Open rr-girl opened 2 years ago
We have been seeing sustained attacks from multiple adversaries against onion services, and against the Tor network itself, for the past 3 months. These adversaries are continually adapting to our defenses.
For the past 3 years, we have attempted to get funding for DoS defenses for onion services from various funding sources. Every single one of these funding sources declined to fund onion service development or defenses. Because of the way our funding model works, this proposal-reject churn also delayed development on other tasks. Effectively a DoS attack in and of itself.
During this process, one of the two primary developers on onion services resigned. The other primary onion service developer is currently on leave. In fact, we will only have 2 developers directly devoted to all of C-Tor for the next 6-8 weeks or so, and about half of that developer time will be spent writing and revising other funding proposals during this period. The remainder will be spent performing other already funded work, and related administration tasks.
We do have pending funding for general DoS defenses that should be finalized soon, but that also does not cover onion service development. We will need to show other progress for that funder; any DoS defenses specifically for onion services will still have to be done "on the side".
Normally, I would not air our dirty laundry like this, but since it will have a direct impact on people like you, it seemed important to explain why we're going to be especially unresponsive to issues with onion services for the next while.
I am also not sure exactly what kind of DoS attack you are seeing here, either. However, one thing that may help is to set the torrc MiddleNodes directive to a restricted set of specific high-speed Tor relays, by relay fingerprint. This will have the effect of reducing the cost of circuit creation for your service, because fewer relays will need to be searched through when building circuits. It will also impact anonymity though, especially if one of those relays is compromised. However, in combination with OnionBalance, this may help you have some kind of reachability.
MiddleNodes can be used in combination with the vanguards addon, so that the MiddleNodes are chosen after the Layer2 and Layer3 Guards. But, while under DoS, you may want to consider either only using MiddleNodes, or using vanguards with the --one_shot_vanguards option, to reduce CPU. Note that Tor 0.4.7 has a "vanguards-lite" defense of using only Layer2 guards (but not Layer3 Guards), so that will still provide some protection without the vanguards addon. The vanguards-lite rotation times are shorter than those of the vanguards addon, and it does not use Layer3 guards, so less protection is also the tradeoff there.
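As an illustration of the MiddleNodes suggestion, here is a minimal sketch that turns a hand-picked list of relay fingerprints into a torrc line. The helper name `middle_nodes_line` and the two fingerprints are placeholders invented for this example, not real relays and not a recommendation; which relays to pin is a decision with the anonymity tradeoff described above.

```python
import re

# A relay fingerprint is 40 hex digits, optionally prefixed with "$".
FINGERPRINT_RE = re.compile(r"\$?[0-9A-Fa-f]{40}")


def middle_nodes_line(fingerprints):
    """Return a torrc MiddleNodes line for the given relay fingerprints."""
    cleaned = []
    for fp in fingerprints:
        fp = fp.strip()
        if not FINGERPRINT_RE.fullmatch(fp):
            raise ValueError(f"not a 40-hex-digit relay fingerprint: {fp!r}")
        fp = fp.lstrip("$").upper()
        if fp not in cleaned:  # drop duplicates, keep the original order
            cleaned.append(fp)
    if not cleaned:
        raise ValueError("need at least one relay fingerprint")
    return "MiddleNodes " + ",".join(cleaned)


# Placeholder fingerprints only (hypothetical, not real relays).
EXAMPLE_RELAYS = ["A" * 40, "$" + "b" * 40]
print(middle_nodes_line(EXAMPLE_RELAYS))
```

The printed line goes into torrc as-is. A very short list concentrates all of the service's circuits on a few relays, so it is probably worth keeping the list as large as the CPU budget allows.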
The other answer is https://gitlab.torproject.org/tpo/core/tor/-/issues/40634, which we have prototyped, but again, staff shortages mean that we will struggle with getting that defense in shippable state for the next couple months.
Sorry for the bad news. Hope this helps.
Sad to read the poor state of the Tor Project. I would contribute if I knew advanced C. But hey! I wanna learn Rust so I can do it in the future!
I think I have a similar problem. I launch Tor, connect vanguards to it, and through port 9051 send an ADD_ONION command to add 2 domains. After that, the first request to the site succeeds, but on the second request the error below is logged, and Tor does not recover afterwards...
vanguards_1 | WARNING[Thu Sep 01 13:30:25 2022]: Possible Tor bug, or possible attack if very frequent: Got 1 dropped cell on circ 79 (in state HS_SERVICE_REND HSSR_JOINED; old state HS_SERVICE_REND HSSR_CONNECTING)
tor_1 | Sep 01 13:30:25.000 [warn] Unable to find any hidden service associated identity key 0IOlo0uQJyerxOCkYrVD6h8Nj23zdFsRpRlhjWquO3M on rendezvous circuit 2698458892.
Thank you for taking the time to read this! We are no experts in Tor networking or with Vanguards, but know our way around.
If anything described here makes sense to you and/or you have a suspicion on what is going on, any kind of feedback is extremely appreciated!
Our service was running fine and stable for a long time. Suddenly our service experienced light, then severe performance issues, and then became completely unreachable.
This does not seem to be a "regular" DDoS attack. It seems to be a DoS attack, but not like anything known, as it seems to be an attack on the Tor daemon alone. All countermeasures we tried so far were not successful. There is no attack noticeable on the actual services running (http, ssh, ftp, mail, etc.) but only on the Tor connection itself.
Even though we did deep research and investigation, the event seems to be beyond our comprehension. Documentation about prior attacks on Tor Hidden Services is hard to find and does not match what we are experiencing.
We refuse to believe that there is an attack scenario that is unknown and cannot be counteracted as it would render every single Tor Hidden Service vulnerable to it with adversaries being able to take any Tor Hidden Service offline at will within minutes for an indefinite time. The impact on the Tor Community would be beyond scope.
We hope somebody with the necessary insight and expertise can at least drop us a hint on what exactly is going on and point us in the right direction towards finding a solution to end this attack or reduce its impact on our systems and to prevent such attacks in the future. This would help protect Tor Hidden Services around the world from future attacks like the one we are experiencing.
torrc configuration:
Less than a minute after starting Tor with the Hidden Service enabled the CPU gets to 100%, memory seems unaffected
Used bandwidth increases around 100kb/s
The Hidden Service descriptor is unreachable
The system can still resolve clearnet and Tor addresses
The system can still connect to outbound services (ex. curl to a website)
The system gets flooded with incoming tcp packets on the loopback interface
When Tor is restarted, the flood ends for a few seconds, then starts again
When Tor is started without the Hidden Service enabled there seems to be no problem
When Tor is started with a second Hidden Service enabled, both Hidden Service Descriptors remain unreachable:
Tor Browser on Primary HS: Onionsite Has Disconnected The most likely cause is that the onionsite is offline. Contact the onionsite administrator. Details: 0xF2 — Introduction failed, which means that the descriptor was found but the service is no longer connected to the introduction point. It is likely that the service has changed its descriptor or that it is not running. or The connection has timed out The server at (attacked hidden service descriptor).onion is taking too long to respond. The site could be temporarily unavailable or too busy. Try again in a few moments.
Tor Browser on Secondary HS: Unable to connect Firefox can’t establish a connection to the server at (secondary hidden service descriptor).onion The site could be temporarily unavailable or too busy. Try again in a few moments.
When Tor is started with a second Hidden Service enabled which is protected by OnionAuthentication, the primary Hidden Service Descriptor remains unreachable, the secondary (protected) Hidden Service Descriptor is reachable.
When Tor is started with only the second Hidden Service enabled (with or without OnionAuthentication) there seems to be no problem
When Tor is started with the primary Hidden Service protected by OnionAuthentication, the Hidden Service Descriptor is reachable
When attacked, these entries appear in the tor log file:
Aug 24 19:42:18.000 [notice] Bootstrapped 100% (done): Done
Aug 24 19:42:34.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 388 buildtimes.
Aug 24 19:42:39.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 240 buildtimes.
Aug 24 19:42:53.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 489 buildtimes.
Aug 24 19:43:11.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 659 buildtimes.
...
Aug 24 19:46:09.000 [notice] Extremely large value for circuit build timeout: 122s. Assuming clock jump. Purpose 14 (Measuring circuit timeout)
Aug 24 19:46:09.000 [notice] Extremely large value for circuit build timeout: 122s. Assuming clock jump. Purpose 14 (Measuring circuit timeout)
Aug 24 19:46:09.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 114 buildtimes.
Aug 24 19:46:15.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 125 buildtimes.
...
Aug 24 19:47:08.000 [notice] Extremely large value for circuit build timeout: 123s. Assuming clock jump. Purpose 14 (Measuring circuit timeout)
Aug 24 19:47:10.000 [notice] Extremely large value for circuit build timeout: 122s. Assuming clock jump. Purpose 14 (Measuring circuit timeout)
Aug 24 19:47:10.000 [notice] Extremely large value for circuit build timeout: 123s. Assuming clock jump. Purpose 14 (Measuring circuit timeout)
Aug 24 19:47:13.000 [notice] Extremely large value for circuit build timeout: 123s. Assuming clock jump. Purpose 14 (Measuring circuit timeout)
Aug 24 19:47:14.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 495 buildtimes.
Aug 24 19:47:18.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 124 buildtimes.
Aug 24 19:47:21.000 [notice] Extremely large value for circuit build timeout: 122s. Assuming clock jump. Purpose 14 (Measuring circuit timeout)
Aug 24 19:47:23.000 [notice] Extremely large value for circuit build timeout: 123s. Assuming clock jump. Purpose 14 (Measuring circuit timeout)
...
Aug 24 19:47:55.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 1000 buildtimes.
Aug 24 19:47:59.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 117 buildtimes.
...
Aug 24 19:52:43.000 [notice] Strange value for circuit build time: 121581msec. Assuming clock jump. Purpose 14 (Measuring circuit timeout)
Aug 24 19:52:43.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 120000ms after 18 timeouts and 57 buildtimes.
Aug 24 19:52:53.000 [notice] Interrupt: exiting cleanly.
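To compare attack periods against quiet ones, it can help to count these notices per minute instead of reading them by eye. The following is a rough sketch that assumes the log uses the timestamp layout shown above, with one entry per line; the labels in `PATTERNS` and the function name `summarize` are made up for this example.

```python
import re
from collections import Counter

# Matches e.g. "Aug 24 19:42:34.000 [notice] message"; group 1 is the minute.
LINE_RE = re.compile(r"^(\w{3} \d{2} \d{2}:\d{2}):\d{2}\.\d{3} \[(\w+)\] (.*)$")

# Hypothetical labels for the three notice types seen in the log above.
PATTERNS = {
    "timeout_reset": "Your network connection speed appears to have changed",
    "huge_build_time": "Extremely large value for circuit build timeout",
    "strange_build_time": "Strange value for circuit build time",
}


def summarize(lines):
    """Count circuit-build-timeout notices per (minute, label)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip elisions ("...") and anything unparseable
        minute, _severity, message = match.groups()
        for label, needle in PATTERNS.items():
            if message.startswith(needle):
                counts[(minute, label)] += 1
    return counts


SAMPLE = [
    "Aug 24 19:42:18.000 [notice] Bootstrapped 100% (done): Done",
    "Aug 24 19:42:34.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60000ms after 18 timeouts and 388 buildtimes.",
    "Aug 24 19:46:09.000 [notice] Extremely large value for circuit build timeout: 122s. Assuming clock jump. Purpose 14 (Measuring circuit timeout)",
    "Aug 24 19:46:09.000 [notice] Extremely large value for circuit build timeout: 122s. Assuming clock jump. Purpose 14 (Measuring circuit timeout)",
]

for (minute, label), n in sorted(summarize(SAMPLE).items()):
    print(minute, label, n)
```

To run it on a real log, pass an open file object to `summarize` in place of `SAMPLE`.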
We set up a completely new server from scratch running only basic OS as well as tor and vanguards in standard configuration to exclude the possibility of a mis-configuration on our affected server. As soon as tor was started with the attacked descriptor the exact same things are happening.
We tried to split the load on our server via OnionBalance in this setup:
Server1: Runs OnionBalance for primary Hidden Service Descriptor
Server2/3/4: Runs the Hidden Service on new and different Hidden Service Descriptors
We added these directives to the hidden service block in torrc and tried various settings on them:
HiddenServiceEnableIntroDoSDefense 1
HiddenServiceEnableIntroDoSBurstPerSec
HiddenServiceEnableIntroDoSRatePerSec
This reduced CPU load significantly on servers 2/3/4, but the balanced service descriptor remains unreachable.
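For anyone trying to reproduce this setup, here is a small sketch that renders a hidden service block with the three directives filled in. The values 25 and 200 are the defaults I recall from the tor manual for the rate and burst options, so please verify them against your Tor version; they are not the values the poster used, which are not given above. The directory path, port mapping, the function name `intro_dos_block`, and the check that burst is not below rate are assumptions of this sketch.

```python
def intro_dos_block(hs_dir, virt_port, target, rate_per_sec=25, burst_per_sec=200):
    """Render a torrc hidden service block with intro-point DoS defenses enabled."""
    if rate_per_sec < 0 or burst_per_sec < 0:
        raise ValueError("rate and burst must be non-negative")
    # Assumption of this sketch: a burst below the sustained rate makes no sense.
    if burst_per_sec < rate_per_sec:
        raise ValueError("burst should not be smaller than rate")
    lines = [
        f"HiddenServiceDir {hs_dir}",
        f"HiddenServicePort {virt_port} {target}",
        "HiddenServiceEnableIntroDoSDefense 1",
        f"HiddenServiceEnableIntroDoSRatePerSec {rate_per_sec}",
        f"HiddenServiceEnableIntroDoSBurstPerSec {burst_per_sec}",
    ]
    return "\n".join(lines)


# Hypothetical directory and port mapping, for illustration only.
print(intro_dos_block("/var/lib/tor/hidden_service/", 80, "127.0.0.1:8080"))
```

As far as I understand, these options ask the introduction points to rate-limit introduction requests on the service's behalf, which would be consistent with the lower CPU load reported on servers 2/3/4.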
We tried changing various settings in vanguards.conf, with no success.
We tried to identify the attacking TCP packets and block them via iptables, with no success. Our expertise is not sufficient to work out what exactly is happening by inspecting the contents of these TCP packets, which look like this:
19:45:27.839934 IP (tos 0x0, ttl 64, id 35746, offset 0, flags [DF], proto TCP (6), length 4100) 127.0.0.1.9051 > 127.0.0.1.46712: Flags [P.], cksum 0x0df9 (incorrect -> 0xe713), seq 1543428574:1543432622, ack 1711981309, win 512, options [nop,nop,TS val 2971851406 ecr 2971851369], length 4048 E.....@.@..O........#[.x[...f ............. ."...".i650 CIRC 9802 EXTENDED $2BCA0A8B5759DBD764BF9FA5D1B3AEE9D74D2B68~Waeswynn,$CED577F091DCB15AD8C87FBD452A51EA9E60BFC2~strayWires,$CC8B218ED3615827A5DCF008FC62598DEF533B4F~mikrogravitation02,$7B46F20449D6F25150E189428B62E1E3BA5848A9~galtlandeu,$BF93594384A02DE7689C4FD821E2638DA2CD4792~labaliseridicule BUILD_FLAGS=IS_INTERNAL,NEED_CAPACITY PURPOSE=HS_SERVICE_REND HS_STATE=HSSR_CONNECTING REND_QUERY=|---------(attacked hidden service descriptor)---------| TIME_CREATED=2022-08-24T19:45:22.060324 650 CIRC 9802 BUILT $2BCA0A8B5759DBD764BF9FA5D1B3AEE9D74D2B68~Waeswynn,$CED577F091DCB15AD8C87FBD452A51EA9E60BFC2~strayWires,$CC8B218ED3615827A5DCF008FC62598DEF533B4F~mikrogravitation02,$7B46F20449D6F25150E189428B62E1E3BA5848A9~galtlandeu,$BF93594384A02DE7689C4FD821E2638DA2CD4792~labaliseridicule BUILD_FLAGS=IS_INTERNAL,NEED_CAPACITY PURPOSE=HS_SERVICE_REND HS_STATE=HSSR_CONNECTING REND_QUERY=|---------(attacked hidden service descriptor)---------| TIME_CREATED=2022-08-24T19:45:22.060324 650 CIRC_MINOR 9802 PURPOSE_CHANGED $2BCA0A8B5759DBD764BF9FA5D1B3AEE9D74D2B68~Waeswynn,$CED577F091DCB15AD8C87FBD452A51EA9E60BFC2~strayWires,$CC8B218ED3615827A5DCF008FC62598DEF533B4F~mikrogravitation02,$7B46F20449D6F25150E189428B62E1E3BA5848A9~galtlandeu,$BF93594384A02DE7689C4FD821E2638DA2CD4792~labaliseridicule BUILD_FLAGS=IS_INTERNAL,NEED_CAPACITY PURPOSE=HS_SERVICE_REND HS_STATE=HSSR_JOINED REND_QUERY=|---------(attacked hidden service descriptor)---------| TIME_CREATED=2022-08-24T19:45:22.060324 OLD_PURPOSE=HS_SERVICE_REND OLD_HS_STATE=HSSR_CONNECTING 650 CIRC 9818 EXTENDED 
$2BCA0A8B5759DBD764BF9FA5D1B3AEE9D74D2B68~Waeswynn,$CED577F091DCB15AD8C87FBD452A51EA9E60BFC2~strayWires,$CC8B218ED3615827A5DCF008FC62598DEF533B4F~mikrogravitation02,$7A319C431F38CB30A0BC0C49144369A611920725~BahnhufPowah2,$8587A1B4CCD0700F164CCD588F79743C74FE8700~mev4PLicebeer16b BUILD_FLAGS=IS_INTERNAL,NEED_CAPACITY PURPOSE=HS_SERVICE_REND HS_STATE=HSSR_CONNECTING REND_QUERY=|---------(attacked hidden service descriptor)---------| TIME_CREATED=2022-08-24T19:45:22.493699 650 CIRC 9818 BUILT $2BCA0A8B5759DBD764BF9FA5D1B3AEE9D74D2B68~Waeswynn,$CED577F091DCB15AD8C87FBD452A51EA9E60BFC2~strayWires,$CC8B218ED3615827A5DCF008FC62598DEF533B4F~mikrogravitation02,$7A319C431F38CB30A0BC0C49144369A611920725~BahnhufPowah2,$8587A1B4CCD0700F164CCD588F79743C74FE8700~mev4PLicebeer16b BUILD_FLAGS=IS_INTERNAL,NEED_CAPACITY PURPOSE=HS_SERVICE_REND HS_STATE=HSSR_CONNECTING REND_QUERY=|---------(attacked hidden service descriptor)---------| TIME_CREATED=2022-08-24T19:45:22.493699 650 CIRC_MINOR 9818 PURPOSE_CHANGED $2BCA0A8B5759DBD764BF9FA5D1B3AEE9D74D2B68~Waeswynn,$CED577F091DCB15AD8C87FBD452A51EA9E60BFC2~strayWires,$CC8B218ED3615827A5DCF008FC62598DEF533B4F~mikrogravitation02,$7A319C431F38CB30A0BC0C49144369A611920725~BahnhufPowah2,$8587A1B4CCD0700F164CCD588F79743C74FE8700~mev4PLicebeer16b BUILD_FLAGS=IS_INTERNAL,NEED_CAPACITY PURPOSE=HS_SERVICE_REND HS_STATE=HSSR_JOINED REND_QUERY=|---------(attacked hidden service descriptor)---------| TIME_CREATED=2022-08-24T19:45:22.493699 OLD_PURPOSE=HS_SERVICE_REND OLD_HS_STATE=HSSR_CONNECTING 650 CIRC 9997 EXTENDED $2BCA0A8B5759DBD764BF9FA5D1B3AEE9D74D2B68~Waeswynn,$8D896C8B367813030591A00DB7E7722EF6C4C23C~Luxembourg,$FF353F5D011E69ECDA10A57B46D06BC7B3FEB196~fuego,$347253D1D5246CB1C4CF8088C6982FE77CF7AB9C~ph3x,$E84F41FA1D1FA303FD7A99A35E50ACEF4269868C~Quetzalcoatl BUILD_FLAGS=IS_INTERNAL,NEED_CAPACITY PURPOSE=HS_SERVICE_REND HS_STATE=HSSR_CONNECTING REND_QUERY=|---------(attacked hidden service descriptor)---------| 
TIME_CREATED=2022-08-24T19:45:25.100429 650 CIRC 9997 BUILT $2BCA0A8B5759DBD764BF9FA5D1B3AEE9D74D2B68~Waeswynn,$8D896C8B367813030591A00DB7E7722EF6C4C23C~Luxembourg,$FF353F5D011E69ECDA10A57B46D06BC7B3FEB196~fuego,$347253D1D5246CB1C4CF8088C6982FE77CF7AB9C~ph3x,$E84F41FA1D1FA303FD7A99A35E50ACEF4269868C~Quetzalcoatl BUILD_FLAGS=IS_INTERNAL,NEED_CAPACITY PURPOSE=HS_SERVICE_REND HS_STATE=HSSR_CONNECTING REND_QUERY=|---------(attacked hidden service descriptor)---------| TIME_CREATED=2022-08-24T19:45:25.10
This is with Tor running on a single server. When balanced, the |---------(attacked hidden service descriptor)---------| is replaced by the backend service descriptors of servers 2/3/4.
We can make a larger tcpdump snippet available, if needed.
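The payload in the dump above is control-port event text (`650 CIRC ...` lines sent from port 9051), so it can be parsed to measure how fast rendezvous circuits are being built. Below is a rough sketch; the function names are invented for this example, the sample events are taken from the dump with their paths shortened to two hops and REND_QUERY omitted, and it assumes one event per line.

```python
from collections import Counter


def parse_circ_event(line):
    """Parse a '650 CIRC <id> <status> <path> KEY=VALUE ...' line into a dict."""
    parts = line.split()
    if len(parts) < 4 or parts[0] != "650" or parts[1] != "CIRC":
        return None
    event = {"id": parts[2], "status": parts[3], "path": [], "fields": {}}
    for token in parts[4:]:
        if token.startswith("$"):
            event["path"] = token.split(",")  # $FINGERPRINT~nickname per hop
        elif "=" in token:
            key, value = token.split("=", 1)
            event["fields"][key] = value
    return event


def rend_circuits_per_second(lines):
    """Count BUILT service-side rendezvous circuits by TIME_CREATED second."""
    per_second = Counter()
    for line in lines:
        event = parse_circ_event(line)
        if event is None or event["status"] != "BUILT":
            continue
        if event["fields"].get("PURPOSE") != "HS_SERVICE_REND":
            continue
        created = event["fields"].get("TIME_CREATED", "")
        per_second[created[:19]] += 1  # drop the fractional seconds
    return per_second


# Paths shortened to two hops for brevity.
PATH = ("$2BCA0A8B5759DBD764BF9FA5D1B3AEE9D74D2B68~Waeswynn,"
        "$CED577F091DCB15AD8C87FBD452A51EA9E60BFC2~strayWires")
FLAGS = "BUILD_FLAGS=IS_INTERNAL,NEED_CAPACITY PURPOSE=HS_SERVICE_REND HS_STATE=HSSR_CONNECTING"
SAMPLE_EVENTS = [
    f"650 CIRC 9802 EXTENDED {PATH} {FLAGS} TIME_CREATED=2022-08-24T19:45:22.060324",
    f"650 CIRC 9802 BUILT {PATH} {FLAGS} TIME_CREATED=2022-08-24T19:45:22.060324",
    f"650 CIRC 9818 BUILT {PATH} {FLAGS} TIME_CREATED=2022-08-24T19:45:22.493699",
    f"650 CIRC 9997 BUILT {PATH} {FLAGS} TIME_CREATED=2022-08-24T19:45:25.100429",
]

for second, count in sorted(rend_circuits_per_second(SAMPLE_EVENTS).items()):
    print(second, count)
```

A steadily high count per second would suggest that the service is being made to build rendezvous circuits faster than it can handle, which fits the CPU symptom. Treat that as a hypothesis to check against a larger capture, not a diagnosis.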
We think that an adversary has the ability to "interrupt" a Hidden Service Descriptor (and therefore the Hidden Service itself) by flooding the Tor daemon with uncountable tcp packets, requesting to build circuits. This is what causes the CPU load and ultimately renders the Hidden Service Descriptor unusable.
Can anyone confirm this?
Since directives like HiddenServiceEnableIntroDoSDefense, HiddenServiceEnableIntroDoSBurstPerSec and HiddenServiceEnableIntroDoSRatePerSec seem to be meant to defend against this sort of attack, just as vanguards should too, we cannot explain why they remain ineffective. Maybe some very specialized settings of those values are necessary to make them effective. Unfortunately these directives (as well as the settings in vanguards.conf) are only described in a vague way.
Does anyone know, how these have to be set correctly to be effective?
At this point we have exhausted all references on tor and vanguards configuration we could find online.
Again, any help or information on this issue would be greatly appreciated! We do not believe that there is no solution to this.