SemiConscious commented 2 years ago

Description

This is more a basis for discussion than a serious proposal to merge. I'm happy to take suggestions; I am not a tor expert, so there may be much that can be improved, e.g. using some other existing tor proxy setup, or not using one at all.

The goal is to obfuscate the origin IPs and keep them changing, and to provide a simple way of creating a db1000n installation with a proxy in AWS. Its immediate benefit is that it seems to work, as far as I can tell.

Notes:

var.name is changed as NLBs don't support underscores in their names
the db1000n servers will need to be restarted if the NLB is recreated. Is it possible to change the db1000n proxy config without restarting?
there is a var.zones variable that lets you configure the number of availability zones; it defaults to 2. LBs require >1
the tor instances have a netcat-based healthcheck. There's probably a better way
the tor instances have a cron job that sends HUP to the tor processes every minute, to get a new IP
works with the newly defaulted arm instances
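
As an illustration of the first note, here is a minimal sketch of the kind of renaming involved. It is not the Terraform code from this PR: the function name `to_nlb_name` and the example name `db1000n_server` are hypothetical, and the rules encoded (letters, digits and hyphens only, at most 32 characters, no leading or trailing hyphen) are my understanding of the AWS load balancer naming constraints.

```python
import re

# Assumed AWS limit on load balancer name length.
MAX_NLB_NAME_LENGTH = 32


def to_nlb_name(name: str) -> str:
    """Turn an arbitrary deployment name into one an NLB should accept."""
    # Replace anything that is not a letter, digit or hyphen (underscores included).
    cleaned = re.sub(r"[^A-Za-z0-9-]", "-", name)
    # Truncate first, then trim, so truncation cannot leave a trailing hyphen.
    cleaned = cleaned[:MAX_NLB_NAME_LENGTH].strip("-")
    if not cleaned:
        raise ValueError(f"cannot derive a load balancer name from {name!r}")
    return cleaned


# "db1000n_server" is a hypothetical value for var.name.
print(to_nlb_name("db1000n_server"))  # db1000n-server
```

In Terraform itself the same idea would normally be a one-line string substitution on the variable; the Python version is only meant to make the constraint explicit.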

Update after discussion with @Arriven

The proxy is now optional, configured using the boolean var.enable_tor_proxy. This defaults to false, as with all the related functionality I have added. Doing terraform apply with an earlier tfvars file will get the same results as previously, with the exception that the subnets are arranged slightly differently (with 2 availability zones).

You can switch the enable_tor_proxy setting on existing infrastructure, but the db1000n instances will need to be restarted as the db1000n docker invocation is different and the instances won't automatically restart.

I have changed the type of change to 'non breaking', as applying the new code from nothing will produce the same results as before. But running apply on existing infrastructure will change a few things, so I recommend destroy then apply.

Final note: I have made world visibility of port 22 optional, as I personally prefer to go in via the EC2 serial connection in the console, which works for instances in both public and private subnets, and I am a bit uncomfortable exposing any port to the outside world that I don't have to. By default, port 22 is world visible, as before.

Type of change


[x] Non-breaking change (new functionality added but default behaviour is as-before)

How Has This Been Tested?


Tests were carried out with the ireland variable file.

[x] Test A: apply/delete/apply - no problems with creating or deleting
[x] Test B: 24hr soak test in eu workspace
[x] Test C: apply + enable/disable the proxy

Test Configuration

Release version: git hash 1df67b9 (make funlen a little happier)
Platform: default

Logs

docker logs <container> | grep Response
 [ Response rate ]  0.0%
 [ Response rate ]  9.1%
 [ Response rate ] 14.3%
 [ Response rate ] 27.5%
 [ Response rate ] 41.7%
 [ Response rate ] 42.9%
 [ Response rate ] 47.3%
 [ Response rate ] 42.1%
 [ Response rate ] 44.0%
 [ Response rate ] 46.8%
 [ Response rate ] 49.7%
 [ Response rate ] 49.6%
 [ Response rate ] 53.3%
 [ Response rate ] 52.8%
 [ Response rate ] 53.1%
 [ Response rate ] 51.8%
 [ Response rate ] 56.0%
 [ Response rate ] 54.0%
 [ Response rate ] 54.3%
 [ Response rate ] 51.8%
 [ Response rate ] 52.9%
 [ Response rate ] 51.3%
 [ Response rate ] 51.9%
 [ Response rate ] 54.7%
 [ Response rate ] 53.6%
 [ Response rate ] 53.7%
...

Screenshots

arriven commented 2 years ago

Looks pretty good to me. Someone recently told me that they had issues with the tor proxy; can you confirm that it works for you? I was going to start looking into possible causes tomorrow, but knowing that it works for you would let me assume that the problem is in the configuration rather than in the code, and save some time.

the db1000n servers will need to be restarted if the NLB is recreated. Is it possible to change the db1000n proxy config without restarting?

It is possible, but it would require some changes to the default config (or use of a custom one). Basically, you'd want to embed each job that uses a proxy in an infinite loop job (hope I didn't forget to add it) and make the inner one perform only a fixed number of requests. The inner job would then end, but the outer loop would start it again, re-evaluating the proxylist template (you might need to add some additional templates, depending on how exactly you plan to pass this proxy to the program). Although I thought that tor changes the circuit without the need to recreate the proxy; am I missing something here?

SemiConscious commented 2 years ago

@Arriven thanks for the prompt reply. Tor seems to be working well for me both locally and in AWS. The NLB would only get recreated when something significant changes like changing the number of availability zones you load balance across, or a decision to change something else fundamental. I don't think it's necessary to change things in db1000n - there must be many other higher priorities! One thing I'm not sure of is how many existing AWS db1000n terraformers would welcome this change... I don't want to mess with anybody's existing/working setup. If there's something specific with tor that's creating a problem, I'm more than happy to take a look for you if that would be helpful. Is there a ticket for it?

SemiConscious commented 2 years ago

@Arriven - kudos for a bloody amazing tool by the way :-D

arriven commented 2 years ago

I'll try to find where that issue was mentioned

As for disrupting current workflows for people: there's always an option to have both old and new config in two different folders although I'd prefer having some switches to enable/disable tor NLB within a single terraform deployment

SemiConscious commented 2 years ago

I'd prefer having some switches to enable/disable tor NLB within a single terraform deployment

I did think about that. Leave it with me

SemiConscious commented 2 years ago

I have checked in a module-ised version of the tor proxy, and I have set the defaults to no-proxy.

Incidentally - I am seeing errors in my tor proxy I didn't see yesterday:

Mar 25 12:46:05 ip-10-0-0-136 Tor[1359]: We tried for 15 seconds to connect to '[scrubbed]' using exit $ at x.x.x.x. Retrying on a new circuit.

I am wondering if there is some clever auto-blacklisting going on at the other end, or some of the tor relays are off or disabled in some way - hopefully the network will notice and remove them from the pool.

I am honestly not sure how to tell how well this is working beyond tailing the Response rate in the logs. One difference between the Response rate on a proxy and a non-proxy build is that on the proxy builds the Response rate rises comparatively slowly before stabilising. My current response rate is in the high 60%s
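
To put a rough number on "rises comparatively slowly before stabilising", here is a minimal sketch that pulls the percentages out of `docker logs <container> | grep Response` output and reports the first sample at which a sliding window stays within a tolerance. The function names, the window size and the tolerance are arbitrary assumptions for illustration, and the line format is assumed to match the excerpt in the Logs section. It only describes when the figure settles; it says nothing about whether the traffic is accepted at the other end.

```python
import re

# Matches lines such as " [ Response rate ] 14.3%" (format taken from the Logs section).
RATE_PATTERN = re.compile(r"\[ Response rate \]\s*([0-9]+(?:\.[0-9]+)?)%")


def parse_rates(log_text: str) -> list[float]:
    """Return every response-rate percentage found in the log text, in order."""
    return [float(value) for value in RATE_PATTERN.findall(log_text)]


def stabilised_at(rates: list[float], window: int = 5, tolerance: float = 5.0):
    """Index of the first sample that starts a run of `window` samples whose
    max-min spread is at most `tolerance` percentage points, or None."""
    for start in range(len(rates) - window + 1):
        chunk = rates[start:start + window]
        if max(chunk) - min(chunk) <= tolerance:
            return start
    return None


# First six lines of the excerpt from the Logs section.
sample = """
 [ Response rate ]  0.0%
 [ Response rate ]  9.1%
 [ Response rate ] 14.3%
 [ Response rate ] 27.5%
 [ Response rate ] 41.7%
 [ Response rate ] 42.9%
"""

rates = parse_rates(sample)
print(rates)
print(stabilised_at(rates, window=2, tolerance=2.0))
```

Running the same function over the logs of a proxy build and a non-proxy build would give two indices that can be compared directly, instead of eyeballing the tail of the log.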

jdoe7865623 commented 2 years ago

@SemiConscious I've done some testing with tor as well (except I've built the tor service directly into the docker image). Unfortunately, I observe a significant reduction in the number of accepted sessions. The tor service log shows the same output as yours, constantly changing exits with little success. Have you looked at the generated traffic stats in comparison to going through a VPN or directly? I'm afraid the response rate as it stands right now is not an adequate measure of attack success. You can stay at 60% forever if your traffic is not accepted at all (which is what I observe with multiple targets).

SemiConscious commented 2 years ago

I agree - it's not a complete solution, and it was particularly bad on Friday - I have 5 proxy instances which were running at 100% CPU capacity with very poor network throughput. It seems VERY config-dependent. I noticed on Friday that there was a lot of targeting of ports other than 80/443, e.g. 5060 (SIP), and Tor doesn't support UDP, so Tor spends a lot of time retrying failures it can never resolve. Here's how my cluster is currently looking, which is a bit better:

[image: image.png]

I get, obviously, much higher performance if I run without the proxy ... but my assumption so far is that this is misleading as the attack will almost certainly be rejected by the target as it's from an unsafe country. The consequence of that, I thought (maybe), was that fewer higher quality results, even a lot fewer, might result in more damage at the other end. What's your view on that? We need to keep the servers at the other end working, hold open connections ... just hammering away with known-bad IPs won't be doing much IMHO.

One other thing - frustratingly and idiotically I left a key config line for tor out of the pull request. You need to add this line

echo "ExitNodes {ru},{pl},{by},{md},{cz} StrictNodes 1" >> /etc/tor/torrc

My next revision will include a way to set this as a variable, but it may be moot if it turns out that Tor is not the right solution for db1000n.

I am actually working right now on an OpenVPN version, I'm looking at Windscribe initially as a provider (it will probably need a pro account) - I'll do a pull request on that when I get it working properly and I'd appreciate your feedback on how it compares.

By the way I have been trying to see if I could utilise AWS edge data centers somehow as the IP origin for the attacks (using Global Accelerator), but I cannot at the moment see how to route outbound traffic through edge locations - it can only be set up to allow inbound traffic. There might be another way to do that that I am not aware of and I will keep looking - making use of edge locations in Bulgaria or Serbia might mean unblocked traffic without a proxy, so it would be a nice win! Though you'd need to keep recycling IP addresses :P


SemiConscious commented 2 years ago

By the way - I did consider putting Tor into the db1000n container. It would work great for those not running in the cloud. But for the cloud, I felt it was likely that the resource needs of the proxy wouldn't scale in step with the resource needs of db1000n. My plan eventually is to use autoscaling to grow whatever proxy containers we end up using, so that if proxies are resource hungry we can just scale them out to get the throughput we want. The downside of course is that load balancers aren't free. But nor is freedom!

We could code it not to require load balancers, but we'd need lambdas which fire on instance creation and destruction and restart the app containers. Or we could use service discovery via Route53, which again costs money, and I am not sure how db1000n treats proxy DNS names internally - if it resolves them once at startup, then we'd need to restart the app containers again whenever proxies are added or removed, which is not ideal. That's, FYI, why I put a load balancer in - there will only be 2 IP addresses, one for each availability zone.


SemiConscious commented 2 years ago

@Arriven @jdoe7865623 I have tried a lot of different things today. My main aim was to develop an openvpn solution to replace tor, which I have checked in, though it looks as if I'll need a new pull request.

The headline result is that it doesn't seem to make much difference, though with a 'pro' windscribe account using IP addresses in St Petersburg I'm not seeing any errors from openvpn. openvpn is running as a container alongside db1000n, and the db1000n container routes its traffic through the openvpn container using the docker `--network=container:xxx` switch. So no load balancer, and db1000n runs without a -proxy setting.

I also tried with more CPU, memory and network bandwidth. I could see very little difference, so it's not being limited by capacity at our end.

Running with neither openvpn nor tor, however, did make a difference, in the sense that connections cycled more quickly, and there was a LOT more incoming data to the application, like an order of magnitude. But IMHO this supports my theory that the setup with the proxy (tor or openvpn) is actually doing more damage to the target than not using the proxy, even though the performance appears better without the proxy. And held-open connections hopefully means that the other end is processing them very slowly, which is good news, right?

Here's output for ss -s, with the proxy:

[image: proxy]

and without:

[image: noproxy]

My interpretation of this data is that we are holding many more connections open when using the proxy than when we are not, thus doing more damage. But ... maybe that's wishful thinking, and maybe it's just as good to make many more connections even if they are trivially rejected at the other end due to IP geolocation. I honestly don't know ... but my gut feel is that firewall connections and application sockets are a more valuable resource than network bandwidth.
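
For anyone who wants to compare two `ss -s` captures as numbers rather than screenshots, here is a minimal sketch that extracts the TCP summary line and reports how many connections are established. The two sample strings are invented placeholders, not my measurements, and the parser assumes the usual `TCP: <total> (estab N, closed N, ...)` layout, which can differ slightly between iproute2 versions.

```python
import re


def parse_ss_summary(text: str) -> dict[str, int]:
    """Extract the counters from the 'TCP:' line of `ss -s` output."""
    match = re.search(r"^TCP:\s+(\d+)\s+\(([^)]*)\)", text, re.MULTILINE)
    if match is None:
        raise ValueError("no 'TCP:' summary line found")
    counts = {"total": int(match.group(1))}
    # Inside the parentheses the counters appear as "name value" pairs.
    for name, value in re.findall(r"([a-z]+)\s+(\d+)", match.group(2)):
        counts[name] = int(value)
    return counts


# Hypothetical captures, for illustration only.
with_proxy = "Total: 900\nTCP:   850 (estab 700, closed 100, orphaned 5, timewait 45)\n"
without_proxy = "Total: 400\nTCP:   350 (estab 50, closed 280, orphaned 2, timewait 18)\n"

for label, capture in (("proxy", with_proxy), ("noproxy", without_proxy)):
    counts = parse_ss_summary(capture)
    share = counts["estab"] / counts["total"]
    print(f"{label}: {counts['estab']} established of {counts['total']} ({share:.0%})")
```

Sampling this periodically on each instance would at least give a consistent series to compare between setups, whatever we decide the right measure is.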

Question: how should we be measuring how well we are doing?
Question: how to increase the number of concurrent connections db1000n is making?

arriven / db1000n

[DRAFT] Add load balanced tor proxy to AWS terraform config #406
