torproject / stem

Python controller library for Tor
https://stem.torproject.org/
GNU Lesser General Public License v3.0

`can_exit_to` a port returns True when using `ServerDescriptor` and False when using `RouterStatusEntry` #69

Open juga0 opened 3 years ago

juga0 commented 3 years ago

This happens when the exit policy accepts traffic to that port only for a subnet.

For instance, a relay with exit policy:

accept 133.0.0.0/8:443

>>> rs = controller.get_network_status("63BF46A63F9C21FD315CD061B3EAA3EB05283A0A")
>>> sd = controller.get_server_descriptor("63BF46A63F9C21FD315CD061B3EAA3EB05283A0A")
>>> rs.exit_policy.can_exit_to(port=443)
False
>>> sd.exit_policy.can_exit_to(port=443)
True

Why is can_exit_to returning different things? Should it be queried in a different way? Is it possible that this happens because sbws is not fetching microdescriptors?

atagar commented 3 years ago

Hi juga. Tor has two types of exit policy: full and micro.

Tor's full policy is a complete record of what a relay will and won't accept. Full policies only reside within server descriptors and are effectively unused, except by directory authorities to determine what the micro policy should be.

The micro policy is an abbreviated list of ports with an accept *:[port] policy. These are contained within the consensus and microdescriptors, and are used for most path selection in practice. Tor uses micro policies to dramatically reduce the amount of descriptor information it must download to work.
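As a rough illustration of why a micro policy is cheap to evaluate, the sketch below parses a summary such as "accept 53,563,5222-5223" and answers port queries without any address information. The MicroPolicy class is a hypothetical stand-in written for this example; it is not stem's MicroExitPolicy.

```python
class MicroPolicy:
    """Toy model of a micro exit policy: 'accept' or 'reject' plus a port list."""

    def __init__(self, summary):
        verb, _, ports = summary.partition(' ')
        if verb not in ('accept', 'reject'):
            raise ValueError('policy must start with accept or reject: %r' % summary)

        self.is_accept = verb == 'accept'
        self.ranges = []

        # Each entry is either a single port ('53') or a range ('5222-5223').
        for entry in ports.split(','):
            low, _, high = entry.partition('-')
            self.ranges.append((int(low), int(high or low)))

    def can_exit_to(self, port):
        listed = any(low <= port <= high for low, high in self.ranges)
        return listed if self.is_accept else not listed


policy = MicroPolicy('accept 53,563,5222-5223,8332-8333')
print(policy.can_exit_to(443))   # port 443 is not listed
print(policy.can_exit_to(5223))  # port 5223 falls inside 5222-5223
```

Because the summary carries no addresses, a rule such as accept 133.0.0.0/8:443 has no way to appear in it.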

Stem's can_exit_to() method can answer two different questions based on its strict argument...

  1. Can I exit to ANY destination with port 443?
  2. Can I exit to ALL destinations with port 443?
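One way to picture the two questions is a first-match walk over the rules in which a rule restricted to a subnet is only a possible match, because no destination address was given. The sketch below is a toy model written for this thread, not stem's implementation; the Rule tuple and this standalone can_exit_to function are hypothetical names.

```python
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    is_accept: bool
    network: Optional[str]  # None stands for '*' (any address)
    min_port: int
    max_port: int


def can_exit_to(rules, port, strict=False):
    """First matching rule wins; if nothing matches, treat it as a reject.

    Only the port is known, so a rule limited to a subnet is an uncertain
    match. strict=False (ANY) lets uncertain accepts win and skips uncertain
    rejects. strict=True (ALL) does the opposite.
    """
    for rule in rules:
        if not rule.min_port <= port <= rule.max_port:
            continue  # rule does not cover this port

        uncertain = rule.network is not None

        if uncertain and rule.is_accept == strict:
            continue  # this uncertain rule cannot settle the question

        return rule.is_accept

    return False


full_policy = [
    Rule(False, '10.0.0.0/8', 1, 65535),  # reject 10.0.0.0/8:*
    Rule(True, '133.0.0.0/8', 443, 443),  # accept 133.0.0.0/8:443
    Rule(True, None, 53, 53),             # accept *:53
    Rule(False, None, 1, 65535),          # reject *:*
]

print(can_exit_to(full_policy, 443))               # ANY destination on 443
print(can_exit_to(full_policy, 443, strict=True))  # ALL destinations on 443
```

Under this model the subnet-only accept is enough to answer ANY with True, while the private-range reject at the top is enough to answer ALL with False.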

Here's a demonstration of what the policies actually look like and how the calls differ...

Demo script

import stem.descriptor.remote

fingerprint = '63BF46A63F9C21FD315CD061B3EAA3EB05283A0A'

server_desc = stem.descriptor.remote.get_server_descriptors(fingerprint).run()[0]
consensus_desc = next(filter(lambda desc: desc.fingerprint == fingerprint, stem.descriptor.remote.get_consensus().run()))

print('=' * 80)
print('Server descriptor exit policy')
print('=' * 80)
print('')
print(server_desc.exit_policy)
print('')

print('=' * 80)
print('Consensus exit policy')
print('=' * 80)
print('')
print(consensus_desc.exit_policy)
print('')

print('=' * 80)
print('What class is our server descriptor exit policy?  %s' % type(server_desc.exit_policy).__name__)
print('What class is our consensus exit policy?  %s' % type(consensus_desc.exit_policy).__name__)
print('=' * 80)
print('Can our server descriptor exit to ANY port 443?  %s' % server_desc.exit_policy.can_exit_to(port = 443))
print('Can our consensus descriptor exit to ANY port 443?  %s' % consensus_desc.exit_policy.can_exit_to(port = 443))
print('=' * 80)
print('Can our server descriptor exit to ALL port 443?  %s' % server_desc.exit_policy.can_exit_to(port = 443, strict = True))
print('Can our consensus descriptor exit to ALL port 443?  %s' % consensus_desc.exit_policy.can_exit_to(port = 443, strict = True))
print('=' * 80)

Demo output

% python demo.py
================================================================================
Server descriptor exit policy
================================================================================

reject 0.0.0.0/8:*, reject 169.254.0.0/16:*, reject 127.0.0.0/8:*, reject 192.168.0.0/16:*, reject 10.0.0.0/8:*, reject 172.16.0.0/12:*, reject 5.9.158.75:*, reject 1.1.1.1:53, reject 4.2.2.2:53, reject 8.8.8.8:53, reject 9.9.9.9:53, reject 9.9.9.10:53, reject 108.166.183.219:53, reject 168.156.8.90/24:53, reject 184.105.193.73:53, reject 193.70.85.1:7777, accept 133.0.0.0/8:80, accept 133.0.0.0/8:443, accept *:53, accept *:563, accept *:706, accept *:749, accept *:853, accept *:873, accept *:994, accept *:1194, accept *:1293, accept *:1500, accept *:1533, accept *:1677, accept *:1755, accept *:2083, accept *:5190, accept *:5222, accept *:5223, accept *:5228, accept *:5269, accept *:5280, accept *:6667, accept *:6679, accept *:6697, accept *:7777, accept *:8074, accept *:8232-8233, accept *:8332-8333, accept *:9418, accept *:11371, accept *:19638, accept *:50002, accept *:64738, reject *:*

================================================================================
Consensus exit policy
================================================================================

accept 53,563,706,749,853,873,994,1194,1293,1500,1533,1677,1755,2083,5190,5222-5223,5228,5269,5280,6667,6679,6697,7777,8074,8232-8233,8332-8333,9418,11371,19638,50002,64738

================================================================================
What class is our server descriptor exit policy?  ExitPolicy
What class is our consensus exit policy?  MicroExitPolicy
================================================================================
Can our server descriptor exit to ANY port 443?  True
Can our consensus descriptor exit to ANY port 443?  False
================================================================================
Can our server descriptor exit to ALL port 443?  False
Can our consensus descriptor exit to ALL port 443?  False
================================================================================

We should describe this in our exit policy docs. Keeping this open to track that.

juga0 commented 3 years ago

Thanks @atagar for the detailed explanation. Now I know how we should query the exit policy in sbws.

juga0 commented 3 years ago

Hi @atagar,

I thought I had understood how it works, but I hadn't, and I ran into another issue I can't explain. So, let's see... [snip]

The micro policy is an abbreviated list of ports with an accept *:[port] policy. These are contained within the consensus and microdescriptors, and are used for most path selection in practice. Tor uses micro policies to dramatically reduce the amount of descriptor information it must download to work.

According to this, is it generally recommended to use the micro policy? When would the full policy be recommended instead?

Stem's can_exit_to() method can answer two different questions based on its strict argument...

1. Can I exit to **ANY** destination with port 443?

Just a terminology thing: intuitively I'd read ANY as meaning what ALL means here, and I'd call what is named ANY here SOME instead.

[snip]

Can our server descriptor exit to ANY port 443?  True
Can our consensus descriptor exit to ANY port 443?  False

Still not quite clear to me why the difference here. Because the micro policy is missing info about what is rejected by IP?

Can our server descriptor exit to ALL port 443?  False
Can our consensus descriptor exit to ALL port 443?  False

[snip]

In this case the result is the same with strict, but I ran into this other issue, in which the micro policy seems to be the way to go for the sbws case, i.e. "give me all the exits that can exit to 443 on all IPs":

consensus = controller.get_network_statuses()
consensus_l = list(consensus)
len(consensus_l) # 6386
exits = [r for r in consensus_l if r.exit_policy.can_exit_to(port=443, strict=True)]
len(exits) # 1168
[r for r in exits if r.fingerprint=="63BF46A63F9C21FD315CD061B3EAA3EB05283A0A"] # []
descs = controller.get_server_descriptors()
descs_l = list(descs)
len(descs_l) # 6386
exits = [r for r in descs_l if r.exit_policy.can_exit_to(port=443, strict=True)]
len(exits) # 0
exits = exits + [r for r in descs_l if r.exit_policy_v6.can_exit_to(port=443, strict=True)]
len(exits) # 544
[r for r in exits if r.fingerprint=="63BF46A63F9C21FD315CD061B3EAA3EB05283A0A"] # []

So i had assumed, that with the full policy and asking for ALL, i'd get all the relays that can exit to a port for all IPs, but it does seem to be the case for micro policy instead?

And with descriptors there's also exit_policy_v6. Does the consensus not have that because it doesn't take IPs into account?

atagar commented 3 years ago

According to this, is it generally recommended to use the micro policy?, when the full policy would be recommended instead?

Hi juga. Tor has grown organically over this last decade and microdescriptors are an artifact of that. Server descriptors date back to the dawn of tor, whereas microdescriptors were added much later and brought tradeoffs with them. Most relevant for our discussion here...

Users that desire to download server descriptors (and by extension use full, authoritative exit policies) can put UseMicrodescriptors 0 in their torrc.

So to answer your question the two policies answer subtly different questions...

Can our server descriptor exit to ANY port 443?  True
Can our consensus descriptor exit to ANY port 443?  False

Still not quite clear to me why the difference here. Because the micro policy is missing info about what is rejected by IP?

Yes. The full policy has accept 133.0.0.0/8:443 whereas the microdescriptor policy does not.

So i had assumed, that with the full policy and asking for ALL, i'd get all the relays that can exit to a port for all IPs, but it does seem to be the case for micro policy instead?

More microdescriptor policies than server descriptor policies can exit to ALL port 443s because server descriptors can reject individual IPs. For example, a server descriptor policy of "reject 1.2.3.4:443, accept *:443" would translate into a microdescriptor policy of "accept 443". The former doesn't exit to all port 443s, whereas the latter does.
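The translation described above can be sketched as follows. This is a deliberately simplified model, not Tor's actual summarization: it assumes that only wildcard-address rules decide whether a port is listed and that address-specific rules are ignored entirely, whereas Tor's real logic is more involved. The summarize function and its tuple layout are hypothetical.

```python
def summarize(rules):
    """Collapse a full policy into a micro-style summary (simplified model).

    rules: list of (is_accept, address, min_port, max_port), where an
    address of '*' means any destination. Address-specific rules are
    ignored here, which is an assumption of this sketch.
    """
    accepted = []

    for port in range(1, 65536):
        for is_accept, address, low, high in rules:
            if address == '*' and low <= port <= high:
                if is_accept:
                    accepted.append(port)
                break  # first wildcard rule covering the port decides

    # Collapse consecutive ports into [low, high] ranges.
    ranges = []
    for port in accepted:
        if ranges and ranges[-1][1] == port - 1:
            ranges[-1][1] = port
        else:
            ranges.append([port, port])

    if not ranges:
        return 'reject 1-65535'

    return 'accept ' + ','.join(
        str(low) if low == high else '%i-%i' % (low, high)
        for low, high in ranges
    )


# reject 1.2.3.4:443, accept *:443
print(summarize([(False, '1.2.3.4', 443, 443), (True, '*', 443, 443)]))
```

In this model the single-address reject vanishes from the summary, which is why the summarized policy claims more than the full one does.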

And with descriptors there's also exit_policy_v6. Does the consensus not have that because it doesn't take IPs into account?

IPv6 has separate exit policies which were appended to consensus documents relatively recently.

juga0 commented 3 years ago

Thanks @atagar for all the explanation.

One more thing, hopefully the last in this ticket. It seems then that for sbws I need a descriptor method that tells me whether an exit can exit to a port on all public IP addresses.

I was surprised that this line in my example returns 0 exits:

exits = [r for r in descs_l if r.exit_policy.can_exit_to(port=443, strict=True)]

I found that, for example, this exit rejects all the private IP addresses but not the public ones.

I think it's probably better to implement this in stem since it'd have to go over the full policy. Would you have time to implement it? If not I can try.

All of this comes up because, to avoid having to resolve a web server domain via Tor, sbws chooses whether to measure a relay in the exit position using the descriptor's can_exit_to(port), and as a result some exits will always fail to be measured.

An alternative would be to always try a second exit if the measurement fails. Well, this would need more explanation but i think you can get an idea.

atagar commented 3 years ago

Hi juga. Stem already has a method to drop private entries...

import stem.descriptor.remote

descriptors = stem.descriptor.remote.get_server_descriptors().run()

exits = [desc for desc in descriptors if desc.exit_policy.can_exit_to(port = 443, strict = True)]
print('%i relays can exit to every port 443' % len(exits))

exits = [desc for desc in descriptors if desc.exit_policy.strip_private().can_exit_to(port = 443, strict = True)]
print('%i relays can exit to every non-private port 443' % len(exits))

Demo output

% python demo.py
0 relays can exit to every port 443
1085 relays can exit to every non-private port 443
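To see why the count jumps once private entries are dropped, here is a toy model of the idea behind strip_private(). It is a sketch under stated assumptions rather than stem's code: the tuple layout, the PRIVATE_NETS list and the helper names are hypothetical, and only IPv4 private ranges are considered.

```python
import ipaddress

# Private and reserved IPv4 ranges assumed for this sketch.
PRIVATE_NETS = [ipaddress.ip_network(net) for net in (
    '0.0.0.0/8', '10.0.0.0/8', '127.0.0.0/8',
    '169.254.0.0/16', '172.16.0.0/12', '192.168.0.0/16',
)]


def is_private(address):
    if address == '*':
        return False

    net = ipaddress.ip_network(address, strict=False)
    return any(net.version == priv.version and net.subnet_of(priv)
               for priv in PRIVATE_NETS)


def strip_private(rules):
    """Drop rules that only concern private address space."""
    return [rule for rule in rules if not is_private(rule[1])]


def exits_to_all(rules, port):
    """Strict (ALL) check: any reject covering the port settles it as False."""
    for is_accept, address, low, high in rules:
        if not low <= port <= high:
            continue
        if is_accept and address != '*':
            continue  # a subnet-only accept cannot prove ALL
        return is_accept
    return False


policy = [
    (False, '192.168.0.0/16', 1, 65535),  # reject 192.168.0.0/16:*
    (False, '10.0.0.0/8', 1, 65535),      # reject 10.0.0.0/8:*
    (False, '1.1.1.1', 53, 53),           # reject 1.1.1.1:53
    (True, '*', 53, 53),                  # accept *:53
    (True, '*', 443, 443),                # accept *:443
    (False, '*', 1, 65535),               # reject *:*
]

print(exits_to_all(policy, 443))
print(exits_to_all(strip_private(policy), 443))
```

In this model port 53 still fails the strict check after stripping, because the reject for 1.1.1.1 is a public address and is kept.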

juga0 commented 3 years ago

Hi juga. Stem already has a method to drop private entries...

Awesome, thanks, and sorry I didn't see that.

atagar commented 3 years ago

No worries in the least juga. Exit policies are deceptively confusing, and that method is easy to miss. :P

atagar commented 3 years ago

Oops, stupid me. I wanted to keep this ticket open to expand the exit policy docs - reopening.