torproject / stem

Python controller library for Tor
https://stem.torproject.org/
GNU Lesser General Public License v3.0
257 stars 75 forks source link

tor_tools.is_valid_hidden_service_address doesn't validate address checksum #89

Closed ghost closed 3 years ago

ghost commented 3 years ago

is_valid_hidden_service_address does not properly validate v3 onions according to the rendezvous 3 spec.

All encoded v3 onion addresses are an ed25519 public key followed by a 2 byte sha3_256 checksum and version. This is why all valid v3 .onions end with 'd'.

I would find it useful for this function to validate the checksum so that I can avoid trying connections to invalid user-provided onions. It should probably also validate that it ends in 3 (d) for cases where the short checksum collides.

If the maintainers here agree with me, I can take a crack at it as it should not be hard to calculate the checksum. Python has had sha3 in stdlib since python 3.6, so it's in all officially maintained cpython versions.


   The onion address of a hidden service includes its identity public key, a
   version field and a basic checksum. All this information is then base32
   encoded as shown below:

     onion_address = base32(PUBKEY | CHECKSUM | VERSION) + ".onion"
     CHECKSUM = H(".onion checksum" | PUBKEY | VERSION)[:2]

     where:
       - PUBKEY is the 32 bytes ed25519 master pubkey of the hidden service.
       - VERSION is an one byte version field (default value '\x03')
       - ".onion checksum" is a constant string
       - CHECKSUM is truncated to two bytes before inserting it in onion_address

(Where H = sha3_256 according to the spec)

atagar commented 3 years ago

Hi Kevin, I'd be delighted for improvements to is_valid_hidden_service_address. Presently it's nothing more than a simple regex check...

# Hidden service addresses are sixteen or fifty six base32 characters.

HS_V2_ADDRESS_PATTERN = re.compile('^[a-z2-7]{16}$')
HS_V3_ADDRESS_PATTERN = re.compile('^[a-z2-7]{56}$')

My understanding is that you plan to base32 decode the input, then somehow check that the CHECKSUM bytes are a valid sha3_256 hash - is that correct?

ghost commented 3 years ago

Yes, I'd have it calculate the checksum in the way outlined in the spec then check that the ending two bytes match the ones in the string.

atagar commented 3 years ago

The trouble as I see it is that the PUBKEY and VERSION aren't input arguments of is_valid_hidden_service_address. All you have is the output CHECKSUM. As such I don't think we can do what you're proposing, but I'd be happy to see a patch.

ghost commented 3 years ago

VERSION is an optional argument (it checks both 2 and 3 right now if not specified). If v3 address, it is supposed to be part of the address (the string we are checking), as is the ed25519 public key.

ghost commented 3 years ago

I'm working on a patch now, i'll adjust/include a test for it.

ghost commented 3 years ago

Here's my patch with the validation and test, if needed i can make a PR

https://git.voidnet.tech/kev/stem/commit/c0cb1560cd61b9f3ba9020fc692f2a4b46f65ff3.patch

atagar commented 3 years ago

Thanks Kevin! Pushed with some minor adjustments.