WIP: Validate Inputs - Githubissues

namecoin / electrum-nmc

Namecoin port of Electrum Bitcoin client.

https://www.namecoin.org/

MIT License


WIP: Validate Inputs #338

Closed robertmin1 closed 5 months ago

JeremyRand commented 1 year ago

What's the origin of the regex patterns used in this PR? (Knowing this would make auditing a lot easier.)

JeremyRand commented 1 year ago

For validation that isn't reliant on Qt, I'd prefer that the functions be added to names.py rather than a gui subfolder; this way they can be reused in the CLI and the Kivy GUI later.

JeremyRand commented 1 year ago

I think I'd prefer that the validation functions raise an Exception on rejection rather than returning a bool. Seems harder to misuse.
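
A minimal sketch of why the exception style is harder to misuse. All names here (`ValidationError`, `is_valid_ipv4`, `validate_ipv4`, `save_record_bool`, `save_record_raise`) are hypothetical and not taken from the PR. With a bool return, a caller that forgets to check the result lets invalid input through silently; with an exception, the invalid input cannot get past the call.

```python
import ipaddress


class ValidationError(ValueError):
    """Hypothetical exception type for rejected name values."""


def is_valid_ipv4(value: str) -> bool:
    """Bool-returning style: the caller must remember to check the result."""
    try:
        ipaddress.IPv4Address(value)
    except ValueError:
        return False
    return True


def validate_ipv4(value: str) -> None:
    """Exception-raising style: returns None on success, raises on rejection."""
    try:
        ipaddress.IPv4Address(value)
    except ValueError as e:
        raise ValidationError(f"Invalid IPv4 address {value!r}: {e}") from e


def save_record_bool(value: str) -> str:
    is_valid_ipv4(value)  # Bug: result ignored, so invalid input slips through
    return value


def save_record_raise(value: str) -> str:
    validate_ipv4(value)  # Invalid input raises here and never reaches the return
    return value
```

Here `save_record_bool("999.1.1.1")` returns the bad value unchanged, while `save_record_raise("999.1.1.1")` raises `ValidationError`.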

robertmin1 commented 1 year ago

> What's the origin of the regex patterns used in this PR? (Knowing this would make auditing a lot easier.)

Most of the patterns are generated from regex-generator.olafneumann.org

robertmin1 commented 1 year ago

> I think I'd prefer that the validation functions raise an Exception on rejection rather than returning a bool. Seems harder to misuse.
>
> For validation that isn't reliant on Qt, I'd prefer that the functions be added to names.py rather than a gui subfolder; this way they can be reused in the CLI and the Kivy GUI later.

Alright! I will make the changes.

JeremyRand commented 1 year ago

> Most of the patterns are generated from regex-generator.olafneumann.org

I think for most of these, we should be able to find more canonical validation mechanisms. For example, Electrum deals with IP addresses already, presumably there should be some library function available to validate an IP address? Similarly, ZeroNet addresses are P2PKH Bitcoin addresses; presumably we already have a library function to validate those? For cases where we don't already have a function available in an existing dependency, it might be preferable to validate according to the spec (e.g. a Tor spec for onion service addresses), with each rule in the spec corresponding to one line of Python code. This would make it a lot easier to audit. Alternatively, if there's a high-quality library available (e.g. Stem for onion addresses), we could maybe add that as a dependency, or just copy a Python function from them. I think in the case of onion addresses, probably there's something in Stem that we could copy verbatim rather than adding Stem as a full dependency.

robertmin1 commented 1 year ago

A longer version of the current function for validating ZeroNet addresses (comparing the checksum):

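import hashlib

import base58  # third-party base58 package

def validate_p2pkh_address(address):
    # Step 1: Check the length
    if not (27 <= len(address) <= 36):
        return False

    try:
        # Step 2: Base58 decoding (current releases of the base58 package
        # raise ValueError on invalid characters; there is no Base58Error)
        decoded_address = base58.b58decode(address)
    except ValueError:
        return False

    # Step 3: Check the decoded size (1 version byte + 20-byte hash
    # + 4-byte checksum) and the version byte
    if len(decoded_address) != 25 or decoded_address[0] != 0x00:
        return False

    # Step 4: Double SHA-256 checksum over everything except the last 4 bytes
    checksum = hashlib.sha256(hashlib.sha256(decoded_address[:-4]).digest()).digest()

    # Step 5: Compare checksum
    return decoded_address[-4:] == checksum[:4]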
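
For comparison, a stdlib-only sketch of the same Base58Check steps that raises on rejection instead of returning a bool, as requested earlier in the thread. This is an illustration under assumptions, not the PR's code: `b58decode` and `validate_p2pkh_address_strict` are hypothetical helpers, and an existing Electrum library function would still be preferable if one fits.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58decode(text: str) -> bytes:
    """Decode a Base58 string (Bitcoin alphabet) to bytes; raise ValueError on bad characters."""
    number = 0
    for char in text:
        index = B58_ALPHABET.find(char)
        if index < 0:
            raise ValueError(f"Invalid Base58 character {char!r}")
        number = number * 58 + index
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    # Each leading '1' encodes one leading zero byte
    leading_zeros = len(text) - len(text.lstrip("1"))
    return b"\x00" * leading_zeros + body


def validate_p2pkh_address_strict(address: str) -> None:
    """Raise ValueError unless address is a Base58Check P2PKH address with version byte 0x00."""
    decoded = b58decode(address)
    if len(decoded) != 25:  # 1 version byte + 20-byte hash + 4-byte checksum
        raise ValueError("Invalid decoded length")
    if decoded[0] != 0x00:
        raise ValueError("Invalid version byte")
    checksum = hashlib.sha256(hashlib.sha256(decoded[:-4]).digest()).digest()[:4]
    if decoded[-4:] != checksum:
        raise ValueError("Invalid checksum")
```

Calling `validate_p2pkh_address_strict("1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa")` (the Bitcoin genesis-block address) returns without raising; changing any character is expected to raise `ValueError`.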

robertmin1 commented 1 year ago

The current idea for I2P is something similar to this.

robertmin1 commented 1 year ago

The current idea for validating onion addresses without depending on the Stem module, ported from the suggested Rust code:

import base64
import hashlib

V3_ONION_SERVICE_ID_RAW_SIZE = 35
V3_ONION_SERVICE_ID_VERSION_OFFSET = 34
ED25519_PUBLIC_KEY_SIZE = 32
V3_ONION_SERVICE_ID_CHECKSUM_OFFSET = 32

def validate_onion_address(service_id):
    """Raise ValueError unless service_id is a valid v3 onion service ID."""
    try:
        # Decode the service ID from base32; casefold=True accepts lowercase input
        decoded_service_id = base64.b32decode(service_id.encode(), casefold=True)

        # Check decoded service ID has the expected length
        if len(decoded_service_id) != V3_ONION_SERVICE_ID_RAW_SIZE:
            raise ValueError("Invalid service ID length")

        # Check the version byte
        version_byte = decoded_service_id[V3_ONION_SERVICE_ID_VERSION_OFFSET]
        if version_byte > 0x03:
            raise ValueError(f"Warning: Unknown version byte {version_byte}")

        if version_byte != 0x03:
            raise ValueError("Invalid version byte")

        # Extract the public key from the decoded service ID
        public_key = decoded_service_id[:ED25519_PUBLIC_KEY_SIZE]

        # Calculate the truncated checksum
        truncated_checksum = calc_truncated_checksum(public_key)

        # Check that the truncated checksum matches the two checksum bytes in the decoded service ID
        offset = V3_ONION_SERVICE_ID_CHECKSUM_OFFSET
        if truncated_checksum != decoded_service_id[offset:offset + 2]:
            raise ValueError("Invalid checksum")

    except Exception as e:
        # Report every failure (including decoding errors) as a ValueError
        raise ValueError("Invalid service ID: " + str(e)) from e

def calc_truncated_checksum(public_key):
    # The checksum is the first two bytes of
    # SHA3-256(".onion checksum" || public key || version byte)
    hasher = hashlib.sha3_256()
    hasher.update(b".onion checksum")
    hasher.update(public_key)
    hasher.update(bytes([0x03]))

    return hasher.digest()[:2]
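
To exercise a validator like the one above, it helps to have well-formed inputs. Below is a minimal sketch of the inverse operation, using the same layout (32-byte public key, 2-byte checksum, version byte). `encode_v3_onion_service_id` is a hypothetical helper name, and the key bytes in the usage note are arbitrary test data, not a real Ed25519 key (the validator above does not check that the key is a valid curve point).

```python
import base64
import hashlib


def encode_v3_onion_service_id(public_key: bytes) -> str:
    """Build a v3 service ID (without the ".onion" suffix) from 32 public-key bytes."""
    if len(public_key) != 32:
        raise ValueError("Public key must be exactly 32 bytes")
    version = b"\x03"
    # Checksum: first two bytes of SHA3-256(".onion checksum" || key || version)
    checksum = hashlib.sha3_256(b".onion checksum" + public_key + version).digest()[:2]
    # 35 raw bytes encode to exactly 56 base32 characters, so no padding appears
    return base64.b32encode(public_key + checksum + version).decode("ascii").lower()
```

For example, `encode_v3_onion_service_id(bytes(range(32)))` yields a 56-character string that should pass `validate_onion_address`, and flipping one of its characters should make validation fail.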

robertmin1 commented 12 months ago

Made changes to the handling of the version byte.

JeremyRand commented 10 months ago

The UI element that shows onion errors seems to be user-editable, which is not desirable. Not sure if maybe you accidentally used a line edit widget instead of a label widget?

Also if I type blah.onion in IPv4 mode, and then switch the drop-down to Tor, validation doesn't happen until I edit the address again.

JeremyRand commented 10 months ago

For validating IP addresses, the ipaddress module in the Python standard library looks like what you want.
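
A minimal sketch of what that could look like, raising on rejection as discussed earlier in the thread. The function name `validate_ip_address` and its `version` parameter are hypothetical; only `ipaddress.IPv4Address` and `ipaddress.IPv6Address` come from the standard library, and both raise `ipaddress.AddressValueError` (a `ValueError` subclass) on malformed input.

```python
import ipaddress


def validate_ip_address(value: str, version: int) -> None:
    """Raise ValueError unless value is a well-formed IPv4 or IPv6 address."""
    if version == 4:
        ipaddress.IPv4Address(value)  # Raises AddressValueError on bad input
    elif version == 6:
        ipaddress.IPv6Address(value)  # Raises AddressValueError on bad input
    else:
        raise ValueError(f"Unsupported IP version: {version}")
```

With this, `validate_ip_address("192.0.2.1", 4)` and `validate_ip_address("2001:db8::1", 6)` return without raising, while `validate_ip_address("blah.onion", 4)` raises `ValueError`.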