Closed ChanningBarnes closed 1 week ago
Looks good - nice job researching this
@ChanningBarnes don't forget to add the milestone
@c-rubio What are your thoughts on this?
Regarding Ethereum, some more specificity is required to establish validity. An Ethereum address is 20 bytes long. The 0x prefix indicates hexadecimal notation, so the 40 characters that follow should be 20 bytes written as hexadecimal pairs. References: Ethereum Whitepaper, Valid Wallet Addresses
Thus, we must ensure that a supplied Ethereum address starting with 0x is in valid hexadecimal format, which accepts the digits 0-9 and the letters A-F in either case. Each pair of characters ranges from 00 to FF and encodes one byte, so 20 pairs make a complete address.
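A minimal sketch of that check, assuming Python and only the standard library. The names `ETH_ADDRESS_RE`, `is_well_formed_eth_address`, and `eth_address_bytes` are hypothetical, and the check covers format only (it does not verify that the address exists or that its letter casing follows any checksum convention):

```python
import re

# "0x" followed by exactly 40 hexadecimal characters (20 byte pairs).
# Hypothetical name; upper- and lower-case hex letters are both accepted.
ETH_ADDRESS_RE = re.compile(r"0x[0-9a-fA-F]{40}")


def is_well_formed_eth_address(candidate: str) -> bool:
    """Return True if the whole string is 0x plus 40 hex characters."""
    return ETH_ADDRESS_RE.fullmatch(candidate) is not None


def eth_address_bytes(candidate: str) -> bytes:
    """Decode the 40 hex characters into the 20 raw address bytes."""
    if not is_well_formed_eth_address(candidate):
        raise ValueError("not a well-formed Ethereum address")
    return bytes.fromhex(candidate[2:])
```

Decoding with `bytes.fromhex` makes the "20 pairs is 20 bytes" point concrete: a well-formed address always decodes to a 20-byte value.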
Description
A wallet address for a crypto wallet. This is a unique string of alphanumeric characters used to send and receive cryptocurrency. These addresses are typically pseudonymous; however, they can be traced back to a personal identity.
We will provide detection for Bitcoin and Ethereum addresses.
Detection Methods
Bitcoin
Presidio, Microsoft's data protection SDK, supports the CRYPTO entity type only for Bitcoin addresses. It detects them through pattern matching, context, and checksum validation.
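To illustrate what a checksum step involves, here is a minimal sketch of Base58Check validation for legacy Bitcoin addresses, assuming Python and only the standard library. This is not Presidio's code; the names `B58_ALPHABET` and `base58check_is_valid` are hypothetical, and Bech32 addresses use a different checksum that this sketch does not cover:

```python
import hashlib

# Base58 alphabet used by Bitcoin: no 0, O, I, or l.
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check_is_valid(address: str) -> bool:
    """Check the 4-byte double-SHA-256 checksum of a legacy address."""
    num = 0
    for ch in address:
        idx = B58_ALPHABET.find(ch)
        if idx == -1:
            return False  # character outside the Base58 alphabet
        num = num * 58 + idx
    try:
        # 1 version byte + 20 hash bytes + 4 checksum bytes
        raw = num.to_bytes(25, "big")
    except OverflowError:
        return False  # decodes to more than 25 bytes
    payload, checksum = raw[:-4], raw[-4:]
    digest = hashlib.sha256(hashlib.sha256(payload).digest()).digest()
    return digest[:4] == checksum
```

A checksum of this kind is what lets a detector reject strings that merely look like addresses, which a pattern match alone cannot do.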
Bitcoin commonly uses three address formats: P2PKH (Pay-to-Public-Key-Hash), P2SH (Pay-to-Script-Hash), and Bech32.
P2PKH addresses always start with "1" (ex: 18HhEaHH4cvGvyWwgAYeM1H8vzHtVKYyf1)
P2SH addresses always start with "3" (ex: 3H28N5WuREZ93CNmhWcRcrnykWrMqkhFyWN)
Bech32 addresses always start with "bc1" (ex: bc1qu5ujlp9dkvtgl98jakvw9ggj9uwyk79qhvwvrg)
Legacy Bitcoin addresses (P2PKH and P2SH) are typically 26-35 characters long. Bech32 addresses are longer; the Bech32 example above is 42 characters.
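The prefix and length rules above can be sketched as a simple pattern classifier, assuming Python and only the standard library. The names `BTC_PATTERNS` and `classify_btc_address` are hypothetical, the Bech32 length bounds are a loose assumption, and no checksum is verified, so a match only means the string has the right shape:

```python
import re
from typing import Optional

# Base58 character class (no 0, O, I, l) for legacy addresses.
_BASE58 = "1-9A-HJ-NP-Za-km-z"

# Hypothetical pattern table keyed by address format.
BTC_PATTERNS = {
    # Leading "1" or "3" plus 25-34 more characters: 26-35 in total.
    "P2PKH": re.compile(rf"1[{_BASE58}]{{25,34}}"),
    "P2SH": re.compile(rf"3[{_BASE58}]{{25,34}}"),
    # "bc1" plus lower-case Bech32 characters (no 1, b, i, o);
    # the 39-59 bound (42-62 total) is an assumption for this sketch.
    "Bech32": re.compile(r"bc1[02-9ac-hj-np-z]{39,59}"),
}


def classify_btc_address(candidate: str) -> Optional[str]:
    """Return the matching format name, or None if no pattern fits."""
    for name, pattern in BTC_PATTERNS.items():
        if pattern.fullmatch(candidate):
            return name
    return None
```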
We will utilize Presidio to detect Bitcoin addresses.
Ethereum
Ethereum public addresses all share the same format: they begin with "0x" followed by 40 alphanumeric characters. (ex: 0x71C7656EC7ab88b098defB751B7401B5f6d8976F)
To detect an Ethereum address, we will check for any strings that begin with "0x" and are followed by 40 alphanumeric characters.
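A minimal sketch of this scan over free text, assuming Python and only the standard library. It applies the tighter hexadecimal restriction raised in the review comment rather than any alphanumeric character, and the names `ETH_IN_TEXT_RE` and `find_eth_addresses` are hypothetical:

```python
import re
from typing import List

# "0x" plus exactly 40 hex characters, not embedded in a longer
# alphanumeric run (so a 64-character hash is not reported as an address).
ETH_IN_TEXT_RE = re.compile(
    r"(?<![0-9a-zA-Z])0x[0-9a-fA-F]{40}(?![0-9a-zA-Z])"
)


def find_eth_addresses(text: str) -> List[str]:
    """Return every Ethereum-shaped address found in the text."""
    return ETH_IN_TEXT_RE.findall(text)
```

The lookbehind and lookahead are a design choice for this sketch: without them, the first 40 characters of any longer hexadecimal string beginning with "0x" would be flagged.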