Closed vbuterin closed 7 years ago
@tauteh1221 This was bound to happen. I am afraid we will see even larger errors in the future between ICAP Direct and ICAP Basic. The two formats differ by a single character, and there is a ~1% chance the checksum will say "go ahead".
@simenfd Agreed, I even predicted it further up in this thread: https://github.com/ethereum/EIPs/issues/55#issuecomment-186614582
The good news is there are surely non-technical folks using Ethereum now, just have to balance security and features with ease-of-use and failsafes going forward to keep them around.
Since this is implemented in several places now, and since the actual implementation doesn't match that described in the initial post, would it be possible to write this up as a proper EIP and submit it, so one doesn't have to read the whole bug thread to determine what's in actual use?
@Arachnid you're right
Maybe it's a little bit off-topic, but I've tested some SHA3 implementations for PHP (https://github.com/strawbrary/php-sha3 , https://github.com/0xbb/php-sha3 , https://notabug.org/desktopd/PHP-SHA3-Streamable) and some JS libraries (https://github.com/emn178/js-sha3 , https://github.com/Caligatio/jsSHA/releases/tag/v2.2.0).
Hashing example string: qwerty
Hash output variant / length: 224
All of them produce the following hash: 13783bdfa4a63b202d9aa1992eccdd68a9fa5e44539273d8c2b797cd
The output of the Crypto-JS SHA3 implementation is completely different: d7a12ecec4442f1b31eea5f7d5470f0ca6169463e09d91a147c3b8e8
Someone mentioned this issue already at stackoverflow: http://stackoverflow.com/questions/36657354/cryptojs-sha3-and-php-sha3
So checksumAddresses only works "correctly" with Crypto-JS. With the other libraries it fails, because a different hash yields different uppercase and lowercase characters.
What you are seeing is SHA3-224 vs Keccak-224. Check for yourself at: https://emn178.github.io/online-tools/keccak_224.html
What you want is SHA-3, that is the "standard", and most compatible with other libraries.
Remember also that Ethereum is using Keccak, not SHA3.
Short summary because it seems that implementations have evolved and I chased the correct implementation.
Python's implementation:
@chevdor the main tree is at https://github.com/ethereumjs/ethereumjs-util/ and passes the tests listed in https://github.com/ethereum/EIPs/issues/55#issuecomment-187765837
8-f is 50/50.
On Thu, Oct 20, 2016, 2:24 PM Chevdor notifications@github.com wrote:
Short summary because it seems that implementations have evolved and I chased the correct implementation.
- Initially, @vbuterin https://github.com/vbuterin suggested to capitalise whenever the hash character is a..f
- It has been suggested to do 50/50: 9..f
- The current implementation from @axic https://github.com/axic capitalises from 8..f (axic/ethereumjs-util@525196d, diff-168726dbe96b3ce427e7fedce31bb0bcR340)
Capitalising for >= 8 or >= a should be identical as 8 and 9 cannot be capitalised anyway.
This isn't correct. You capitalize based on the digit in the sha3 of the lowercased 40-character (20-byte) hexadecimal representation of the address. The capitalization is applied to the actual characters of the address itself, so there is a difference between >=8 and >=9. >=8 is the correct implementation.
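A minimal sketch of why the threshold matters. The two-character "address" and "hash" are made up purely to expose the difference, and `apply_case` is a hypothetical helper, not a function from any of the libraries mentioned here. The decision looks at the hash digit, while the character being changed is the address digit, so a hash digit of 8 or 9 can still uppercase a letter in the address.

```python
def apply_case(addr_hex: str, hash_hex: str, threshold: int) -> str:
    """Uppercase letter i of addr_hex when hash digit i is >= threshold."""
    out = []
    for ch, h in zip(addr_hex.lower(), hash_hex):
        if ch in "abcdef" and int(h, 16) >= threshold:
            out.append(ch.upper())
        else:
            out.append(ch)
    return "".join(out)


# Made-up inputs: the first hash digit is 8, which sits between the two thresholds.
addr, fake_hash = "ab", "8f"
print(apply_case(addr, fake_hash, 8))    # AB
print(apply_case(addr, fake_hash, 0xA))  # aB
```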
Another python implementation here: https://github.com/pipermerriam/web3.py/blob/master/web3/utils/address.py#L45
On Fri, Oct 21, 2016 at 3:37 AM Alex Beregszaszi notifications@github.com wrote:
@chevdor https://github.com/chevdor the main tree is at https://github.com/ethereumjs/ethereumjs-util/ and passes the tests listed in this EIP.
Capitalising for >= 8 or >= a should be identical as 8 and 9 cannot be capitalised anyway.
@pipermerriam I think you are commenting on old comments. I discussed with @axic and the topic is clear. I do agree with your comment about >=8 not being the same as >=9, since it is based on the hash.
@chevdor not sure what happened there. I must have been looking at really old email notifications or something. 😄 carry on.. nothing to see here...
@pipermerriam I've commented that without reading the implementation from months ago :smiley:
Initially, @vbuterin suggested to capitalise whenever the hash character is a..f
No. The original proposal capitalizes the n-th hex-digit whenever the n-th bit in the hash of the address is set. So the first 40 bits of the 224 bit hash are used.
The current implementation modifies this by taking the hash of the lowercase hexadecimal encoding of the address and then using every fourth bit for capitalization (so 1st bit, 5th bit, etc.). The main reason for this extra complexity is that JavaScript and its libraries are bad at handling binary data, and this is somehow easier.
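To connect "every fourth bit" with the ">= 8" rule discussed earlier, here is a small sketch. It assumes a 256-bit digest and uses SHA-256 from the standard library only as a stand-in source of 256 bits; it is not the hash the checksum uses. Both function names are hypothetical. The 1st, 5th, 9th, ... bits are the top bits of successive hex digits, so testing them is the same as asking whether each hex digit of the hash is 8 or more.

```python
import hashlib


def flags_by_bit(v: int) -> list:
    """Test bits 255, 251, 247, ... of a 256-bit integer (every fourth bit)."""
    return [bool(v & (1 << (255 - 4 * i))) for i in range(40)]


def flags_by_hex_digit(digest_hex: str) -> list:
    """Test whether each of the first 40 hex digits is >= 8."""
    return [int(digest_hex[i], 16) >= 8 for i in range(40)]


# Stand-in digest: any 256-bit value works for this comparison.
digest = hashlib.sha256(b"any input").digest()
v = int.from_bytes(digest, "big")
assert flags_by_bit(v) == flags_by_hex_digit(digest.hex())
```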
Here is @vbuterin's original implementation updated with these changes. It passes @alexvandesande's test vectors:
from ethereum import utils

def checksum_encode2(addr): # Takes a 20-byte binary address as input
    o = ''
    v = utils.big_endian_to_int(utils.sha3(addr.hex()))
    for i, c in enumerate(addr.hex()):
        if c in '0123456789':
            o += c
        else:
            o += c.upper() if (v & (2**(255 - 4*i))) else c.lower()
    return '0x'+o

def test(addrstr):
    assert(addrstr == checksum_encode2(bytes.fromhex(addrstr[2:])))

test('0x5aAeb6053F3E94C9b9A09f33669435E7Ef1BeAed')
test('0xfB6916095ca1df60bB79Ce92cE3Ea74c37c5d359')
test('0xdbF03B407c01E7cD3CBea99509d93f8DDDC8C6FB')
test('0xD1220A0cf47c7B9Be7A2E6BA89F429762e7b9aDb')
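For readers without pyethereum installed, the following is a standard-library-only sketch of the same scheme. It assumes the hash is Keccak-256 (the original Keccak padding, suffix byte 0x01) applied to the lowercase hex string without the `0x` prefix. The names `keccak_like_256` and `to_checksum_address` are illustrative choices for this sketch, and the sponge code is a compact, unoptimised rendering of Keccak-f[1600], not code taken from any library in this thread. Passing the suffix 0x06 instead gives SHA3-256, which is why the two families disagree on the same input.

```python
import hashlib

MASK64 = (1 << 64) - 1


def _rol64(a, n):
    n %= 64
    return ((a << n) | (a >> (64 - n))) & MASK64


def _keccak_f1600(lanes):
    """Apply the 24-round Keccak-f[1600] permutation to 5x5 64-bit lanes."""
    r = 1
    for _ in range(24):
        # theta
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4] for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # chi
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: round constants generated by an 8-bit LFSR
        for j in range(7):
            r = ((r << 1) ^ ((r >> 7) * 0x71)) % 256
            if r & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def _permute(state):
    """Run the permutation on a 200-byte state (little-endian lanes)."""
    lanes = [[int.from_bytes(state[8 * (x + 5 * y): 8 * (x + 5 * y) + 8], "little")
              for y in range(5)] for x in range(5)]
    lanes = _keccak_f1600(lanes)
    out = bytearray(200)
    for x in range(5):
        for y in range(5):
            out[8 * (x + 5 * y): 8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")
    return out


def keccak_like_256(data: bytes, suffix: int) -> bytes:
    """256-bit sponge hash: suffix 0x01 is Keccak-256, suffix 0x06 is SHA3-256."""
    rate = 136  # bytes: 1088-bit rate, 512-bit capacity
    state = bytearray(200)
    offset = 0
    while len(data) - offset >= rate:  # absorb full blocks
        for i in range(rate):
            state[i] ^= data[offset + i]
        state = _permute(state)
        offset += rate
    tail = data[offset:]  # absorb the final partial block, then pad
    for i, b in enumerate(tail):
        state[i] ^= b
    state[len(tail)] ^= suffix
    state[rate - 1] ^= 0x80
    return bytes(_permute(state)[:32])


def to_checksum_address(address: str) -> str:
    """Mixed-case checksum encoding of a 40-hex-digit address (with or without 0x)."""
    hex_addr = (address[2:] if address[:2].lower() == "0x" else address).lower()
    digest_hex = keccak_like_256(hex_addr.encode("ascii"), 0x01).hex()
    return "0x" + "".join(
        ch.upper() if ch in "abcdef" and int(digest_hex[i], 16) >= 8 else ch
        for i, ch in enumerate(hex_addr)
    )


# Cross-check the sponge against the standard library's SHA3-256 (suffix 0x06).
assert keccak_like_256(b"qwerty", 0x06) == hashlib.sha3_256(b"qwerty").digest()
print(to_checksum_address("0x5aaeb6053f3e94c9b9a09f33669435e7ef1beaed"))
```

A checker can then compare `to_checksum_address(s)` with `s` itself, after deciding how to treat all-lowercase or all-uppercase input, which carries no checksum information.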
Is there a valid, latest go implementation of this that you could recommend?
Could someone please finally specify the Hash algorithm used to hash the address and get the bits from?
There are at least 3 different hashes mentioned and even used in various implementations.
My understanding is that the correct hash is supposed to be SHA3-256, but it seems some implementations are using SHA3-224 and others use Keccak-256 and Keccak-224.
I am curious: is there a Java implementation of this?
@almindor
You'll find the correct specification and example implementations at the file here: https://github.com/ethereum/EIPs/blob/master/EIPS/eip-55.md. The file also includes an adoption table to help track the adoption of EIP-55 checksums in the ecosystem.
We're going to close this issue now. If any corrections need to be made (or to update the adoption table), please open a PR on the file.
You should edit the example code and test vectors in the first post. It is wrong and someone who does not read the whole conversation will use the incorrect implementation.
@cdetrio can you push the "official test suite" into the EIP?
I believe it is this one: https://github.com/ethereum/eips/issues/55#issuecomment-187765837
Java checker of ethereum address https://gist.github.com/adyliu/6c5ff4d41aa0177da55f4b8b1703f54a
Current python3 eth-utils implementation
python3 -c "from eth_utils import address; import sys; print(address.to_checksum_address(sys.argv[1]));" 0x5aaeb6053f3e94c9b9a09f33669435e7ef1beaed
Output is
0x5aAeb6053F3E94C9b9A09f33669435E7Ef1BeAed
Thanks
EDITOR UPDATE (2017-08-24): This EIP is now located at https://eips.ethereum.org/EIPS/eip-55. Please go there for the correct specification. The text below may be incorrect or outdated, and is not maintained.
Code:
def checksum_encode(addr): # Takes a 20-byte binary address as input
    o = ''
    v = utils.big_endian_to_int(utils.sha3(addr))
    for i, c in enumerate(addr.encode('hex')):
        if c in '0123456789':
            o += c
        else:
            o += c.upper() if (v & (2**(255 - i))) else c.lower()
    return '0x'+o
In English, convert the address to hex, but if the ith digit is a letter (i.e. it's one of abcdef), print it in uppercase if the ith bit of the hash of the address (in binary form) is 1; otherwise print it in lowercase.
Benefits:
- Backwards compatible with many hex parsers that accept mixed case, allowing it to be easily introduced over time
- Keeps the length at 40 characters
- ~The average address will have 60 check bits, and less than 1 in 1 million addresses will have less than 32 check bits; this is stronger performance than nearly all other check schemes. Note that the very tiny chance that a given address will have very few check bits is dwarfed by the chance in any scheme that a bad address will randomly pass a check~
UPDATE: I was actually wrong in my math above. I forgot that the check bits are per-hex-character, not per-bit (facepalm). On average there will be 15 check bits per address, and the net probability that a randomly generated address if mistyped will accidentally pass a check is 0.0247%. This is a ~50x improvement over ICAP, but not as good as a 4-byte check code.
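One way to arrive at these corrected figures, offered as a sketch rather than the author's own derivation: assume each of the 40 hex digits is uniformly random, so a digit is a letter with probability 6/16, and assume each letter carries one independent check bit that a random address matches with probability 1/2. The variable names are illustrative.

```python
from fractions import Fraction

HEX_DIGITS = 40
P_LETTER = Fraction(6, 16)  # a..f out of 0..f

# Only letters can carry a case bit, so the expected number of check bits is:
expected_check_bits = HEX_DIGITS * P_LETTER

# A random digit "passes" if it is a numeral, or a letter whose case happens to match.
p_digit_ok = (1 - P_LETTER) + P_LETTER * Fraction(1, 2)  # 13/16
p_random_pass = p_digit_ok ** HEX_DIGITS

print(expected_check_bits)             # 15
print(f"{float(p_random_pass):.4%}")   # 0.0247%
```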
Examples:
0xCd2a3d9f938e13Cd947eC05ABC7fe734df8DD826 (the "cow" address)
0x9Ca0e998dF92c5351cEcbBb6Dba82Ac2266f7e0C
0xcB16D0E54450Cdd2368476E762B09D147972b637