ifdefelse / ProgPOW

A Programmatic Proof-of-Work for Ethash. Forked from https://github.com/ethereum-mining/ethminer
GNU General Public License v3.0
257 stars 84 forks source link

64 bit seed exploits ASIC resistance #51

Open kik opened 4 years ago

kik commented 4 years ago

uint64_t is too small.

see detail https://github.com/kik/progpow-exploit

OhGodAGirl commented 4 years ago

Hi @kik - thank you for this! Could you clarify in your post the number of keccak items required to produce one (theoretical) ProgPoW hash, and the time to produce them? The feasibility in a <14 second block time environment seems low.

We will adjust to uint128_t regardless. Good find!

SChernykh commented 4 years ago

I don't think this attack is very practical. First you need to solve keccak_progpow_256(header_hash, seed, mix_hash) <= boundary and only then do 2^64 keccak_progpow_64 calculations before the next block is mined on the network. Doing 2^64 keccak hashes in 10 seconds? That'll require huge ASIC farm, and I mean really huge.

Edit: plus, since nonce is also 64 bit, you have only ~63% chance that some nonce will give you matching seed. Even with enough computational power, this attack works in only 63% of cases.

AndreaLanfranchi commented 4 years ago

This issue requires a custom implementation of node generating block templates. Current node implementations require the miner to hash the provided header_hash and not tamper with it. In fact the result of the test is :

Running main() from C:\.hunter\_Base\18e57a4\f8ee682\2e6dac8\Build\GTest\Source\googletest\src\gtest_main.cc
Note: Google Test filter = asic.search
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from asic
[ RUN      ] asic.search
Start nonce 0 Iterations 1024
nonce       = 870
extra nonce = 6992213
Initial header 0xf4ac202715ded4136e72887c39e63a4738331c57fd9eb79f6ec421c281aa8743
Final   header 0x5ff761064cef9cb24192ca623885802b691da1f26751a8c7fa4e100213b4a17b
[       OK ] asic.search (36801 ms)
[----------] 1 test from asic (36803 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (36809 ms total)
[  PASSED  ] 1 test.

where a solution is found but on a completely different header_hash So without a node providing the full block template and a change in stratum/getwork protocol so the miner can send back both the nonce and extraData (so the node can validate the header_hash) this, at this very moment, not an active issue.

It's possible though that an "attacker" runs his own custom node.

There is another point though which is easy to amend in PP: Your issue relies on the assumption (correct at the moment) that the last keccak round is function of constant header_hash, mix_hash and seed.

This allows you to do the following:

This can be voided in two ways:

kik commented 4 years ago

We will adjust to uint128_t regardless. Good find!

Why are you seeking lower bound? In this issue, birthday attack is impossible. But if it is discovered, 128bit hash value provides only 64bit security. Use 256 bit.

OhGodAGirl commented 4 years ago

@kik

With 128-bit, we have 2 vs 4 registers taken from the main loop, which forces spilling registers to memory, which in turn results in a performance cliff. In a 256-bit environment, it's worse.

512-bit was dropped to 64-bit to reduce register pressure, hence why hash_mix has PROGPOW_REGS set to 32, where 32 registers are used as part of the random sequence of math. With Ethash, 16 registers were used to store the seed across the hash_mix call.

This is wasteful on your register file - one of the most power and area intense parts of a CPU/GPU core. An ASIC could then store the 512-bit seed on the side in a much more area/power efficient manner - reducing to 64-bits reduces that advantage.

The proposed fix we are suggesting would be to have hash_mix consume all 256 bits of the seed produced by the initial keccak, while having the final keccak consume only 64 bits of the seed. That keeps 256 bits of security, but does not hammer the file register.

There is a trade off to every single step in hardware, and we are careful to ensure there is a balance between what an ASIC can trivially store (and perform) and what a GPU can do. Thus this approach.

What are your thoughts?

@AndreaLanfranchi is also playing with some alternate solutions that reduce register pressure.

mcminer-thomas commented 4 years ago

a) To find a target in 2^64 , even linear searching , is a very difficult thing already , if using the 4000 cores to build 1000 thread with 1Ghz, long time.

b) Mining Pool can't degree this difficulty because it can not get proven of work for every dedicate miners. So it looks it can not used to do the ASIC mining. c) Can it been used to do the attack? It seems that they must have every good "lucky" to do it continues with very huge cost.

So should it be changed? Agree with @SChernykh , different in detail ....

OhGodAGirl commented 4 years ago

@mcminer-thomas It is trivial to adjust this regardless.

Both @AndreaLanfranchi as well as the IfDefElse team have alternate ways to approach this, but in different methods. Both need to be profiled to ensure there is not register spillage or unnecessary waste.

jean-m-cyr commented 4 years ago

@OhGodAGirl Your mitigation (PR #53) only forces register spills on Nvidia SM_30 architecture, which is hardly relevant these days. SMs > 30 do not spill.

AndreaLanfranchi commented 3 years ago

TL;DR; this issue (as is) is not an issue at all

@kik's implementation of the exploit relies on an assumption which is not true for Ethereum: the possibility to create variants of the candidate block's header hash in a fashion similar to bitcoin. This is actually not possible in ethereum and even if supported by a customized node the block propagation of such mined blocks would be immediately blocked by other peers as the header hash in invalid.

Kik's assumption originates (probably) from the presence in block header (the full one, not the hash) of a field named extraData which is actualy a string often used to store vaniity messages (see here or here) or to stamp some signal useful for tracking mined blocks in order to activate attacks (see recent analysis about etc 51% attacks by @OhGodAGirl). The implementation of the exploit assumes this field can be used to store an extraNonce (ethereum's block header does not have an extraNonce field so there's no other place where to store it) which behaves as a variant indicator of the header hash pretty much like, in bitcoin mining, the extraNonce signals a variant of the merkle root. What has been overlooked is that extraData field in ethereum is actually part of the header hash, while in bitcoin extraNonce field is part of the merkle tree hash with the specific meaning of "rebuilt the tree hash with this nonce".

For a detail explanation of the exploit workflow please refer to Kik's documentation

Where the whole reasoning fails is here:

    for (uint64_t extra_nonce = start_extra_nonce; extra_nonce < end_extra_nonce; ++extra_nonce)
    {
       // >>> Begin of the misunderstanding
        const block_header new_header = {header.parent, extra_nonce};
        const hash256 header_hash = ethash_keccak256((const uint8_t *)&new_header, sizeof new_header);
        const hash256 final_hash = keccak_progpow_256(header_hash, seed, mix_hash);
       // <<< End of the misunderstanding

        if (is_less_or_equal(final_hash, boundary))
        {
            const uint64_t end_nonce = start_nonce + iterations;
            for (uint64_t nonce = start_nonce; nonce < end_nonce; ++nonce)
            {
                const uint64_t result_seed = keccak_progpow_64(header_hash, nonce) >> TRUNCATE;
                if (seed == result_seed)
                    return {{final_hash, mix_hash}, nonce, extra_nonce};
            }
        }
    }

As you can see from the above code the attempt is to create variants of the header hash starting from the original header hash, received from work provider (named header.parent), appending to it 64 bits of the extraNonce. The new input of 320 bits is Keccak'ed to return a new 256 bit variant of the header hash. So basically the new header hash is a Keccak hash of the original header hash.

Now ... a customized node would be supposed to store the extraNonce in the extraData field and "seal" the block using the nonce and extraNonce received.

What is missing here is that for ethereum :

header_hash = ethash_keccak256(rlp::encode( parentHash, beneficiary,  difficulty,
             gasLimit, ommersHash, bloom, number, gasUsed, timestamp, extraData,
             stateRoot, txRoot, receiptsRoot))

By consequence either of this may happen :

The way this exploit has been described does not make it impractical ... it's literally impossible !!

This said the only possibility to bypass DAG accesses using a pattern similar to the one described by @kik is to modify these two steps :

        const block_header new_header = {header.parent, extra_nonce};
        const hash256 header_hash = ethash_keccak256((const uint8_t *)&new_header, sizeof new_header);

to become as

        const uint8_t* new_header_rlp = rlp::encode(<all_header_members_with_new_extraData>)
        const hash256 header_hash = ethash_keccak256(new_header_rlp, sizeof(new_header_rlp));

but this adds so much overhead that any ASIC friendlyness is completely voided. Worth to mention that spec 0.9.4. of ProgPoW blocks also such a remote (and totally unpractical) possibility.