SMIP-0011: GPU-POST LIB - PHASE II

avive commented 3 years ago

Motivation

Tortoise beacon VRF.

Algorithm Outline

Phase I: create the PoST initialization but tweak it so that for every scrypt output label before truncating the 256-bit output into ℓ ≤ 256 bits (and storing to disk these bits) we compare the full 256 bits against 256-bit constant parameter D and if this label is smaller than than D then we store the index of this label (if we already found label smaller than D and we already stored its index, then we simply overwrite it with the new index – I’m assuming that the index is 64-bit words and therefore the write operation is atomic, I don’t think that we need index whose size is more than 64-bit because even if we store 1 bit of the scrypt output per index the total storage would be more than 2 million terabytes).

Phase II: if we didn’t find any label smaller than D in phase1, then we continue with a “dry” run of the initialization that continue from the last index onwards but doesn’t write anything to disk (so there’s no need for it to truncate the scrypt outputs), and when it find a label smaller than D it stores its index and terminates.

API Design

Add 2 config options to scryptPositions() params: 1. Compute leafs. 2. Compute PoW. One of these config options should be turned on by api client.

enum {
    COMPUTE_LEAFS = 1 << 0,
    COMPUTE_POW = 1 << 1,
};

Motivation

API clients can just stop calling the method with the 2nd option set, after the POW was solved and in case the algorithm needs to progress to phase II (described above) then the api client will just call the method with the 2nd config option on as label computation is not needed.

Add these new method params:

uint64[4] D, // Target D for the POW computation. 256 bits.
uint32_t options, // compute leafs and or compute pow
uint64_t idx_solution, // index of output where output < D if POW compute was on

Implementation
- If option 1 is on then a compute iteration (an api method execution on the gpu) computes the labels and stores them in the output buffer (the current lib functionality).
- If option 2 is on, then the comparison with D is executed when a has for an index is computed, and an index that solves the POW is outputted by the API. (idx_solution method output param).
API method after the changes above:

int scryptPositions(
    const uint8_t *id,      // 32 bytes
    uint64_t start_position // e.g. 0
    uint64_t end_position,  // e.g. 49,999
    uint32_t hash_len_bits, // (1...256) for each hash output, the number of prefix BITS to copy into the buffer
    const uint8_t *salt,    // 32 bytes
    uint8_t *out,           // memory buffer large enough to include hash_len_bits * number of requested hashes
    uint32_t N,             // scrypt N
    uint32_t R,             // scrypt r
    uint32_t P,             // scrypt p

    uint64[4] D,                           // Target D for the POW computation. 256 bits.
    uint32_t options,           // compute leafs and or compute pow
    uint64_t *idx_solution          // index of output where output < D if POW compute was on. MAX_UINT64 otherwise.
);

There is no need to provide memory buffer via out param when only the POW option is set as leaves will not be written to it in this execution mode.

API usage pattern

Start calling scryptPositions() with both options set so both leaf computation and D comparison are made.
If a solution is found before all leaves were computed in an iteration then store idx_solution and unset the POW option from future calls to scryptPositions(). 3.if all leaves computed and a solution was not found then continue calling with larger indexes with only POW option set until a solution is found.

Implementation Notes

idx_solution doesn't need to be locked because we only care if there was one solution or not, so it is okay if it is overwritten in case that 2 solutions are found in an execution of the api method.
In the case that only COMPUTE_POW flag is set (execution is only looking for the PoW solution and not creating any output labels), the execution should stop and return the result as soon as it is found.

Open Issues

Is it sufficient to have end_position by uint64_t in the case that only COMPUTE_POW flag is set or do we need to support uint64_t[4] sized indexes?

moshababo commented 3 years ago

Is it sufficient to have end_position by uint64_t in the case that only COMPUTE_POW flag is set or do we need to support uint64_t[4] sized indexes?

The 64-bit position range was compliant with the max supported number of labels. To support 256-bit difficulty param, will even uint64_t[4] be enough?

avive commented 3 years ago

I think it will unless I'm missing something obvious here as uint64 = 8 bytes => uint64[4] = 32 bytes = 256 bits

moshababo commented 3 years ago

I was referring to the probabilistic nature of PoW, and that in the current API design there won't be a way to introduce hash preimage variants besides by incrementing the position. I don't think we'll have an issue with 256-bit range (or less), but I was still wondering about it.

avive commented 3 years ago

Doesn't the 32 bytes id input introduce a preimage variant?

moshababo commented 3 years ago

Yes but the user commits to a specific id when starting the PoST initialization so he cannot change that for the PoW part if he runs out of positions/indices.

avive commented 3 years ago

Right, he should keep using the same id for the PoW part - I think this is by design like that. So what's the issue about this?

avive commented 3 years ago

The smip has been numbered and this goes into development in few days.

avive commented 3 years ago

Phase III implementation is complete. https://github.com/spacemeshos/gpu-post/releases/tag/v0.1.19

avive commented 2 years ago

@noamnelke

countvonzero commented 2 years ago

@avive what is the reason to reopen this SMIP? is it in response to https://community.spacemesh.io/t/possible-dos-against-pops-vrf-beacon-and-mitigation/210 ?

countvonzero commented 2 years ago

quoting Tal in slack. in the context of https://community.spacemesh.io/t/possible-dos-against-pops-vrf-beacon-and-mitigation/210

Regarding Aviv's library, I don't think it will work for nonce generation, since the parameter regime 
it's focused on is a difficulty of less than 1 (i.e., "solving" the PoW is just a single hash invocation, 
of which we store part of the output.).

spacemeshos / SMIPS