Closed porcuquine closed 5 years ago
Hi @porcuquine, I could be wrong, but I have a couple of comments:
If HVHPoST is the construction I have in mind, I think there is an error in its description here: namely point c) does not happen: a) hashes the leaves of the current challenge set b) performs a VDF on this hash c) hashes the result d) derives a new set of challenges with a hash-per-challenge
Indeed, what happens is: i) Hash leaves and obtain h ii) Feed the VDF with h ovtaining v <-- VDF(h) iii) Obtain challenges c_1,...,c_n as c_i <-- H(v || i)
I can be missing something, but...cant we put the Challenge derivation outside the circuit? in order to derive challenges one only needs a VDF outputs v. Then it is enough to compute c_j = H(v || j) for each j. Moreover, the verification that c_j is well formed can be done by anybody (even locally, right?).
I don't know if this comment actually generates more confusion. I hope not, but of course we can have further discussion!
Thanks, @lucaniz. I'll try to address your comments. It's completely possible I'm misunderstanding some details since I've only just begun working on this recently.
Regarding your point 1, as I read it the substantive difference between my list and yours is step c) hashes the result. As far as I can tell, the other points are just differences in phrasing.
I believe step c) really does happen in the current implementation at least. Here's the relevant code from src/hvh_post.rs
, starting at line 165:
let (y, vdf_proof) = V::eval(&pub_params.pub_params_vdf, &x)?;
proofs_vdf.push(vdf_proof);
ys.push(y);
let r = H::Function::hash_single_node(&y);
The first line above is step b), and the last line is step c).
Regarding your point 2), there is no problem with generating all the challenges outside the circuit for the prover. For the verifier to do the same, one of these would have to be true:
epoch_times
outside the circuit.The problem with 1. is that it sacrifices fast verification.
The problem with 2. is that it makes the circuit inputs large, which also means that all the VDF outputs have to be stored somewhere (on chain?) and included as part of the proof.
Maybe there is another option, but my original method above was intended to get around these problems. It does this by separating challenge-generation into two phases, each of which can be efficiently verified either in or out of the circuit as appropriate. This allows us to only provide all VDF results as private inputs to the circuit, since they are used to verify the challenge mixing but are not required for the 'partial challenge' generation. That way, partial challenge generation can be performed/verified outside the circuit.
Hey @porcuquine, thanks for your comments! Wrt my comment 1: you got the point! I was just referring to point c) as the "questionable" one. all the rest was rephrasing. Sorry if it was not clear.
Wrt my point 2: i was actually thinking to what you describe as point 2., namely
"2. we include all VDF results in the public inputs provided to the verifier."
But it's true that now circuit input is large. I'll double check your proposal tomorrow and I hope to get back to you with useful comments! :)
Hi @porcuquine, I got a bit confused by challenge mixing paragraph. I'll try to summarize what I got and what I'm confused about and we can then go over it together.
partial_challenges
(pc
from now on) are built using a seed s
and hashing with and integer i
. So pc_i = H(s || i)
mixed_challenges
(mc
from now on) are the sum of a partial challenge and a VDF output v
. Namely, mc_i = pc_i + v
final challenge
(fc
from now on) is the modular reduction of a mixed challenge (mod the number of challenges, which should be your challenge_count
and, if I'm right, is 200 in real life for now). So, fc_i = mc_i % challenge_count
.If all above is correct (which implies I got right, so please double check), now I do not understand what challenge_r
is. You refer to it as a reminder, but the only reminder I see here is fc_i
, which is the result of the modular reduction. Moreover, you say that challenge_r * challenge_count
should be equal to mixed_challenge
.
But if so, since fc_i = mc_i % challenge_count
we should have fc_i = (challenge_r * challenge_count) % challenge_count
, but (challenge_r * challenge_count) % challenge_count
is 0 (if % is the modular operation).
I'm definitely missing something but I don't get what...happy to have a call if needed.
Moreover, you say that:
inside the circuit we assert the constraint that
challenge_r * challenge_count == mixed_challenge
wheremixed_challenge = partial_challenge + vdf_output
. In other words, if we could safely express this, we would need only one constraint per generated challenge: challenge_r * challenge_count - vdf_output == mixed_challenge.
I think there is a typing error here, since it should be challenge_r * challenge_count - vdf_output == partial_challenge
. Right?
Thanks, @lucaniz. The confusion is probably because I first describe a method that won't work — then describe a replacement without going into perfect detail about which changes are needed. I think I can explain here, though, and happy to have a call if that helps.
partial_challenges (pc from now on) are built using a seed s and hashing with and integer i. So pc_i = H(s || i)
This is correct.
mixed_challenges (mc from now on) are the sum of a partial challenge and a VDF output v. Namely, mc_i = pc_i + v
This is correct. However, to simplify verification (as you note below), we may want to define this as the difference (mc_i = pc_i - v
) instead.
final challenge (fc from now on) is the modular reduction of a mixed challenge (mod the number of challenges, which should be your challenge_count and, if I'm right, is 200 in real life for now). So, fc_i = mc_i % challenge_count.
This was a straw man, which I later claim is hard to support in the circuit.
I then describe a replacement method and provide the constraint for checking it — but I don't explicitly describe how to generate the values being checked. That should be something like,
Assuming 32-bit challenges and a ~256-bit field, we can derive four challenges from each
mixed_challenge
. Outside the circuit, we can do this by splittingmixed_challenge
intochallenge_bits
-bit chunks. (In this case,challenge_bits
is 32.) If we treatmixed_challenge
as a little-endian byte array, this yields
c1 = mixed_challenge[0..challenge_bits]
c2 = mixed_challenge[challenge_bits..2*challenge_bits]
cn-1 = mixed_challenge[(n-1)*challenge_bits..n*challenge_bits]
Now my final edit (hopefully) makes sense. We check derivation of final challenges (c1
, c2
, c3
, c4
) like so:
partial_challenge = (c1 + vdf_output) + (2 * c2) + (4 * c3) + (8 * c4)
Does that help?
Got it, thanks @porcuquine ! So actually the most important part is in the EDIT paragraph, right?
Yes. I thought the rest provided clearer background and didn't want to remove it completely — so I just slapped the conclusion in but tried to set it off. If we can find a way to make this work, we'll write it into the spec cleanly without the buildup.
Updating to mention that we discussed this at the proofs sync today. Consensus is that the idea here is viable so we are going to move forward with implementing. If problems are discovered along the way, we will deal with them then.
NOTE: my original description used the wrong coefficients for the effective bit-shifting when packing challenges. I've corrected it above and include the original here for comparison.
Was: partial_challenge = (c1 + vdf_output) + (2 * c2) + (4 * c3) + (8 * c4)
Should be: partial_challenge = (c1 + vdf_output) + (2^8 * c2) + (2^16 * c3) + (2^24 * c4)
One more — my original example, miscalculate 256 / 32. In fact we get 8 rather than 4 challenges:
partial_challenge = (c1 + vdf_output) + (2^8 * c2) + (2^16 * c3) + (2^24 * c4) + (2^32 * c5) + (2^40 + c6) + (2^48 + c7) + (2^56 * c8)
Sorry for the initial mistakes and flow of updates — trying to keep this somewhat accurate as a reference while I implement.
Update: this is implemented in https://github.com/filecoin-project/rust-proofs/pull/436. Because of dev bandwidth and priorities, verification in circuit is not yet implemented. However, vanilla verification is provided and tested as evidence that the algorithm is accurate.
Started speccing this out here: https://github.com/filecoin-project/specs/blob/update-post/drafts/proof-of-spacetime.md
Linking challenge sampling and seed biasing considerations to this issue
this is only relevant for VDF post, closing this for now.
Problem
Challenge-generation for PoSt is expensive in circuits.
HVHPoSt does the following
post_epochs
times:This means that, in addition to verifying the VDF, we have to verify
post_epochs * (challenge_count + 2)
hashes in the circuit.Proposal
Split challenge generation into an outside-circuit pre-computation provided as circuit public inputs (derived from overall public inputs) — and an inside-circuit computation which mixes in VDF results.
The pre-computation allows us to ensure appropriate random distribution of challenges. The inside-circuit phase forces challenges to depend on VDF results.
Method
Outside of Circuit
Define a suitable challenge-derivation function to run outside of circuits. The output of this function will be a vector of 'partial challenges'.
This can work however we want. For examples, input can be a random seed from the chain, a number of challenges, and a description of the sectors to be challenged. For the sake of simplicity, consider the case of a single sector. Challenge derivation can be as it is for ZigZag currently. The nonce (
replica_id
for ZigZag — chain randomness for PoSt) is concatenated with an incrementing integer, hashed, then modded into the desired range. Since we needpost_epochs * challenges
total challenges altogether, we request and generate that many 'partial challenges'.[As below, if we can extract multiple final challenges from a single partial challenge, then generate fewer partial challenges by that factor.]
Inside Circuit
Instead of the previous construction, we eliminate the hash-per-challenge and one of the H in HVH.
Set
partial_challenges
to the result of the precomputation.For each of
post_epochs
verify that these steps were performed outside of circuit:vdf_input
from challenges + data. [Requires verifying one hash.]VDF(vdf_input) -> vdf_output
. [Requires cheap verification of one VDF.]current_challenges
to be the nextchallenge_count
challenges frompartial_challenges
.vdf_output
intocurrent_challenges
to cheaply derive the actual challenges. [See description below]Challenge mixing
We want to force challenges to depend on VDF output while preserving the well-distributed randomness provided by the pre-computed partial challenges. Conceptually, we could do this with
XOR (partial_challenge, vdf_output) % challenge_range
.However, since XOR is not cheap inside of circuits, we can instead use addition (as we do in ZigZag when
sloth_iter = 0
, justified by the same logic) to getmixed_challenge
. Outside of circuits, we computemixed_challenge % challenge_count == final_challenge
and retain the remainderchallenge_r
as an auxiliary input.Inside the circuit, we assert the constraint that
challenge_r * challenge_count == mixed_challenge
wheremixed_challenge = partial_challenge + vdf_output
.In other words, if we could safely express this, we would need only one constraint per generated challenge:
challenge_r * challenge_count - vdf_output == mixed_challenge
.However, there is a subtlety. If we can freely choose
challenge_r
, we can exploit the modularity of the circuit field to verify the choice of anyfinal_challenge
. Therefore, we need to also prove thatchallenge_r <= R / challenge_range
(whereR
is the size of the circuit field).I am not sure if we can do that cheaply with a SNARK.
But all is not lost: since we intend to constrain sector sizes to be a power of 2, we don't need the generality of arbitrary mod operations. Instead, we can force each
final_challenge
to be in range by construction. We do this by extracting multiple challenges from a singlemixed_challenge
. Remember that challenges in circuits are expressed as bit-strings (auth_path
).Consider that if one challenge corresponds to a merkle proof with
auth_path
of 32 bits (4GiB sectors — for sake of example), andR
is 256 bits, then we can extract 8final challenges
from a singlemixed_challenge
.Call these 8 challenges
c1
,c2
,c3
,c4
,c5
, c6,
c7,
c8. Then we just need to prove that
mixed_challenge == c1 + (2^8 c2) + (2^16 c3) + (2^24 c4) + (2^32 c5) + (2^40 + c6) + (2^48 + c7) + (2^56 * c8)`.~That can be accomplished with only 3 constraints for all four challenges:~ [EDIT: I updated the above to correctly divide 256 / 32 into 8 challenges, not 4. I'm not going editing this historical reflection to reflect that.]
t1 = c3 + c4 * 2^8
~t2 = c2 + t1 * 2^8
~mixed_challenge = c1 + t2 * 2^8
~Therefore, in this example (uniform 4GiB) sectors, we have one constraint per challenge: 1 to compute
mixed_challenge
(or can we omit this and combine the addition?) plus 3 to prove the packing of individual challenges — for every 4 challenges.EDIT
Actually, can't we express all of this as one constraint, a single linear combination relating
c1
,c2
,c3
,c4
, andmixed_challenge
?Can we also compute the
mixed_challenge
in the same constraint and only specify this?We still have to verify
epoch_count
VDFs, but this is cheap by design; and we still have to performepoch_count
hashes to extractvdf_input
for each epoch.