SEAL Replica ID should also include a historical ticket

nicola commented 5 years ago

This enforces that the miner is dedicating storage to a particular part of the network

EDIT: Based on discussion below and in spec review, I am changing the title and am adding to the original description here. –@porcuquine

When sealing begins, a ticket should be included in the data hashed to generate the replica id.

This ticket should be from a block which is FINALITY rounds back (at seal start time).

When seal proofs are verified, it must be verified that the round from which the ticket was fetched is less than RECENCY rounds back (at verification time).

This implies that RECENCY must be greater than FINALITY (by the total allowable seal time, in rounds).

RECENCY and FINALITY are both integer constants whose values are yet to be determined.
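
A minimal sketch of the two rules above, for illustration only. The constant values are placeholders (they are still undetermined), and the function names are hypothetical rather than taken from any implementation.

```python
# Placeholder values: the issue leaves both constants undetermined.
FINALITY = 500   # rounds back from seal start at which the ticket is taken
RECENCY = 2000   # a ticket must be less than this many rounds old at verification


def ticket_round_for_seal(seal_start_round: int) -> int:
    """Round whose ticket is hashed into the replica id.

    Clamping to round 0 for very young chains is an assumption of this sketch.
    """
    return max(0, seal_start_round - FINALITY)


def ticket_is_recent(ticket_round: int, verification_round: int) -> bool:
    """True if the ticket round is less than RECENCY rounds back."""
    age = verification_round - ticket_round
    return 0 <= age < RECENCY
```

With these placeholders, a seal started at round 10,000 uses the ticket from round 9,500 and must be verified before round 11,500, which leaves fewer than RECENCY - FINALITY = 1,500 rounds for sealing and submission.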

NOTE: This change will require a change to the sealing process because, in general, a Filecoin node does not request sealing. Rather, sealing is triggered when adding a piece fills a sector. For this to work, we should pass the correct ticket (FINALITY rounds back) whenever a piece is added. Then, whenever sealing is triggered, the most recent ticket can be used to generate the replica id. cc: @laser.

This requires changes in code by @porcuquine @laser

whyrusleeping commented 5 years ago

@nicola any update here?

nicola commented 5 years ago

@porcuquine should add this to the Filecoin Integration (and spec), let me know if it's unclear

porcuquine commented 5 years ago

I will add to the spec, then we will implement. Let me check some assumptions first:

This can and should be the same hashing method as used for the challenge seed (to PoSt).
This may sometimes require hashing multiple blocks (i.e. a parent set).
This can and should be the same type as proverID and minerID (other components of replicaID).
Specifically, this can be 31 bytes — the type we internally call FrSafe, since it is guaranteed to fit into one field element.
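
To make the assumptions above concrete, here is a rough sketch of mixing a 31-byte ticket into the replica id. The hash (BLAKE2s truncated to 31 bytes) is only a stand-in, and the concatenation order and function names are assumptions for illustration, not the proofs library's actual construction.

```python
import hashlib

FR_SAFE_BYTES = 31  # small enough to fit into one field element


def to_fr_safe(data: bytes) -> bytes:
    """Hash arbitrary bytes down to 31 bytes (stand-in hash, not the real one)."""
    return hashlib.blake2s(data, digest_size=FR_SAFE_BYTES).digest()


def replica_id(prover_id: bytes, miner_id: bytes, ticket: bytes) -> bytes:
    """Combine the components into a replica id (hypothetical layout)."""
    for name, part in (("prover_id", prover_id),
                       ("miner_id", miner_id),
                       ("ticket", ticket)):
        if len(part) != FR_SAFE_BYTES:
            raise ValueError(f"{name} must be {FR_SAFE_BYTES} bytes, got {len(part)}")
    return to_fr_safe(prover_id + miner_id + ticket)
```

Because every component has the same fixed width, plain concatenation is unambiguous, and changing the ticket changes the replica id.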

cc: @laser

nicola commented 5 years ago

there should be only one block (technically, we should not use the hash of the block but the winning ticket at the current epoch)

porcuquine commented 5 years ago

@laser Is there now a clear answer to the question of where this 'hash of block' chain randomness would come from and what its characteristics would be? I think we should make the spec change once that's the case, but not before.

nicola commented 5 years ago

Note: the spec should have this regardless of what is implemented today; otherwise it will mislead others reading the spec or interested in doing separate implementations.

pooja commented 5 years ago

@porcuquine @nicola Can you please update this issue with the latest?

porcuquine commented 5 years ago

@nicola @laser @ZenGround0 @sternhenri @whyrusleeping Does anyone know exactly what value is meant to be added to the replica ID? I have not been able to get this answer, but I suspect that one of you knows it.

If none of us know, I believe between us we can decide. Let's do that here, please.

porcuquine commented 5 years ago

Maybe this is the answer:

> there should be only one block (technically, we should not use the hash of the block but the winning ticket at the current epoch)

Is this well-defined? Does it mean the current epoch when sealing begins? If so, is there a restriction on how soon the sector must be committed? If not, what prevents generating replica IDs far in advance and using them much later? That is: how does that differ from using very old tickets to generate replica IDs?

nicola commented 5 years ago

My latest comment is correct: we should use the ticket of the winning blocks

porcuquine commented 5 years ago

Please read my questions again, or consider this:

What if we always add the ticket produced in the very first round?

If this is invalid, what check enforces that?

If it is valid, what benefit does it provide?

In order for this to be useful, it seems we would need to enforce a certain recency on the ticket. We would need to allow for tickets at least as old as required by the fastest possible seal. In order not to force everyone to perform the fastest possible seal, we would probably want to allow tickets as old as some slower but acceptable seal time. With very long seal times, there's the likelihood that some sealing processes will be interrupted and restarted, adding time. We probably want to account for other delays (like being offline when sealing completes).

Taken together, this suggests that ticket recency should be fairly relaxed (i.e. allow for tickets which are quite old relative to the time at which the sector is committed) — since the replica ID needs to have been constructed when sealing begins.

This also suggests that ticket recency needs to be a function of sector size (since it depends on sealing time).

If we were to add this, I think we would need to:

First, define a mechanism by which recency would be checked. For example, will we — at sector commitment time — scan the chain backward looking for a ticket? Or perhaps we will jump back to the oldest allowable ticket (for the committed sector's size) and scan forward.
Second, define how ticket recency is calculated as a function of sector size.

It's entirely possible that I'm missing the point of this plan, or that the implied parts I'm asking about have already been specified elsewhere. If so, please just point me to the relevant explanation or repeat it here.

@whyrusleeping Do you know how this is supposed to work?

whyrusleeping commented 5 years ago

If we're using randomness from the chain at all for mixing here, we should be using the same chain randomness we use for everything else: the smallest ticket at tipset X (or its hash).

As @porcuquine says: if we're using values from the chain at all, there needs to be some recency requirement or it's pointless (using the first randomness value all the time defeats the purpose). The smallest we can make the limit is PackTime + SealTime + SubmitTime. Likely, we want to add in some additional grace period value that is a significant portion of the sealing time itself.
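
As a worked example of that lower bound, assuming all times are measured in rounds. The function name and the default grace fraction of 0.5 are made up for illustration; no value has been agreed here.

```python
import math


def min_recency_rounds(pack_time: int, seal_time: int, submit_time: int,
                       grace_fraction: float = 0.5) -> int:
    """PackTime + SealTime + SubmitTime, plus a grace period sized
    as a fraction of the seal time (the fraction is an assumption)."""
    if min(pack_time, seal_time, submit_time) < 0 or grace_fraction < 0:
        raise ValueError("times and grace fraction must be non-negative")
    grace = math.ceil(seal_time * grace_fraction)
    return pack_time + seal_time + submit_time + grace
```

For instance, with 10 rounds to pack, 100 to seal, and 5 to submit, the limit comes to 165 rounds. Since seal time grows with sector size, so does the limit.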

porcuquine commented 5 years ago

Thanks, @whyrusleeping.

One follow-up question: given that there would need to be a significant lag between a block being mined and it being committed-to in the replica ID of a sector, does this address the problem it's meant to?

In other words, can we define the problem we're trying to solve and ensure the recency requirements make this useful under that model?

As an example, I assume (perhaps wrongly) that this requirement is in some way meant to address the risk of forks. ~If that's the case, and the recency requirement is greater than the finalization period (whether defined or 'pragmatic'), then it's no help.~ [EDIT: I don't really understand the implications of how this would interact with finality.]

Or maybe the point is that the recency requirement will help force finality, since miners will presumably never want to accept chains which invalidate their storage. Is that the idea? If not, maybe it should be. I see finality listed as an [open question](https://github.com/filecoin-project/specs/blob/61d312f545f4b4d7f3c65061024dfb470e8c1d8e/expected-consensus.md) so I am not sure what the latest thinking might be.

porcuquine commented 5 years ago

The more I think about this, the less I understand the idea.

Let's say I am a storage miner, and I begin mining. I commit to a chain (i.e. one of potential alternative forks) by adding a ticket to my replica ID.

After some time passes, it turns out that my guess was wrong, and the ticket to which I committed is in fact no longer part of the current best chain.

As a result, I wasted my time and CPU, as well as making deals I can't (yet) support, so my clients also suffer.

How does this help anyone?

I'm probably just not getting it, but I still don't yet understand what problem this solves — and whether it's worth this negative outcome.

sternhenri commented 5 years ago

Please be gentle with this: I may be jumping into something I don't fully get, with missing context. In case it helps @porcuquine, though I strongly defer to @nicola and @whyrusleeping on this one.

While I have little understanding of some of the context here (i.e. I could be way off base), here is some of what I gather:

We want SEALING to be strongly tied to a given chain, so it enforces some protocol security, preventing miners from flip-flopping across forks, or otherwise enabling nothing at stake (see https://github.com/filecoin-project/consensus/issues/30 for more on why this is a really cool aspect of FIL).
A big part of where this can be relaxed is with PoSts (which, unlike SEALs, can be generated easily).

So to me the tradeoff would be between sampling too far back (i.e. not enforcing much of a commitment to a given chain) and too close to the present (i.e. risking wasted SEALs for honest miners).

I agree with @whyrusleeping that we should use at least the same randomness as we do for consensus (assuming I've read him right), though we could argue for looking farther back in the case of SEALing (given a greater cost to being wrong, i.e. not just loss of block reward on expectation but waste of a resource and slashing). It would not make sense to look farther back than finality.

Beyond that, I don't see why including the hash of a block would be preferable to including a ticket here, though I see downsides to it (grinding) depending on the threat model for PoSts, which I don't have cached.

porcuquine commented 5 years ago

@nicola @whyrusleeping @sternhenri

I updated the issue and changed the title. Please review and see if this seems correct now.

@sternhenri Based on conversation with @whyrusleeping, I wrote that the ticket should be from exactly FINALITY rounds back. I think this is (just barely) consistent with your statement above that 'It would not make sense to look farther back than finality.' Are we all on the same page with these definitions?

whyrusleeping commented 5 years ago

@porcuquine more generally, you should select randomness from a block that is final, otherwise you risk having created an invalid sector.

Really, the point of all of this is to increase the cost of 'historical' forks, where someone goes back in time and tries to create a different chain that is heavier than the current real one. If sectors weren't tied to the chain, then any currently existing sector could be validly used in the attacker's fork (ignoring PoSt issues for a moment). By mixing in chain randomness here, we ensure that an attacker going back a month in time to try and create their own chain would have to completely regenerate any and all sectors they use for their fork's power.

porcuquine commented 5 years ago

@whyrusleeping Understood. In practical terms are you saying the ticket should be selected from FINALITY or greater blocks back? If not, do we have a more well-defined way to specify how the ticket should be selected?

sternhenri commented 5 years ago

Yes. That is what I meant.

whyrusleeping commented 5 years ago

@porcuquine yes, unless miners feel like betting on a chain (which, unless we have true finality, will be probabilistic anyway, and it's always 'betting').

Really, this comes down to something like Bitcoin's '6 block confirmations' thing, where you pick a heuristic of how sure you want to be.

porcuquine commented 5 years ago

Okay, so from what I'm hearing, the implementation should provide a value called FINALITY, but this doesn't necessarily have to be specified by the protocol. Miners can theoretically set it how they like. RECENCY, on the other hand, needs to be a protocol-wide constant because it affects proof validity of sector commitments.

I also realize there's a further wrinkle, which is that this check cannot be performed by the FPS — so it's not technically part of proof verification. Rather, it needs to be performed by the node before even verifying the proof. If the recency check fails, then the node shouldn't even bother trying to verify the proof because even a valid proof will be 'invalid' in context. Does that sound right?
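
Roughly what I have in mind, as a sketch only. `accept_sector_commitment` and the `verify_proof` callback are hypothetical names, and the RECENCY value is a placeholder.

```python
from typing import Callable

RECENCY = 2000  # placeholder; the real value is to be determined


def accept_sector_commitment(
    ticket_round: int,
    current_round: int,
    verify_proof: Callable[[], bool],
) -> bool:
    """Node-level check: reject stale tickets before touching the proof."""
    age = current_round - ticket_round
    if not 0 <= age < RECENCY:
        # Even a valid proof would be 'invalid' in context, so skip verification.
        return False
    return verify_proof()
```

The proof verifier is passed in as a callback so that the recency check stays outside the FPS and the expensive call is only made when the ticket is acceptable.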

From a code perspective, how do you think these values should be specified, given that one may be configurable and the other is to-be-determined? (It might make most sense for you to have this conversation with @laser, since I'm a bit removed from the go-filecoin code base.)

sternhenri commented 5 years ago

cc @sa8, @zenground0 re our conversations on posterior corruption.

@porcuquine, the EC proofs will provide guidance on finality. (Currently working on it with @sa8.)

porcuquine commented 5 years ago

The proofs spec now includes the need for a ticket when sealing and verifying seal.

The spec doesn't yet reflect the need to verify recency of that ticket outside of the FPS. @whyrusleeping where do you think that should go? (I think proofs.md is probably not the right place.)

@laser Can you create a dev issue that will bring seal and verify seal APIs up-to-date with the spec?

Once those two points are addressed, I think this issue can be closed.

laser commented 5 years ago

How will the prover (the creator of a commitSector message) and the verifier (some miner which received and processes the commitSector message from the network) agree on a block from which the ticket is plucked? Reading through this thread, it does not appear to be the case that the ticket will be included in the commitSector message.

For PoSt, challenge seed randomness is plucked from the block at a height which is equal to the height of the block which marks the start of miner's current proving period (minus lookback). If no block exists at that height, we use the genesis block. A miner's current proving period start-block-height is stored in the state tree - which makes it easier for both the prover and verifier to agree on which block to sample from.
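
The PoSt sampling rule just described can be sketched like this. The chain is modelled as a plain mapping from height to block, and all names are hypothetical; this is not go-filecoin code.

```python
from typing import Dict


def post_challenge_block(chain: Dict[int, str],
                         proving_period_start: int,
                         lookback: int) -> str:
    """Return the block to sample the PoSt challenge seed from.

    Falls back to the genesis block (height 0) when no block exists
    at the target height.
    """
    height = proving_period_start - lookback
    return chain.get(height, chain[0])
```

Both prover and verifier can evaluate this from on-chain state alone, which is exactly what is missing for the sealing case.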

When a miner starts sealing, however, they may not have started proving anything (and thus have no proving period start-block that the network agrees on). So, which block does the miner pluck a ticket from?

If we allow the storage miner to choose a ticket FINALITY blocks back from some arbitrarily-chosen block height, then I am not sure how a seal-verifier will be able to figure out which ticket to pass to verify_seal.

porcuquine commented 5 years ago

I think the miner is meant to choose the most recent known ticket when sealing. I do think this means the ticket's round (or the ticket itself — but round is probably more efficient both to store and to verify) will need to go into the commitSector message. Does that sound right, @whyrusleeping?

porcuquine commented 5 years ago

One more thought: we technically don't need to include anything. Since recency bounds the number of possible values, we could scan (using some sensible heuristic to minimize cost in the normal case) and attempt to verify with every valid ticket. Since verification is relatively cheap this could (in some universe) be worth the on-chain savings. That said, we aren't going to do this, and I mention only for completeness.

I spoke to @whyrusleeping, and he confirms that round number (not ticket) should be included in the commitSector message. cc: @laser
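
In sketch form, with a hypothetical message layout (the field names below are assumptions, not the actual commitSector parameters): the message carries only the round, and the verifier recovers the ticket from its own view of the chain.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class CommitSector:
    sector_id: int
    ticket_round: int  # round the ticket was taken from, not the ticket itself
    proof: bytes


def ticket_for_commit(msg: CommitSector,
                      tickets_by_round: Dict[int, bytes]) -> bytes:
    """Look up the ticket the prover must have used, from the verifier's chain."""
    if msg.ticket_round not in tickets_by_round:
        raise ValueError(f"no ticket known for round {msg.ticket_round}")
    return tickets_by_round[msg.ticket_round]
```

Sending the round rather than the ticket keeps the message small and means a prover cannot supply a ticket that is not on the verifier's chain.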

laser commented 5 years ago

@porcuquine

> I spoke to @whyrusleeping, and he confirms that round number (not ticket) should be included in the commitSector message. cc: @laser

Roger that. I will put up a spec-repo PR.

laser commented 5 years ago

After speaking with @sternhenri and @porcuquine, it is not clear to me from which round a miner should select a ticket for purposes of creating a replica ID (an input to seal).

Additionally, it is not clear to me how verification should work. The round number from which the miner plucked a ticket (to create a replica ID) is included by the miner in the commitSector message after sealing completes. A verifier presumably must reject commitSector messages whose round is outside of some range.

@sternhenri - Would you please provide some clarity? Specifically:

which round should a miner select?
how should verification (w/respect to round) work?

cc @dignifiedquire

sternhenri commented 5 years ago

Yes, as best I can tell (should be verified), your understanding of verification is correct. The protocol should specify the valid range for ticket plucking. Anything out of that range should be rejected.

I do think we would want to prevent miners from potentially losing valid SEALS because they plucked a ticket from a block that wasn't finalized, so I would add Finality F to this.

I'll add that there is no incentive for the miner to include a more recent ticket, only incentive to use an older ticket. An older ticket gives them more flexibility to pick subchains on which to PoSt thereafter, a newer one just makes it more likely they pick a non-final ticket.

Variables in the following explanation:

F -- Finality
X -- when miner starts SEALing
Z -- block height in which the SEAL appears
Y -- round in SEAL commitSector
T -- estimated time for SEAL
G -- necessary flexibility to account for network delay and SEAL-time variance

Specifically, in round X the miner starts working on a replica.

Miner draws min ticket from X - F

Due to potential variation in the time it takes to SEAL, we want to give some flex to miners (but not too much, as that would negatively impact security). Let's call that flexibility G, which should be correlated to variance in SEAL time across miners, with some padding for network-related delay (did the block get in on time, did the miner immediately submit their completed SEAL, etc.). The time to SEAL is T.

When verifier V receives a block with a SEAL in round Z, indicating it was made with a ticket from round Y (which may or may not be X - F, since the miner could lie), V should check:

Y within G of Z - T - F.
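
Reading "within G" as |Y - (Z - T - F)| <= G (that inclusive, symmetric reading is my assumption), the check is a one-liner. All quantities are in rounds and the function name is hypothetical.

```python
def seal_round_is_valid(y: int, z: int, t: int, f: int, g: int) -> bool:
    """Check that the claimed ticket round Y is within G of Z - T - F.

    y: round claimed in commitSector    z: round in which the SEAL appears
    t: estimated time to SEAL           f: finality
    g: allowed flexibility
    """
    expected = z - t - f
    return abs(y - expected) <= g
```

Example: a miner starts at X = 1000 with F = 100 and T = 300, so Y = 900. If the SEAL lands at Z = 1300, the expected round is exactly 900. With G = 50, landing at Z = 1340 still passes (off by 40), while Z = 1400 fails (off by 100).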

sternhenri commented 5 years ago

https://github.com/filecoin-project/specs/pull/512

filecoin-project / specs

SEAL Replica ID should also include a historical ticket #56