filecoin-project / rust-fil-proofs

Proofs for Filecoin in Rust
Other
489 stars 314 forks source link

Rationalize parents cache key derivation. #1192

Open porcuquine opened 4 years ago

porcuquine commented 4 years ago

Description

The parents cache key is currently derived from an explicit reference to the Feistel keys, as well as to other information which identifies the DRG (sector size, hash, base and expansion degrees, etc.).

However, the DRG seed is not explicitly included in the derivation — and it should be: if the DRG seed changes, the cache will be invalid.

In practice, this is still safe because both Feistel Keys and DRG seed are derived from porep_id. Therefore, if DRG seed changes, Feistel keys must also. The current mechanism is somewhat elaborate and involves reuse of the DRG's identifier() method — which is also used to construct the circuit identifier. As noted in a comment, the DRG seed should not be included there, because by design the DRG seed could be modified without a change to circuits.

One 'correct' solution would be to separate the identifier into two methods, a circuit_identifier and a cache_identifier (or some other semantically consistent names). The latter could depend on the former but enhance it with the DRG seed. A similar split would need to also happen for the composite expander+drg graph, which already depends on the extant identifier. To get this normalized right, might be somewhat tricky (since the new circuit_identifier must not depend on the DRG's cache_identifier, but the new cache_identifier must do so while also depending on its own corresponding circuit_identifier.

Alternately, the cache could just be made to depend directly on the porep_id. This is reasonable, since that value is used to uniquely determine a graph. Graphs must change if porep_id does and not otherwise.

Acceptance criteria

Risks + pitfalls

It might be easy to get this requirement wrong when refactoring the identifiers:

NOTE: since the expected outcome is a change to parents cache, all currently-deployed caches will be invalidated by the change.

Where to begin

dignifiedquire commented 3 years ago

@porcuquine is this still needed?

porcuquine commented 3 years ago

Technically, yes. If not, should at least put a reference-to/explanation-of the issue in the parents cache key derivation (cache_path) and a reminder that derive_porep_domain_seed must not change without potentially silently breaking caches.

hzsong123 commented 3 years ago

hello, my dear friend. I have no idea about the concept of DRG an DRG graphs. Which urls I can find them? thank you very much.

vmx commented 3 years ago

@hzsong123 you can find more information about it in the Filecoin specification.

cryptonemo commented 3 years ago

This appears to have been resolved here: https://github.com/filecoin-project/rust-fil-proofs/commit/332d0a9899e85640b53cedc7e8be5957c8f1e03e

And used here: https://github.com/filecoin-project/rust-fil-proofs/blob/master/storage-proofs-porep/src/stacked/vanilla/cache.rs#L414

@porcuquine If you agree, please close this issue. If not, the acceptable criteria is not clear.

porcuquine commented 3 years ago

Looking at your second link, I see that the parent cache key is derived from the Feistel keys only.

The relevant acceptance criterion was:

Parents cache key includes either both DRG Seed and Feistel keys (less good) or neither but does contain porep_id (better).

For this to be true, the cache_path function should be hashing the porep_id, not the Feistel keys (or — less good — additionally hashing the DRG seed).

Does that clarify the meaning?

cryptonemo commented 3 years ago

Looking at your second link, I see that the parent cache key is derived from the Feistel keys only.

The relevant acceptance criterion was:

Parents cache key includes either both DRG Seed and Feistel keys (less good) or neither but does contain porep_id (better).

For this to be true, the cache_path function should be hashing the porep_id, not the Feistel keys (or — less good — additionally hashing the DRG seed).

Does that clarify the meaning?

Not really. The feistel keys are derived from the porep_id, as is the drg_seed. It's not clear to me what's missing since the parent's cache_id is then derived from the feistel keys (we had an upgrade that proved this, no?)

The first link contains the following, but here are more direct links:

https://github.com/filecoin-project/rust-fil-proofs/blob/master/storage-proofs-porep/src/stacked/vanilla/graph.rs#L80 https://github.com/filecoin-project/rust-fil-proofs/blob/master/storage-proofs-core/src/crypto/mod.rs#L12 https://github.com/filecoin-project/rust-fil-proofs/blob/master/storage-proofs-core/src/drgraph.rs#L243

cryptonemo commented 3 years ago

Perhaps the confusion is that it appears correct, but without a comment on the derive_porep function, problems could potentially be introduced in the future if someone changed how that's hashed. If that's the case, I agree that a comment should suffice.

porcuquine commented 3 years ago

The feistel keys are derived from the porep_id, as is the drg_seed. It's not clear to me what's missing since the parent's cache_id is then derived from the feistel keys (we had an upgrade that proved this, no?)

Your observation is the same thing I meant by this line in the original issue:

In practice, this is still safe because both Feistel Keys and DRG seed are derived from porep_id.

In other words, this is incidentally correct because Feistel keys happen to depend on porep_id. However, this issue is about a fix which would capture the actual requirement directly.

I am not saying fixing this is high priority. I am just clarifying what the issue is about. An example of how this could matter is that with this implementation, there is an implicit requirement that the DRG seed never change independent of the Feistel keys. While I don't think it is likely that we will accidentally or intentionally violate this requirement in the future, it would be more correct to not need to track the implicit requirement and just change the cache key to rely directly on the porep_id — since the assigned semantics of that identifier make it precisely the one on which the cache key should depend. The graph for any given set of parameters changes if and only if the porep_id changes.

Changing the implementation would fix the root problem so it won't ever manifest. You can also try to add defensive comments or just ignore the issue as more trouble than its worth. I would most like to eventually see a real fix only because these kinds of correctness issues are almost impossible to keep track of. The best way to encode correctness/security requirements is directly in code. Otherwise, we rely on contextual knowledge of the whole system — whether retained in a few people's heads or scattered in comments, or even in the spec. Even though this particular issue seems unlikely to ever come up, the only defensive mechanism I personally have actual confidence in is implementing the code in a way that is 'precisely correct by construction'.

Not having done this in the first place wasn't a huge deal, but it is technical debt. Resolve as you see fit.