arkworks-rs / crypto-primitives

Interfaces and implementations of cryptographic primitives, along with R1CS constraints for them
https://www.arkworks.rs
Apache License 2.0
167 stars 79 forks source link

Potential refactoring of `TwoToOneCRH` #58

Closed weikengchen closed 3 years ago

weikengchen commented 3 years ago

Our current TwoToOneCRH takes left_hash_bytes and right_hash_bytes as input. As you can see, it has a strict requirement that both left and right must be bytes.

This is unsatisfactory for Poseidon, as evidenced by the fact that @drewstone and @filiplazovic have been working hard to bypass this limitation.

One potential solution is to make sure that TwoToOneCRH is merely a compressing CRH, and its inputs are of the same format as its output:

pub trait TwoToOneCRH {
    type Output: ToBytes
        + Clone
        + Eq
        + core::fmt::Debug
        + Hash
        + Default
        + CanonicalSerialize
        + CanonicalDeserialize;
    type Parameters: Clone + Default;

    fn setup<R: Rng>(r: &mut R) -> Result<Self::Parameters, Error>;

    /// Evaluates this CRH on the left and right inputs.
    fn evaluate(
        parameters: &Self::Parameters,
        left_input: &Self::Output,
        right_input: &Self::Output,
    ) -> Result<Self::Output, Error>;
}

This one will work, except for one other limitation that we previously discussed: this prohibits the use of more than one hash function, of different types. We refactor the Merkle tree all for this!

An example use case comes from Fractal, where ideally the bottom level of the Merkle tree is some sort of hash function that can be efficiently computed outside SNARK, e.g., Blake2b/s or another Poseidon.

So, we have two needs that are in conflict:

(1) provides a general framework such that different kinds of CRH can interact, e.g., making inputs [u8] and making outputs ToBytes.

(2) provides an efficient framework for CRH to not go through [u8] all the time.

Proposed solution 1: making TwoToOneCRH dual-interface

An alternative interface is to support both:

pub trait TwoToOneCRH {
    /// The bit size of the left input.
    const LEFT_INPUT_SIZE_BITS: usize;
    /// The bit size of the right input.
    const RIGHT_INPUT_SIZE_BITS: usize;

    type Output: ToBytes
        + Clone
        + Eq
        + core::fmt::Debug
        + Hash
        + Default
        + CanonicalSerialize
        + CanonicalDeserialize;
    type Parameters: Clone + Default;

    fn setup<R: Rng>(r: &mut R) -> Result<Self::Parameters, Error>;
    /// Evaluates this CRH on the left and right inputs.
    ///
    /// # Panics
    ///
    /// If `left_input.len() != Self::LEFT_INPUT_SIZE_BITS`, or if
    /// `right_input.len() != Self::RIGHT_INPUT_SIZE_BITS`, then this method panics.
    fn evaluate(
        parameters: &Self::Parameters,
        left_input: &[u8],
        right_input: &[u8],
    ) -> Result<Self::Output, Error>;

    /// Compress two CRH outputs.
    fn compress(
        parameters: &Self::Parameters,
        left_hash: &Self::Output,
        right_hash: &Self::Output,
    ) -> Result<Self::Output, Error>;
}

where compress is preferred except when touching the bottom layer.

This is still suboptimal, as there is still an unnecessary conversion between the leaf and non-leaf layers, even if they are both Poseidon.

Proposed solution 2: wrapping the LeafHash to match the TwoToOneCRH

We can provide a wrapper, such that when it wraps a LeafHash, it matches the LeafHash's output to the TwoToOneCRH's output.

Then, in the Merkle tree, we can require LeafHash's output = TwoToOneCRH's output, and if LeafHash is of a different format, we use a wrapper, like:

BytesToFieldElementWrapper<LeafHash>

If the LeafHash and TwoToOneCRH are the same, this wrapper is not needed.

Thoughts? @ValarDragon @tsunrise

weikengchen commented 3 years ago

A variant of proposed solution 2 is the wrap in a different way:

This would add a little bit more complexity, as we will need to specify the type of Input in CRH, which we haven't done before.

However, this would also allow more generality if in the future, we may treat the layer right above the leaves differently.

tsunrise commented 3 years ago

Just to confirm: For field-based CRH like Poseidon as two-to-one CRH:

For byte-based CRH as two-to-one CRH:

Is that correct? This idea sounds good to me

tsunrise commented 3 years ago

Do you want me to implement this? I have some spare time this summer @weikengchen

weikengchen commented 3 years ago

Please go ahead. This week is a little bit special in that I am very swamped.

tsunrise commented 3 years ago

Just realized this PR is also related to my BCS implementation. Will start implementing it this week

Pratyush commented 3 years ago

An alternative design is to have TwoToOneHash to have two kinds of inputs: "hash_bytes" and "hash_field_elements" (similar to what we have in the sponge API), and to constrain the Output type to implement ToBytesGadget and ToConstraintFieldGadget.

Additionally, you can have an an associated constant const SNARK_FRIENDLY: bool (we can decide a better name) which is set to true for arithmetization-friendly hashes like Poseidon. Then, the Merkle tree code can use this flag to decide whether to serialize to bytes or to field elements.

weikengchen commented 3 years ago

Yes, Tom and I have some internal discussion and we are likely going toward that direction (due to difficulty relating to enforcing the bound LeafCRH::Output == NonLeafCRH::Input in Rust)

tsunrise commented 3 years ago

That makes sense. I will try changing the existing design.

Writing wrapper for CRH constraints is a nightmare. I end up getting around 7 generics for one type and the fully associated syntax is a mess. Hopefully, using hash_bytes and hash_field_elements interface will address this problem.

Pratyush commented 3 years ago

Yeah I think we should try to reduce generics where possible (it's painful to write gadgets with them, as I'm sure all of you have experienced)