0xPolygonMiden / miden-vm

STARK-based virtual machine
MIT License

Meta data useful for LogUp-GKR #1386

Open Al-Kindi-0 opened 4 months ago

Al-Kindi-0 commented 4 months ago

As we approach a stable version of the LogUp-GKR implementation, we should start thinking about refining the interface between the STARK and GKR. For example, in the current implementation, we open the MLEs of all of the main trace columns when only a subset of these is currently needed in the LogUp relation of the virtual bus. Moreover, there is currently no mechanism to open the MLEs of shifted main trace columns (e.g., addr') or periodic columns (more generally, transparent/non-committed columns). A first attempt to address the above is to introduce a metadata structure that can be used to build:

  1. The query which is the argument to the final GKR composition polynomial.
  2. The s-column constraints, both boundary and transition.

The following is a skeleton of what the metadata object could look like:


pub struct GkrOracle<E: FieldElement> {
    pub oracles: Vec<Oracle<E>>,
}

#[derive(Debug)]
pub enum Oracle<E: FieldElement> {
    // represents a committed oracle i.e., a column with index `index` of the main trace
    Committed {
        index: usize,
    },
    // represents a shifted (up) committed oracle i.e., a column with index `index` of the main trace
    // that is shifted (up)
    CommittedNext {
        index: usize,
    },
    // represents a transparent oracle i.e., one that can be evaluated by the verifier un-assisted
    TransparentOracle {
        transparent_oracle: TransparentOracle<E>,
    },
}

#[derive(Debug)]
pub enum TransparentOracle<E: FieldElement> {
    EQ { x: Vec<E> },
    Periodic { period: usize, values: Vec<E> },
}

pub fn generate_query_final_check<E: FieldElement>(
    gkr_oracle: GkrOracle<E>,
    eval_point: Vec<E>,
    claimed_evaluations: Vec<E>,
) -> (Vec<E>, /* Object needed for verifying the claimed evaluations */) {
    let mut query = vec![];
    assert_eq!(claimed_evaluations.len(), gkr_oracle.oracles.len());
    for (oracle, claimed_eval) in gkr_oracle.oracles.iter().zip(claimed_evaluations.iter()) {
        match oracle {
            Oracle::Committed { index } => {
                // Use `index` and `claimed_eval` to update the AIR constraints
                // ...

                // Add `claimed_eval` to the query
                query.push(*claimed_eval)
            }
            Oracle::CommittedNext { index } => {
                // Use `index` and `claimed_eval` to update the AIR constraints taking into
                // account that it is the shift of column `index`
                // ...

                // Add `claimed_eval` to the query
                query.push(*claimed_eval)
            }
            Oracle::TransparentOracle { transparent_oracle } => {
                let res = match transparent_oracle {
                    TransparentOracle::EQ { x: at } => {
                        EqFunction::new(at.to_vec()).evaluate(&eval_point)
                    }
                    TransparentOracle::Periodic { period, values } => {
                        // TODO: the following needs to be double checked

                        assert_eq!(*period, values.len());
                        // Get the evaluation point for the MLE of the periodic values
                        let ev_pt = (&eval_point[..*period]).to_vec();
                        // Build the MLE
                        let periodic_mle = MultiLinearPoly::from_evaluations(values.to_vec())
                            .expect("failed to construct periodic MLE");

                        // Evaluate the MLE
                        periodic_mle.evaluate(&ev_pt)
                    }
                };

                // Check that the claimed evaluation matches the computed one
                assert_eq!(res, *claimed_eval);

                // Add `claimed_eval` to the query
                query.push(*claimed_eval)
            }
        }
    }
    (query,)
}

The above doesn't include the logic related to the s-column, but it should be straightforward to add. Given the above, and a way to describe the core relations of the virtual bus (i.e., the fractions), we should be able to detach most of the logic that is currently part of the VM and transfer it to Winterfell.
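
To make the second part concrete, here is a rough, hypothetical sketch of how the fractions of the virtual bus could be described alongside the oracles (the names VirtualBusDescription and FractionDescription are made up and not existing types), so that both the GKR circuit and the final composition polynomial could be derived from a single object:

pub struct VirtualBusDescription<E: FieldElement> {
    // the oracles whose evaluations make up a query
    pub oracles: GkrOracle<E>,
    // the fractions defining the bus relation over a query
    pub fractions: Vec<FractionDescription<E>>,
}

pub struct FractionDescription<E: FieldElement> {
    // computes the numerator of the fraction from a query and the LogUp randomness
    pub numerator: fn(query: &[E], rand: &[E]) -> E,
    // computes the denominator of the fraction from a query and the LogUp randomness
    pub denominator: fn(query: &[E], rand: &[E]) -> E,
}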

bobbinth commented 4 months ago

Thank you! My understanding is that what's currently missing are two things:

  1. Ability to pass shifted and periodic elements of a query into evaluate_fractions_at_main_trace_query() function.
  2. Ability to specify shifted and periodic columns for the final circuit layer (e.g., here).

If I'm understanding the proposal correctly, it is mostly aimed at addressing the first issue above (i.e., computing the query values) - right? And the way it would work is roughly as follows:

First, we define a method on the Air trait that would return a GkrOracle struct - it could look something like this:

trait Air {
    ...
    fn get_gkr_oracle(&self) -> GkrOracle {
        unimplemented!()
    }
    ...
}

By default, this method would be unimplemented, but when a specific prover uses GKR-LogUp, the user would implement this method and statically specify the set of oracles to be returned (in this context, it is not clear to me how TransparentOracle::EQ oracles are to be built).
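
For illustration, a minimal hypothetical implementation (MyAir is made up, Felt is assumed to be its base field, and get_gkr_oracle is assumed to return GkrOracle parameterized by that field) for an AIR whose bus relation reads main trace columns 0 and 1, the shift of column 1, and one periodic column of period 4 could look like this:

impl Air for MyAir {
    ...
    fn get_gkr_oracle(&self) -> GkrOracle<Felt> {
        GkrOracle {
            oracles: vec![
                // current-row values of main trace columns 0 and 1
                Oracle::Committed { index: 0 },
                Oracle::Committed { index: 1 },
                // next-row (shifted) value of main trace column 1
                Oracle::CommittedNext { index: 1 },
                // a non-committed periodic column of period 4
                Oracle::TransparentOracle {
                    transparent_oracle: TransparentOracle::Periodic {
                        period: 4,
                        values: vec![Felt::ONE, Felt::ZERO, Felt::ZERO, Felt::ZERO],
                    },
                },
            ],
        }
    }
    ...
}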

Then, based on the returned GkrOracle, Winterfell would build queries for each trace row via the generate_query() function described above. However, it is not clear to me what the eval_point and claimed_evaluations parameters are (or rather, how they are computed). My understanding is that the only things we need to build a query are the main trace and the definitions of periodic values. So, shouldn't the generate_query() function look more like this?:

pub fn generate_query<E: FieldElement>(
    gkr_oracle: GkrOracle<E>,
    main_trace: &ColMatrix<Felt>,
    periodic_column_mls: Vec<E>,
    row_index: usize,
) -> Vec<E>
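
If the signature were along those lines, a rough sketch of the body could look as follows (taking the oracle metadata by reference and, for simplicity, assuming the periodic values for the given row are supplied in oracle order; ColMatrix::get is used to read main trace cells):

pub fn generate_query<E: FieldElement<BaseField = Felt>>(
    gkr_oracle: &GkrOracle<E>,
    main_trace: &ColMatrix<Felt>,
    periodic_values: &[E],
    row_index: usize,
) -> Vec<E> {
    let num_rows = main_trace.num_rows();
    let mut periodic_iter = periodic_values.iter();
    let mut query = Vec::with_capacity(gkr_oracle.oracles.len());
    for oracle in gkr_oracle.oracles.iter() {
        match oracle {
            // current-row value of a committed main trace column
            Oracle::Committed { index } => {
                query.push(E::from(main_trace.get(*index, row_index)));
            }
            // next-row value of a committed main trace column (wrapping around)
            Oracle::CommittedNext { index } => {
                query.push(E::from(main_trace.get(*index, (row_index + 1) % num_rows)));
            }
            // transparent (e.g., periodic) oracles are evaluated outside of the main trace
            Oracle::TransparentOracle { .. } => {
                query.push(*periodic_iter.next().expect("missing periodic value"));
            }
        }
    }
    query
}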

Another potential concern is the efficiency of the generate_query() function. My understanding is that it needs to be called for every row in the trace, and so, having generic logic there (i.e., a match statement inside a loop) may slow things down somewhat.


A potential alternative approach could be to define something like a GkrLogUpEvaluator trait. This trait could look something like this:

pub trait GkrLogUpEvaluator {

    type BaseField: StarkField;

    /// Evaluates a query defined by the provided main trace frame and writes the results into the
    /// numerators and denominators.
    ///
    /// Query values corresponding to periodic columns are built internally from the periodic column
    /// definitions specified in this evaluator.
    fn evaluate_query<F, E>(
        &self,
        frame: &EvaluationFrame<E>,
        logup_rands: &[E],
        numerator_wires: &mut [E],
        denominator_wires: &mut [E],
    ) where
        F: FieldElement<BaseField = Self::BaseField>,
        E: FieldElement<BaseField = Self::BaseField> + ExtensionOf<F>;

    /// Gets a list of all oracles involved in LogUp-GKR; this is intended to be used in construction of
    /// MLEs.
    fn get_oracles(&self) -> Vec<LogUpGkrOracle<Self::BaseField>>;
}

/// A slightly simplified version of `Oracle` with `TransparentOracle::EQ` removed.
pub enum LogUpGkrOracle<E: StarkField> {
    CurrentRow(usize),
    NextRow(usize),
    PeriodicValue(Vec<E>),
}

Then, we'd add a method to build GkrLogUpEvaluator in the Air trait like so:

trait Air {
    ...
    type GkrLogUpEvaluator: GkrLogUpEvaluator<BaseField = Self::BaseField>;

    fn new_gkr_logup_evaluator(&self) -> Self::GkrLogUpEvaluator {
        unimplemented!()
    }
    ...
}

Assuming I didn't miss anything major above, the main drawback of this approach is that the implementor is responsible for keeping the GkrLogUpEvaluator::evaluate_query() and GkrLogUpEvaluator::get_oracles() methods consistent. The benefit is that evaluate_query() should be pretty efficient since it is "hand-written".
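
For what it's worth, a minimal hypothetical implementation could look like the sketch below (PlainLookupEvaluator and the bus it describes are made up and don't correspond to the VM's actual buses); it also illustrates the consistency requirement between the two methods:

struct PlainLookupEvaluator;

impl GkrLogUpEvaluator for PlainLookupEvaluator {
    type BaseField = Felt;

    fn evaluate_query<F, E>(
        &self,
        frame: &EvaluationFrame<E>,
        logup_rands: &[E],
        numerator_wires: &mut [E],
        denominator_wires: &mut [E],
    ) where
        F: FieldElement<BaseField = Self::BaseField>,
        E: FieldElement<BaseField = Self::BaseField> + ExtensionOf<F>,
    {
        let alpha = logup_rands[0];
        // column 0 holds a value being looked up (multiplicity +1)
        numerator_wires[0] = E::ONE;
        denominator_wires[0] = alpha - frame.current()[0];
        // column 1 holds the table side of the bus (multiplicity -1)
        numerator_wires[1] = -E::ONE;
        denominator_wires[1] = alpha - frame.current()[1];
    }

    fn get_oracles(&self) -> Vec<LogUpGkrOracle<Self::BaseField>> {
        // must list exactly the values read in `evaluate_query`, in the same order
        vec![LogUpGkrOracle::CurrentRow(0), LogUpGkrOracle::CurrentRow(1)]
    }
}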

bobbinth commented 4 months ago

Thinking a bit more about it, maybe an even simpler approach is to do something similar to what we do with auxiliary trace. Specifically:

First, we modify the AirContext struct to return the list of columns involved in GKR-LogUp. This could look like so:

pub struct AirContext<B: StarkField> {
    ...
    /// This would replace the current lagrange_kernel_aux_column_idx field
    pub(super) logup_gkr_oracles: Vec<LogUpGkrOracle<B>>,
    ...
}

impl<B: StarkField> AirContext<B> {
    ...

    pub fn with_logup_gkr(
        trace_info: TraceInfo,
        transition_constraint_degrees: Vec<TransitionConstraintDegree>,
        num_assertions: usize,
        logup_gkr_oracles: Vec<LogUpGkrOracle<B>>,
        options: ProofOptions,
    ) -> Self {
        todo!("implement")
    }

    pub fn logup_gkr_oracles(&self) -> &[LogUpGkrOracle<B>] {
        &self.logup_gkr_oracles
    }

    pub fn has_logup_gkr(&self) -> bool {
        !self.logup_gkr_oracles.is_empty()
    }

    ...
}
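
Under this approach, an AIR that uses LogUp-GKR would construct its context roughly like so (MyAir, the column indices, the number of assertions, and the periodic values below are all hypothetical):

impl Air for MyAir {
    ...
    fn new(trace_info: TraceInfo, _pub_inputs: Self::PublicInputs, options: ProofOptions) -> Self {
        let degrees = vec![TransitionConstraintDegree::new(2)];
        // oracles involved in LogUp-GKR: columns 0 and 1, the shift of column 1,
        // and a periodic column of length 4
        let logup_gkr_oracles = vec![
            LogUpGkrOracle::CurrentRow(0),
            LogUpGkrOracle::CurrentRow(1),
            LogUpGkrOracle::NextRow(1),
            LogUpGkrOracle::PeriodicValue(vec![Felt::ONE, Felt::ZERO, Felt::ZERO, Felt::ZERO]),
        ];
        let context = AirContext::with_logup_gkr(trace_info, degrees, 2, logup_gkr_oracles, options);
        Self { context }
    }
    ...
}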

Then, in the Air trait, we get rid of the GkrProof and GkrVerifier associated types, and instead add a single method for evaluating LogUp-GKR queries:

pub trait Air {
    ...
    /// Evaluates a query defined by the provided main trace frame and writes the results into the
    /// numerators and denominators.
    fn evaluate_logup_gkr_query<F, E>(
        &self,
        frame: &EvaluationFrame<F>,
        periodic_values: &[F],
        logup_rand_elements: &[E],
        numerators: &mut [E],
        denominators: &mut [E],
    ) where
        F: FieldElement<BaseField = Self::BaseField>,
        E: FieldElement<BaseField = Self::BaseField> + ExtensionOf<F>
    {
        unimplemented!("evaluation of LogUp-GKR queries has not been implemented");
    }
    ...
}

The benefits/drawbacks are the same as in the approach described in the previous post but this more closely follows the existing patterns in Winterfell.
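
For a hypothetical AIR with the same made-up bus as in the earlier evaluator example, the override could look roughly like this (the periodic selector column is purely illustrative, and E is assumed to implement From<F> via the ExtensionOf bound):

impl Air for MyAir {
    ...
    fn evaluate_logup_gkr_query<F, E>(
        &self,
        frame: &EvaluationFrame<F>,
        periodic_values: &[F],
        logup_rand_elements: &[E],
        numerators: &mut [E],
        denominators: &mut [E],
    ) where
        F: FieldElement<BaseField = Self::BaseField>,
        E: FieldElement<BaseField = Self::BaseField> + ExtensionOf<F>,
    {
        let alpha = logup_rand_elements[0];
        // a periodic column acting as a selector for rows that contribute to the bus
        let selector = periodic_values[0];
        // column 0: value being looked up (multiplicity +1 on selected rows)
        numerators[0] = E::from(selector);
        denominators[0] = alpha - E::from(frame.current()[0]);
        // column 1: table side of the bus (multiplicity -1 on selected rows)
        numerators[1] = -E::from(selector);
        denominators[1] = alpha - E::from(frame.current()[1]);
    }
    ...
}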

Two outstanding questions in my mind:

  1. Can we provide periodic values into this function as we do for Air::evaluate_transition() and Air::evaluate_aux_transition()? I think so, but would be good to confirm.
  2. Assuming that the s column will be "hidden" from the user, what is the best way to specify "boundary conditions" for LogUp-GKR?

Al-Kindi-0 commented 4 months ago

The issue was intended to address the last part of the GKR protocol, namely the part related to the opening proof, which is handled using the two auxiliary trace columns. As there is a significant overlap between solving the aforementioned issue and the bigger issue of integrating LogUp-GKR into Winterfell, I think we can try to address the latter in this issue as well.

First, we can forget about the EQ entry for the time being. Once we finish the current roadmap, we can start refactoring the GKR protocol so that we clean up not only the way EQ is handled but also the way periodic columns are treated, which is how I believe we should handle them for the time being.

I am leaning towards the second proposal, with a slight modification: instead of specifying the evaluate_logup_gkr_query method, we would need to specify evaluate_fractions_at_main_trace_query.

Now, using logup_gkr_oracles, we can specify a few methods which, given an EvaluationFrame and periodic values, can build the appropriate input to evaluate_fractions_at_main_trace_query (see the sketch after the list below). Thus, evaluate_logup_gkr_query is well-defined in terms of logup_gkr_oracles and evaluate_fractions_at_main_trace_query.

To make this work, we would need:

  1. To assume that evaluate_fractions_at_main_trace_query and logup_gkr_oracles are in sync.
  2. That the logup_gkr_oracles are ordered, first by kind as in gkr_query and second by id, i.e., index within the main trace.
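
A rough sketch of such a helper, under the ordering assumption above (the name build_main_trace_query is made up), could be:

fn build_main_trace_query<F>(
    oracles: &[LogUpGkrOracle<F::BaseField>],
    frame: &EvaluationFrame<F>,
    periodic_values: &[F],
) -> Vec<F>
where
    F: FieldElement,
{
    let mut periodic_iter = periodic_values.iter();
    oracles
        .iter()
        .map(|oracle| match oracle {
            // current-row and next-row values come straight from the evaluation frame
            LogUpGkrOracle::CurrentRow(index) => frame.current()[*index],
            LogUpGkrOracle::NextRow(index) => frame.next()[*index],
            // periodic values are assumed to be supplied in oracle order
            LogUpGkrOracle::PeriodicValue(_) => {
                *periodic_iter.next().expect("missing periodic value")
            }
        })
        .collect()
}

The output of this helper would then be passed directly to evaluate_fractions_at_main_trace_query.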

The main downside of the above is that periodic columns will have to be part of the Vec<MultiLinearPoly<E>> input here. This is suboptimal, as the MLE and its folds will have a repeating structure, but not by much: the main extra cost comes from storing the fully materialized MLE and folding it by randomness, both of which are almost negligible compared to the rest.

Al-Kindi-0 commented 4 months ago

> Another potential concern is the efficiency of the generate_query() function. My understanding is that it needs to be called for every row in the trace, and so, having generic logic there (i.e., a match statement inside a loop) may slow things down somewhat.

generate_query is only relevant to checking the final evaluation claims and thus is called only once, by the verifier. I changed the name of that method to make it clearer.

> Assuming that the s column will be "hidden" from the user, what is the best way to specify "boundary conditions" for LogUp-GKR?

The boundary conditions for LogUp-GKR should be computed from the public inputs using a method which we should probably add to AirContext. Assuming the s column is later removed, we would only need boundary constraints against the Lagrange kernel. The constraints against the s column will be, from the perspective of the verifier, just evaluating an expression during constraint evaluation at the OOD point.