Currently expensive and it doesn't have to be: computing \tA(r, x) for fixed r (the randomness chosen during the first sum-check) as x ranges over {0, 1}^{log n} where n is number of columns.
Because A has so much repeated structure, computing this should be super fast.
Other than an edge case at the very end, you can index rows by (constraint, timestep) and columns by (value, timestep).
For i=(constraint, timestep) and j = (value, timestep'):
\tA(i, j) is just smallA(constraint, value) * eq(timestep, timestep'),
where smallA captures the constraints just for one timestep.
Need a table of eq(r_timestep, xtimestep) values as x{timestep} ranges over {0, 1}^{log num_timesteps}.
So you just do the expensive vector-matrix multiplication to compute smallA(r_constraint, xvalue) as x{value} ranges over {0, 1}^{log 100} where 100 is the number of entries in z per step of the CPU.
Now each entry of A(r, x) can be filled in with one mult: smallA(r_constraint, x_value) * eq(r_timestep, x_timestep).
Currently expensive and it doesn't have to be: computing \tA(r, x) for fixed r (the randomness chosen during the first sum-check) as x ranges over {0, 1}^{log n} where n is number of columns.
Because A has so much repeated structure, computing this should be super fast.
Other than an edge case at the very end, you can index rows by (constraint, timestep) and columns by (value, timestep).
For i=(constraint, timestep) and j = (value, timestep'):
\tA(i, j) is just smallA(constraint, value) * eq(timestep, timestep'),
where smallA captures the constraints just for one timestep.
Write r=(r_constraint, r_timestep), x=(x_value, x_timestep). Then A(r, x) = smallA(r_constraint, x_value) * eq(r_timestep, x_timestep).
Need a table of eq(r_timestep, xtimestep) values as x{timestep} ranges over {0, 1}^{log num_timesteps}.
So you just do the expensive vector-matrix multiplication to compute smallA(r_constraint, xvalue) as x{value} ranges over {0, 1}^{log 100} where 100 is the number of entries in z per step of the CPU.
Now each entry of A(r, x) can be filled in with one mult: smallA(r_constraint, x_value) * eq(r_timestep, x_timestep).