pepper-project / pequin

A system for verifying outsourced computations and applying SNARKs. A simplified release of the main Pepper codebase.

Constraint Explosion when accessing arrays in slightly different ways #35

Open fleupold opened 5 years ago

fleupold commented 5 years ago

Consider the following example app:

#include <stdint.h>

struct In { uint32_t bits[1024]; };
struct Out { uint32_t sum; };

void compute(struct In *input, struct Out *output){
    uint32_t index, result = 0;
    uint32_t *bits = input->bits;
    for (index = 0; index < 1024; index+=32) {
        result += sumBits(bits + index, 0, 32); // NUMBER OF CONSTRAINTS: 437170
        //result += sumBits(bits, index, 32); // NUMBER OF CONSTRAINTS: 2
        //result += sumBits(input->bits + index, 0, 32); // NUMBER OF CONSTRAINTS: 2
    }
    output->sum = result;
}

uint32_t sumBits(uint32_t *bits, uint32_t offset, uint32_t length) {
    uint32_t result = 0, index;
    uint32_t pow = 1;
    for (index=0; index<length; index++) {
        result += bits[offset+index] * pow;
        pow = pow * 2;
    }
    return result;
}

This app interprets the values in the input array as 32-bit encoded numbers and adds them together (just an example).

Inside the for loop of the compute function, I'm accessing the input array in three slightly different but, as far as I can tell, semantically equivalent ways. In the first variant, I offset the array outside the sumBits function to pass in the correct starting bit for the next number (bits + index). This results in a circuit with a very large number of constraints (~437k). Interestingly, passing the offset as an extra parameter into sumBits and applying it only when accessing the value (bits[offset+index]) leads to only 2 constraints. Even more interestingly, in the third case we don't use a local variable but access the input struct directly (input->bits). There it doesn't seem to matter whether the offset is applied outside or inside sumBits; the number of constraints stays small.

I would like to understand what's going on under the hood here. Is this behaviour expected?

What difference does it make to the compiler whether I declare the type as uint32_t *bits or uint32_t bits[1024]? (I would expect the length to be inferred at compile time.)
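
For reference, the two declarations I mean are below. In standard C an array parameter decays to a pointer, so I would expect them to behave identically; whether the frontend treats the fixed-size array type differently is exactly what I'm unsure about (the 1024 here is just for illustration):

// Variant A: pointer parameter (what I use now, so the helper works for
// differently sized arrays)
uint32_t sumBits(uint32_t *bits, uint32_t offset, uint32_t length);

// Variant B: fixed-size array parameter (length known at compile time)
uint32_t sumBits(uint32_t bits[1024], uint32_t offset, uint32_t length);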

maxhowald commented 5 years ago

This is not expected behavior; it looks like you are running into a bug in the frontend compiler somewhere, perhaps in the optimizer.

A good place to start is probably to compare the generated .pws, .circuit, and .spec files for each case. For the case where only 2 constraints are generated, can you tell that they actually compute the right thing? From there, you may be able to work backwards to figure out how the .circuit file is being generated.

fleupold commented 5 years ago

Inside the generated .pws, .circuit, and .spec files, I can see that the large constraint system is doing a lot of RAMPUT/RAMGET operations like:

RAMPUT_FAST ADDR 0 VALUE I0 CONDITION 1 true V1043
RAMGET_FAST ADDR 0 VALUE V1047

whereas the small constraint system just has one large polynomial that is adding all values at once (it does produce the correct output though).

I found that in the small system my code is compiled to ArrayAccessOperators, whereas in the large system the access is compiled to a PointerAccessOperator. Inside PointerAccessOperator we invalidate the cache, leading to a bunch of RAMPUT/RAMGET operations being written (I assume these cause the explosion down the line).
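
To illustrate my current reading of the compiler output (this is an assumption from the generated files, not from the compiler source), here is a minimal sketch of the two access patterns side by side:

uint32_t sketch(struct In *input, uint32_t index) {
    uint32_t *bits = input->bits; // pointer to a statically known array
    uint32_t a = bits[index];     // ArrayAccessOperator: resolved at compile time, few constraints
    uint32_t *p = bits + index;   // pointer arithmetic creates a new pointer value
    uint32_t b = *p;              // PointerAccessOperator: cache invalidated, RAMPUT/RAMGET emitted
    return a + b;
}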

I also tried replacing the main line of interest with

result += sumBits(&(bits[index]), 0, 32);

which should compile as an ArrayAccess and be equivalent to bits + index. However, the compiler complains:

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
    at SFE.Compiler.Pointer.access(Pointer.java:107)
    at SFE.Compiler.Operators.PointerAccessOperator.resolve(PointerAccessOperator.java:72)
    at SFE.Compiler.Operators.PointerAccessOperator.resolve(PointerAccessOperator.java:37)
    at SFE.Compiler.FloatConstant.toFloatConstant(FloatConstant.java:112)
    at SFE.Compiler.Operators.TimesOperator.resolve(TimesOperator.java:81)
    at SFE.Compiler.FloatConstant.toFloatConstant(FloatConstant.java:116)
    at SFE.Compiler.Operators.PlusOperator.resolve(PlusOperator.java:105)
    at SFE.Compiler.Expression.fullyResolve(Expression.java:105)
        ...

The reason I'm using pointers instead of arrays in the first place is that I have some helper functions which are used from multiple compute functions (e.g. decomposing a field element into its bits, or summing up an array of bits into a field element), but each program calls them with differently sized arrays.
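
Based on the observation above, a possible workaround for me (just a sketch, assuming the offset-parameter variant keeps compiling to ArrayAccess) is to always pass the original array plus an offset and length, rather than an offset pointer, so the different array sizes are handled by the extra parameters:

// Sketch of the workaround: the helper keeps a pointer parameter, but every
// call site passes the original array and applies the offset via a parameter
// (the variant that compiled to only 2 constraints above).
uint32_t sumBits(uint32_t *bits, uint32_t offset, uint32_t length) {
    uint32_t result = 0, index;
    uint32_t pow = 1;
    for (index = 0; index < length; index++) {
        result += bits[offset + index] * pow;
        pow = pow * 2;
    }
    return result;
}

// Example call sites from different compute functions:
//   result += sumBits(input->bits, index, 32);
//   result += sumBits(otherInput->bits, offset, length); // hypothetical call with a differently sized array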