Where does the linear code algorithm come from?

maths644311798 commented 10 months ago

In linear_code.h, Yacl refers to The minimum locality of linear codes and other links. This paper proves many results about locality. Yacl implements which algorithm? Can it be pointed out clearly?

liang-xiaojian commented 10 months ago

The file linear_code.h implements a d-local-linear-code for the Ferret Oblivious Transfer (OT) extension, as described in Appendix A.2 of https://eprint.iacr.org/2020/924.pdf. Moreover, the majority of the implementation is based on the code from https://github.com/emp-toolkit/emp-ot/blob/master/emp-ot/ferret/lpn_f2.h. :)

maths644311798 commented 10 months ago

I conclude the implementation here. Over $F_2$, the generator matrix $G$ satisfies the definition of $d$-local linear codes. The generator matrix $G$ has several groups of $1s$. Each group contains 64 or 128 $1s$ and the $1s$ lie in a slash. For example,

<> <> 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 -- | -- | -- | -- | -- | -- | -- | -- | | | | | | | 1 | | | | | | | | 1 | | | 1 | | | | | … | | | 1 | | 1 | | | 1 | 1 | | … | | 1 | | | | 1 | | 1 | | … | | | | … | | | | 1 | | | | 1 | | | | | | |

In the encode functions like Encode(absl::Span<const uint64_t> in, absl::Span<uint64_t> out), the sentence auto val = out[i + j]; has some problems. If absl::Span<uint64_t> out is not set to zero, then its value influence the result. I think the initial value for val should be 0.

liang-xiaojian commented 10 months ago

Each group contains 64 or 128 $1s$ and the $1s$ lie in a slash.

Actually, each group would choose the non-zero entries for generator matrix by random permuation (AES), which could be found at https://github.com/secretflow/yacl/blob/main/yacl/crypto/primitives/code/linear_code.h#L243.

If absl::Span<uint64_t> out is not set to zero, then its value influence the result. I think the initial value for val should be 0.

Encode(absl::Span<const uint64_t> in, absl::Span<uint64_t> out) would perform the operation out = (in * G) ^ out. Consequently, we could compute the output vector for "LPN assumption" as followed:

// implement1
// avoid memory allocation
auto& output = noise;         
// output = secret * matrix + noise
llc.Encode(secret, noise);

If Encode(absl::Span<const uint64_t> in, absl::Span<uint64_t> out) would force to set out to be zero, we have to compute the "LPN" as shown below, which requires extra memory allocation and initialization:

// implement2
// allocation && initialization
std::vector<uint128_t> output(n, 0);    
 // output = secret * matrix
llc.Encode(secret, output);                   
// output = output + noise = secret * matrix + noise
std::transfrom(noise.begin() , noise.end() , output.begin(), std::bit_xor());

FYI, the LPN parameters in Ferret OT extension(table 2) require $n \approx 10,000,000$. As a result, implement2 need extra 30ms just for memory initialization.

maths644311798 commented 10 months ago

I understand these two implements now. The name Encode(...) just causes some misunderstandings since it is not used for a purely linear encoding.

secretflow / yacl

Where does the linear code algorithm come from? #223