bit-ml / Private-Set-Intersection

MIT License
52 stars 7 forks source link

We implemented in Python a Private Set Intersection (PSI) protocol, a functionality that allows two parties to privately join their sets in order to compute their common elements. In our setup, these parties are: ​

Disclaimer: Our implementation is not meant for production use. Use it at your own risk. ​

Main idea of the protocol

​ Suppose the client wants to check if his query x belongs to the database of the server. Consider this database having as entries the integers y_1,...,y_n. Then the server can associate to its database the following polynomial P(X) = (X-y_1) ... (X-y_n). If x belongs to the database, then P vanishes at x. This is the main idea of the protocol: the server should compute this evaluation of P at x in a secure way and send it to the client. ​

More precisely, the client sends his query encrypted with a homomorphic encryption scheme, Enc(x). Then the server evaluates P at the given encryption: due to homomorphic properties, the result will turn out to be Enc(P(x)). Then the client decrypts this result and checks if it is equal to 0; in case of equality, the query belongs to the database. ​

The protocol is split in two parts: offline phase and online phase. The online phase starts when the client performs the OPRF protocol with the server, for encoding its items with the server's secret key. ​

The preprocessing phase

In this phase, both the server and the client preprocess their datasets, until performing the PSI protocol. ​

The actual PSI protocol

​ In this phase, both the server and the client perform the actual PSI protocol. The encryption scheme used, [FV], allows encrypting messages as polynomials of degree less than poly_modulus_degree, which is a power of 2, with integer coefficients modulo plain_modulus. This modulus is chosen so that it is a prime congruent with 1 modulo 2 poly_modulus_degree, which helps identifying each such polynomial with a vector of poly_modulus_degree integer entries modulo plain_modulus. TenSEAL allows encryption of vectors of integers, by first performing the above correspondence and then performing the actual encryption. Also, in a similar way, decrypting in TenSEAL works for vectors of integers. The encryption scheme implemented in TenSEAL benefits from allowing to encrypt vectors of integers, by performing both encoding and encryption. Also, decrypting in TenSEAL leads to vectors of integers*, the encodings of the corresponding decryptions. ​

Since each minibin is represented by a polynomial of degree D =bin_capacity/alpha, evaluating such a polynomial can be performed by doing the scalar product between the vector of its coefficients and all the (encrypted) powers of the x, with exponent at most D. The next step, windowing, helps the client send sufficiently many powers so that the server can recover all the powers with small computational effort. ​