osu-crypto / libPSI

A repository for private set intersection.

cuckoo stash overflow #26

Closed Cryptographer63 closed 2 years ago

Cryptographer63 commented 2 years ago

I'm using this open source library to test 1 million items by running ./frontend.exe -r 1 -kkrt -in ./alice.csv, and I get the following error:

cuckoo stash overflow Exception: /home/crypto/libOTe/cryptoTools/cryptoTools/Common/CuckooIndex.cpp:460

ladnir commented 2 years ago

The most likely cause is duplicates in your input set.

Cryptographer63 commented 2 years ago

What is the maximum amount of data supported by this library? I ask because with 1 million items ecdh is able to run on the same data set, although very slowly, but if I switch to kkrt it reports the error above.

If I reduce the data volume to 100,000 items, then there will be no problem.

ladnir commented 2 years ago

It's possible that ecdh would work correctly if your input sets have duplicates. However, this is not the case with kkrt. You as the user have to make sure your input sets contain only unique values.

The protocols should work until your computer runs out of memory. Maybe 20 million items on an 8 GB RAM machine for kkrt.
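
For anyone hitting the same error, here is a minimal sketch of checking an input file for duplicates and removing them before running kkrt. It assumes the input file holds one item per line, which the thread does not confirm; the helper names (`read_items`, `find_duplicates`, `dedupe`, `write_items`) are made up for this example and are not part of libPSI.

```python
from collections import Counter


def read_items(path):
    """Read one item per line, skipping blank lines (assumed file layout)."""
    with open(path, newline="") as f:
        return [line.strip() for line in f if line.strip()]


def find_duplicates(items):
    """Return a dict mapping each repeated item to how often it occurs."""
    counts = Counter(items)
    return {item: n for item, n in counts.items() if n > 1}


def dedupe(items):
    """Drop repeated items, keeping the first occurrence and the original order."""
    return list(dict.fromkeys(items))


def write_items(path, items):
    """Write one item per line."""
    with open(path, "w", newline="\n") as f:
        for item in items:
            f.write(f"{item}\n")
```

Calling `find_duplicates(read_items("alice.csv"))` shows whether the set has repeats at all; if it does, `write_items` can save the output of `dedupe` to a new file to pass to `-in`.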

Cryptographer63 commented 2 years ago

OK, I understand. I actually found another problem while testing: I ran two CSV files that contained two items which were only partially the same, yet they were reported as matching. I don't know if this has anything to do with the dataset; I generated the 100,000 random strings myself.

ladnir commented 2 years ago

I'm not sure, but maybe there are duplicates in your sets. As a test, try making your sets {0,1,2,...,n} and see if that gives you the correct result. It's possible there is a bug in the file-parsing code, but I think the core library is correct.
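
The suggested test could be set up roughly as follows. This is a sketch only: it builds two duplicate-free sets of consecutive integers with a known overlap and computes the expected intersection in the clear, so the protocol output can be compared against it. The one-item-per-line file layout and the function names are assumptions for illustration, not something the library documents in this thread.

```python
def make_test_sets(n, overlap):
    """Build two duplicate-free sets of n consecutive integers sharing `overlap` items."""
    if not 0 <= overlap <= n:
        raise ValueError("overlap must be between 0 and n")
    alice = list(range(n))                            # 0 .. n-1
    bob = list(range(n - overlap, 2 * n - overlap))   # shifted so the last `overlap` items of alice are shared
    expected = sorted(set(alice) & set(bob))          # ground-truth intersection
    return alice, bob, expected


def write_set(path, values):
    """Write one value per line (assumed input layout)."""
    with open(path, "w", newline="\n") as f:
        for v in values:
            f.write(f"{v}\n")
```

For example, `make_test_sets(100000, 1000)` gives two sets of 100,000 items whose intersection is exactly the 1,000 values 99000 through 99999. If the protocol reports anything else on these files, the problem is more likely in file parsing than in the data.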