LGro / PyAPSI

Python wrapper for labeled and unlabeled asymmetric private set intersection (APSI).
MIT License

Add load csv db and batch add labeled items. #18

Open xiaohan2909 opened 1 year ago

xiaohan2909 commented 1 year ago

Hello, I made some changes to this project:

  1. Added a "load_csv_db" method, using some source files from the original APSI CLI folder.
  2. Finished a TODO item: the add_items function in LabeledServer now uses a batch function in C++.

Caution: I changed CMakeLists.txt because some new source files were added, but some settings may now differ from the previous version, so you may need to merge it by hand or replace this file with your own version.

LGro commented 1 year ago

Hi @xiaohan2909, thanks for reaching out. I'm curious to dive into your proposed changes. In the meantime, would you be open to elaborating a bit on what motivated them: specifically, why were you missing an option to load a CSV file as a database? Seeing that this contribution roughly doubles the size of this repo's code base, I'd like to understand the motivation thoroughly :relaxed:

xiaohan2909 commented 1 year ago

Thank you for your reply. I should have explained the reason. In some situations, I need to import a large amount of data, saved in a CSV file, directly into SenderDB. Because of the language boundary, the batch addition method (add_items) in the project requires first reading the data into a list in Python and then converting types before the C++ methods can import it. So I added the CSV import method in the hope of achieving a more efficient and direct import of large amounts of data. Of course, this function is somewhat redundant, because the existing batch method is sufficient to do the same job. So I hope this submission can serve as a supplement or reference; perhaps a better approach is to optimize the batch addition method itself.
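For reference, the Python-side route described above (read the CSV, build a list, hand it to the batch method) can be sketched roughly as follows. This is only an illustration under assumptions: `read_item_label_rows` and `add_csv_in_chunks` are hypothetical helper names, the CSV is assumed to hold one `item,label` pair per row, and `add_items` is passed in as a plain callable that is assumed to accept a list of `(item, label)` tuples. Chunking bounds the memory used by the intermediate Python list, but every value still crosses the Python/C++ boundary, so it does not remove the conversion cost mentioned above.

```python
import csv
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Tuple

ItemLabel = Tuple[str, str]


def read_item_label_rows(lines: Iterable[str]) -> Iterator[ItemLabel]:
    """Yield (item, label) pairs from CSV lines, skipping empty lines."""
    for row in csv.reader(lines):
        if not row:
            # csv.reader yields an empty list for an empty line
            continue
        if len(row) < 2:
            raise ValueError(f"expected 'item,label', got: {row!r}")
        yield row[0].strip(), row[1].strip()


def add_csv_in_chunks(
    lines: Iterable[str],
    add_items: Callable[[List[ItemLabel]], None],
    chunk_size: int = 10_000,
) -> int:
    """Feed CSV rows to a batch-add callable in fixed-size chunks.

    `add_items` is assumed to take a list of (item, label) tuples;
    that signature is an assumption, not taken from the project.
    Returns the number of rows passed on.
    """
    rows = read_item_label_rows(lines)
    total = 0
    while True:
        chunk = list(islice(rows, chunk_size))
        if not chunk:
            break
        add_items(chunk)
        total += len(chunk)
    return total
```

With a file on disk, this might be called as `add_csv_in_chunks(open("db.csv", newline=""), server.add_items)`, where `server` is an already initialized labeled server and `db.csv` is a placeholder path; whether `add_items` accepts exactly this input should be checked against the project itself.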