datasciencecampus / pprl_toolkit

The privacy-preserving record linkage toolkit: a proof-of-concept public demo of next-gen data linkage techniques.
https://datasciencecampus.github.io/pprl_toolkit/
MIT License

Are global bucket names an issue? #29

Closed matweldon closed 5 months ago

matweldon commented 5 months ago

Users will create buckets for each party called f"{party_name}-bucket"

Because bucket names are global, we're relying heavily on users to come up with names that nobody else has already taken.

Is there some way we could generate a short hash (for example) the first time they run scripts/01... for each party, without regenerating it if they run the script again? We could then use that hash in the bucket name.
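A minimal sketch of the generate-once idea, assuming the suffix is kept in a small local file. The function names, the `.bucket_suffix` filename, and the `{party_name}-bucket-{suffix}` pattern are hypothetical and not part of the toolkit:

```python
import secrets
import tempfile
from pathlib import Path


def get_or_create_suffix(path: Path, nbytes: int = 4) -> str:
    """Return the suffix stored at `path`, generating it only on the first call."""
    if path.exists():
        # A previous run already created the suffix, so reuse it.
        return path.read_text().strip()
    # token_hex returns 2 * nbytes lowercase hexadecimal characters.
    suffix = secrets.token_hex(nbytes)
    path.write_text(suffix)
    return suffix


def bucket_name(party_name: str, suffix: str) -> str:
    """Build a bucket name from the party name and the stored suffix (hypothetical pattern)."""
    return f"{party_name}-bucket-{suffix}"


# Demonstration in a temporary directory: the second call reuses the first suffix.
with tempfile.TemporaryDirectory() as tmp:
    suffix_file = Path(tmp) / ".bucket_suffix"  # hypothetical filename
    first = get_or_create_suffix(suffix_file)
    second = get_or_create_suffix(suffix_file)
    name = bucket_name("party1", first)
```

Re-running the script leaves the name unchanged as long as the file survives. A random suffix only makes a clash unlikely, it does not rule one out.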

daffidwilde commented 5 months ago

I suppose we could do something like hash the .env file, but then what if the same group of users wanted to do another linkage project? They'd have to come up with different names for their projects, I guess.
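To illustrate the concern, here is a rough sketch of hashing the .env contents with SHA-256 and truncating the digest. The variable names in the sample text are made up for the example and are not the toolkit's actual settings:

```python
import hashlib


def short_hash(text: str, length: int = 8) -> str:
    """Return the first `length` hex characters of the SHA-256 digest of `text`."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:length]


# Hypothetical .env contents for one linkage project.
env_first_project = "PARTY_1_PROJECT=proj-a\nPARTY_2_PROJECT=proj-b\n"
# The same group reusing an identical .env for a second linkage.
env_second_project_same = env_first_project
# The same group after changing something in the .env.
env_second_project_changed = env_first_project + "LINKAGE_NAME=second-run\n"

hash_first = short_hash(env_first_project)
hash_same = short_hash(env_second_project_same)
hash_changed = short_hash(env_second_project_changed)
```

Identical contents always produce the same hash, so a second project only gets fresh bucket names if something in the .env differs.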

I still think it's easier for everyone if we tell users to decide on a hash before starting and use that to name their projects. Happy to keep discussing this though!