Aircloak / aircloak

This repository contains the Aircloak Air frontend as well as the code for our Cloak query and anonymization platform

Additional samples in boolean conditions #2392

Open obrok opened 6 years ago

obrok commented 6 years ago

For boolean columns there are multiple ways of expressing the same condition. For example, these are functionally the same:

column = false 
column <> true
column >= false and column < true
column between false and true

Given our rules for noise layers, there will be 3 different samples generated for these queries. This all assumes that there are no NULL values in the column; otherwise the different forms will not be equivalent.
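To make the equivalence concrete, here is a minimal Python sketch that models the four conditions as plain functions and checks which non-NULL values each one accepts. It assumes `BETWEEN` is treated as the half-open range `low <= column < high`, which is what the equivalence in the list implies; the names `PREDICATES` and `matching_values` are made up for illustration and are not part of the Aircloak code.

```python
# Each condition from the list, modelled as a Python predicate over a
# non-NULL boolean value. Assumption: BETWEEN is half-open (low <= c < high).
PREDICATES = {
    "column = false": lambda c: c == False,  # noqa: E712
    "column <> true": lambda c: c != True,  # noqa: E712
    "column >= false and column < true": lambda c: c >= False and c < True,
    "column between false and true": lambda c: False <= c < True,
}


def matching_values(predicate):
    """Return the non-NULL boolean values that the predicate accepts."""
    return [value for value in (False, True) if predicate(value)]


# Map each condition text to the list of values it lets through.
RESULTS = {text: matching_values(fn) for text, fn in PREDICATES.items()}

for text, accepted in RESULTS.items():
    print(f"{text!r} accepts {accepted}")
```

All four forms select exactly the rows where the column is false, so they return the same rows even though they are written differently.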

obrok commented 6 years ago

@yoid2000 ^

cristianberneanu commented 6 years ago

This would all be fixed by disallowing comparisons on boolean columns and implementing support for IS TRUE / IS FALSE. As an alternative, more complex normalization rules might work as well.

sebastian commented 6 years ago

Tableau uses WHERE booleanFlag for WHERE booleanFlag = true, so that's needed too, and based on that IS TRUE/FALSE won't be sufficient.
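As a rough illustration of the normalization idea, the Python sketch below maps each of the spellings discussed in this thread, including the bare `WHERE booleanFlag` form, to one canonical `(column, value)` pair. This is only a sketch under stated assumptions: it matches condition text with regular expressions rather than working on a parsed query, it assumes no NULL values and a half-open `BETWEEN`, and `normalize` and `_PATTERNS` are hypothetical names, not existing Aircloak functions.

```python
import re

# Hypothetical canonical form: (column_name, expected_boolean_value).
# Each entry pairs a pattern with a function that builds the canonical form.
_PATTERNS = [
    (r"^(\w+)\s*=\s*(true|false)$", lambda m: (m[1], m[2].lower() == "true")),
    (r"^(\w+)\s*<>\s*(true|false)$", lambda m: (m[1], m[2].lower() != "true")),
    (r"^(\w+)\s+is\s+(true|false)$", lambda m: (m[1], m[2].lower() == "true")),
    # Range forms; assumes BETWEEN is half-open, so both mean "column is false".
    (r"^(\w+)\s*>=\s*false\s+and\s+\1\s*<\s*true$", lambda m: (m[1], False)),
    (r"^(\w+)\s+between\s+false\s+and\s+true$", lambda m: (m[1], False)),
    # Bare column, as in `WHERE booleanFlag`.
    (r"^(\w+)$", lambda m: (m[1], True)),
]
_COMPILED = [(re.compile(p, re.IGNORECASE), build) for p, build in _PATTERNS]


def normalize(condition):
    """Map a boolean condition string to a canonical (column, value) pair."""
    text = " ".join(condition.split())  # trim and collapse whitespace
    for pattern, build in _COMPILED:
        match = pattern.match(text)
        if match:
            return build(match)
    raise ValueError(f"unsupported boolean condition: {condition!r}")


print(normalize("column <> true"))
print(normalize("booleanFlag"))
```

If something like the canonical pair, rather than the original syntax, were used when deriving noise layers, every spelling of the same condition would be treated identically.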

yoid2000 commented 6 years ago

good catch @obrok

Yes, it would be good if we could add noise consistently. But let's not worry about that immediately. I'm playing around with some ideas for getting rid of most floating that may make noise consistent automatically as a side effect.