Aircloak / aircloak

This repository contains the Aircloak Air frontend as well as the code for our Cloak query and anonymization platform
2 stars 0 forks source link

Can we allow selecting equalities? #2417

Open sebastian opened 6 years ago

sebastian commented 6 years ago

Dan from Tableau sent me a set of queries that fail with our system. Amongst them is the following one which would work if it wasn't for the selected equality on line 2:

SELECT 
  (CAST("games"."outcome" AS TEXT) = 'W') AS "Calculation_834854816085528576",
  CAST("games"."outcome" AS TEXT) AS "outcome",
  SUM(1) AS "sum:Number of Records:ok"
FROM "games"
GROUP BY 1, 2
HAVING (SUM("games"."losepoints") >= 1000)

Basically it selects true for win outcomes and false for loss or tie:

--------------------------------+---------+--------------------------
 t                              | W       |                    51963
 f                              | L       |                    44158
 f                              | T       |                     4240
(3 rows)

The question is whether this is something we can allow? Furthermore the question is how high the utility of allowing something like this is. It gives us benefits in improved Tableau support, that's for sure, but what, if anything, does it cost us in terms of privacy? It does allow for alternative groupings of values so to speak, but doesn't seem to have a particularly great power.

@yoid2000, what do you think?

obrok commented 6 years ago

Allowing boolean expressions to be selected is dangerously close to allowing the analyst to implement their own logic

yoid2000 commented 6 years ago

Yapee makes an interesting point. To subvert boolean logic there would have to be some math after the expression. I don't suppose that that math would trigger the bad math filter? Assuming not, maybe what we need to do is treat this kind of Boolean expression as dangerous math?

PF

On Wed, Feb 14, 2018 at 10:55 AM Paweł Obrok notifications@github.com wrote:

Allowing boolean expressions to be selected is dangerously close to allowing the analyst to implement their own logic

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/Aircloak/aircloak/issues/2417#issuecomment-365552103, or mute the thread https://github.com/notifications/unsubscribe-auth/ACD-qcGN2OnrSJgbR2qrqSJPxOcgH3ETks5tUq17gaJpZM4SE6_y .

sebastian commented 6 years ago

Ok, not for this release in any case.

cristianberneanu commented 6 years ago

This doesn't work for text columns, but for numeric types one can do: CAST(col - value AS boolean) to work around this restriction.

obrok commented 6 years ago

This doesn't work for text columns, but for numeric types one can do: CAST(col - value AS boolean) to work around this restriction.

We would have a real problem if the analyst would be able to manipulate a couple such values, for example with * to create a logical AND:

select 
  cast(cast(age - 18 as boolean) as integer) *
    cast(cast(level - 3 as boolean) as integer)
from players group by 1

Currently this particular query is forbidden:

image

But we should be on the lookout for equivalent formulations of this.