Aircloak / aircloak

This repository contains the Aircloak Air frontend as well as the code for our Cloak query and anonymization platform
2 stars 0 forks source link

Prevent ranges through string manipulations #1956

Open yoid2000 opened 7 years ago

yoid2000 commented 7 years ago

ready to read

yoid2000 commented 7 years ago

Prevent ranges through string manipulations

This issue is forked from #1950.

The expanded definition of range includes for instance round() and trunc(). It is possible to emulate these functions by casting a number to a string, and then manipulating the string. For instance:

cast(
  concat(
    left(
      cast(column
      AS text),
    -1),
  '0')
AS integer)

Is the same as trunc(column, -1). (Noting that currently we don't allow truncating to the nearest 10, but we should, see #1953.)

Of course, it could also be that the column is natively a text column even though the text consists of numbers. In that case, the following is effectively a trunc(col, -1):

cast(
  concat(
    left(column,
    -1),
  '0')
AS integer)

On the assumption that there are other ways to do this as well, then to prevent this we need to disallow the use of any string functions on a column prior to that column being cast into a number (integer or real).

Alternatively we could try to think of all the ways that one could implement trunc() or round(), and only disallow these, but this seems like a challenge.

Note I don't see a need for this restriction with boolean because one can't effectively make a range out of a boolean. It is either 0 or 1.

Note that for all I know this restriction is already in place. But now the reason for it is documented.

sebastian commented 7 years ago

But now the reason for it is documented.

Not yet. It needs to be properly encoded in the code. And will we have an overarching design doc somewhere in the end? Like an extension to the anonymization doc?

yoid2000 commented 7 years ago

And will we have an overarching design doc somewhere in the end?

We certainly should. It's been a little painful keeping anonymization.md up to date, so we should discuss best what form this documentation takes. The mindmaps may play a role...