Open yoid2000 opened 4 years ago
Similarly x or (a and b and c) can be made with (x or a) and (x or b) and (x or c), and so on.
If the conditions are normalized, why does this matter?
To prevent semantically identical conditions using BETWEEN (i.e. age = 10 and age BETWEEN 10 and 11), we can have a rule that if a BETWEEN expression, when applied to the shadow table, returns a single value, then we either reject the query or rewrite the BETWEEN as the equivalent equality.
This seems like a poor heuristic, especially with the low number of entries in the shadow table.
If the conditions are normalized, why does this matter?
I'm not sure if I'm being clear. My point is that we want to disallow x or (a and b and c)
, so that means we need to disallow it in whatever form it appears in.
If what we do is normalize AND'd OR groups into OR'd AND groups, and then disallow any expression that results in an OR'd AND group, then that would be fine.
This seems like a poor heuristic, especially with the low number of entries in the shadow table.
Perhaps so. We could work on it a bit if we choose to go this route.
At this point I'm very worried about LED performance (as well as the difficulty of implementing). @sebastian is talking about implementing it through emulation, which is going to kill performance and may end up being a non-starter. I'd really like to find a workable alternative...
It seems to me we can implement a safe simple OR with restrictions along the lines of what we have for simple CASE.
The basic allowed structure would be:
In other words, groups of OR'd conditions (OR groups) separated by AND
The restrictions would be as follows:
col = val
col BETWEEN val1 AND val2
col <> val
age = 10
andage BETWEEN 10 and 11
orgender = 'M'
andgender <>
F`. We would need to recognize this.In the implementation, noise layers would be seeded from inspection --- no floating necessary. The range expression would have a single static noise layer, the others would have two noise layers (static and uid).
The reason for the fourth restriction is that we want to prevent expressions of the sort:
where
a
,b
, etc. are the simple expressions.The reason for this is that
(b AND c AND ...)
serves as pseudo-identifier even when the constants inb
,c
, etc. are shadow constants. If we allow an expression to exist in every OR group, then an attacker can generate the above expression with:Similarly
x or (a and b and c)
can be made with(x or a) and (x or b) and (x or c)
, and so on.By contrast, if we prevent a given simple expression from being in every OR group, then I don't see how an attack can be formed. In other words, I don't see how an attacker can make two queries that have identical conditions except for the pseudo-identifier.
To prevent semantically identical conditions using
BETWEEN
(i.e.age = 10
andage BETWEEN 10 and 11
), we can have a rule that if aBETWEEN
expression, when applied to the shadow table, returns a single value, then we either reject the query or rewrite theBETWEEN
as the equivalent equality.To prevent semantically identical conditions using
<>
(i.e.gender = 'M'
andgender <>
F`), we can have a rule that if there are only two values in the shadow table, then the negand is re-written to the corresponding equality.