Closed: obrok closed this issue 5 years ago
I'm not immediately seeing the attack. Can you give an example? (I hope there isn't one, because it isn't so clear to me how to prevent overflows...)
Let's say the attacker knows that the victim is the oldest of the patients in Cracow, at 60 years old. Let's further assume that 2 ^ (64 * 60) causes an overflow while 2 ^ (64 * 59) does not. Then the attacker just compares:
SELECT AVG(2 ^ (64 * age)) FROM table WHERE city = 'Cracow' AND disease = 'cancer'
SELECT AVG(2 ^ (64 * age)) FROM table WHERE city = 'Cracow'
If both queries crash, then the attacker knows that the victim has cancer.
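The arithmetic behind this comparison can be sketched in plain Python, where floats are IEEE doubles and 2.0 ** x raises OverflowError past roughly x = 1024. The ages, the subset, and the constant 17.2 below are all made up for illustration (they replace the thread's hypothetical 64, which would overflow a double for every age):

```python
def avg_pow(ages, c):
    # Mimics AVG(2 ^ (c * age)); Python raises OverflowError where the
    # database would report a numeric overflow.
    return sum(2.0 ** (c * a) for a in ages) / len(ages)

# Hypothetical data: the victim is the oldest patient in Cracow, at 60.
cracow_cancer = [35, 60]  # the victim appears here only if they have cancer

# c chosen so that 2^(c * 60) overflows a double while 2^(c * 59) does not.
c = 17.2
try:
    avg_pow(cracow_cancer, c)
    print("no overflow: no 60-year-old in the subset")
except OverflowError:
    print("overflow: the 60-year-old victim is in the subset")
```

Running the probe once with and once without the disease filter, and comparing which of the two errors out, is exactly the comparison described above.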
I see. So the attacker selects values of X and Y for (X ^ (Y * val)) to get the overflow at just the right place.
Ok, seems like a real attack to me....good catch
Any solutions come to mind? One halfbaked and complex-sounding idea would be to set a noisy threshold NT somewhere between the overflow value and say 0.9 * overflow, then compute the column value Cmin that would cause the threshold to be reached, then add WHERE column < Cmin to the query.
You mean for every expression? It seems one would need to do this whole thing for every subexpression to be sure, for example for (X ^ (Y * val)) do one check for (Y * val) and another for (X ^ (Y * val)).
> You mean for every expression? It seems one would need to do this whole thing for every subexpression to be sure, for example for (X ^ (Y * val)) do one check for (Y * val) and another for (X ^ (Y * val)).
Sheesh. Ok.
I don't have any good ideas :(
Man math kills us in so many ways.
Here is another thought. For each column, we store two numbers, one a bit higher than the true max, and one a bit lower than the true min (or in both cases the same as the max/min if there are > LCF instances). Then with any math expression, we first run the expression on the DB.
So, if the attacker has AVG(2 ^ (64 * age)) in the expression, and the highest age in the DB is 105, and so our test_max = 109, then we would try SELECT AVG(2 ^ (64 * 109)) on the DB (and a similar thing for the min).
If there is no overflow, then we regard the thing as safe and run the query. If there is an overflow, we supply the appropriate error message.
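A minimal sketch of that bounds-probing check, in Python rather than the cloak's actual code (test_min and test_max stand for the padded per-column values described above):

```python
def passes_bounds_check(expr, test_min, test_max):
    # Probe the math expression at the stored, padded column bounds and
    # reject the query if either probe overflows.
    for probe in (test_min, test_max):
        try:
            expr(probe)
        except OverflowError:
            return False
    return True

attack = lambda age: 2.0 ** (64 * age)  # the attacker's expression
print(passes_bounds_check(attack, test_min=-1, test_max=109))  # False: rejected
```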
> Man math kills us in so many ways.
Well, math is Turing-complete, so that's hardly surprising.
> Here is another thought...
Makes sense, although I bet this can be circumvented by a clever expression that maxes out for non-max values of age. It could be tightened somewhat by storing 100 or so values between noisy_max and noisy_mean. I guess our expression limit will prevent too complicated expressions, but I'm not at all confident it's impossible to construct an expression of 5 mathematical operators that has a very sharp peak at a specified point.
Ok I changed the title to make it sound more scary.
This is a problem we have encountered in various forms and discussed before, but with no elegant solution in sight.
> Here is another thought. For each column, we store two numbers, one a bit higher than the true max, and one a bit lower than the true min (or in both cases the same as the max/min if there are > LCF instances). Then with any math expression, we first run the expression on the DB.
Doesn't look too sound to me either. An expression like 1/(col - X) maximizes around X.
It seems to me that the root of the issue is that some offloaded math operations raise an exception on error instead of returning NULL, like all our cloak-emulated functions do.
We could try to check the input in the offloaded SQL code. An expression like 1/x would become 1/NULLIF(x==0, x).
Another option would be to implement our own math operators inside the database, using the backend's custom language for doing that. If we go that route, at some point we can also start implementing various anonymizer steps in it, offloading more work to the database at the cost of increased complexity in our integration code.
Is it currently the case that we don't check for divide-by-zero? I had supposed that we do.
We only check in the cloak; offloaded expressions are executed in the database.
Ok, but I see that the cloak checks for expressions that could cause divide-by-zero and rejects such queries before accessing the database?
> Here is another thought. For each column, we store two numbers, one a bit higher than the true max, and one a bit lower than the true min (or in both cases the same as the max/min if there are > LCF instances). Then with any math expression, we first run the expression on the DB.

> Doesn't look too sound to me either. An expression like 1/(col - X) maximizes around X.
@cristianberneanu it seems that the cloak doesn't allow an expression like 1/(col-X) in the first place. So an expression that peaks at some value in the middle would have to be created some other way. Maybe @obrok is right that there is some way to do it (with 5 operators) but it isn't obvious what that is.
Which suggests to me that the idea of trying the math expression in the DB on max and min values is worth doing. That eliminates at least the low-hanging fruit of attacking a user with a high value.
> @cristianberneanu it seems that the cloak doesn't allow an expression like 1/(col-X) in the first place
But it does allow 1/col which peaks around 0.
> But it does allow 1/col which peaks around 0.
I guess we couldn't see how that expression could be utilized in a divide-by-zero attack.
> I guess we couldn't see how that expression could be utilized in a divide-by-zero attack.
But isn't it better to implement the protection using NULLIF clauses for operations like / or sqrt? That way, we can also eliminate the current restrictions that disallow any kind of math on the input columns for these functions.
> But isn't it better to implement the protection using NULLIF clauses for operations like / or sqrt? That way, we can also eliminate the current restrictions that disallow any kind of math on the input columns for these functions.
Maybe, but I'm concerned now that if we allow more freedom in division, then someone can more easily build an overflow attack by allowing the division to get quite close to zero, but not exactly zero, so the NULLIF is not triggered...
The NULLIF condition doesn't have to be value = 0. It can also be something like value <= EPSILON AND value >= -EPSILON, which should protect against both overflow and division by 0.
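For illustration, here is what that epsilon-window guard looks like in Python. The EPSILON value and the None-for-NULL convention are assumptions of this sketch, not the cloak's actual choices:

```python
EPSILON = 1e-12

def safe_div(a, b):
    # Models: CASE WHEN b >= -EPSILON AND b <= EPSILON THEN NULL ELSE a / b END
    if -EPSILON <= b <= EPSILON:
        return None  # None stands in for SQL NULL
    return a / b

print(safe_div(1.0, 0.0))    # None: division by zero suppressed
print(safe_div(1.0, 1e-30))  # None: a near-zero divisor would blow up the result
print(safe_div(1.0, 2.0))    # 0.5
```

The window, rather than an exact equality test, is what blocks the "almost zero" divisors as well as the exact zero.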
@cristianberneanu good point. Looks challenging but off hand don't see why it couldn't be made to work.
A problem I foresee with the NULLIF approach is that it only works for the functions where we implement it. We would basically find ourselves in the situation of having to predict every possible error that can be caused by such operations. On the cloak side we catch all errors and return a null in that case, which gives us a better coverage/effort ratio.
Ok, there is a dead simple and extremely powerful version of this attack (with help from @obrok).
Here is the basic thing:
select count(*)
from accounts
where lastname = 'foo' and
2^(10000.01 * cli_district_id) = 123.12
If there is someone in the system with a matching lastname, then an overflow takes place. If there is nobody with that lastname, then the query passes, and the result is a count of 0. The above query passes, and the following query fails:
select count(*)
from accounts
where lastname = 'Zamora' and
2^(10000.01 * cli_district_id) = 123.12
Note that the smallest cli_district_id is 1, so this query fails even when cli_district_id = 1 (which is the case for 'Zamora'). The above query tells us that there is 1 or more Zamora in the database. But if you happen to know there is exactly one (which is the case here), then you can test for any additional condition you want. If the condition is False, then the query passes, and if True, then it fails.
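To make the mechanism concrete, here is a toy Python model of the attack, with a hypothetical one-row table. Python's short-circuit `and` stands in for the assumption that the database only evaluates the overflowing expression on rows that already match the cheaper predicates:

```python
probe = lambda district: 2.0 ** (10000.01 * district)  # overflows for any district >= 1

rows = [{"lastname": "Zamora", "district": 1}]  # hypothetical table contents

def count_where(rows, pred):
    # Per-row evaluation, assuming the overflow expression is only reached
    # for rows that match the cheaper predicate first.
    return sum(1 for r in rows if pred(r))

# Condition matches nobody: the probe is never evaluated, the "query" passes.
print(count_where(rows, lambda r: r["lastname"] == "foo"
                  and probe(r["district"]) == 123.12))  # 0

# Condition matches the victim: the probe overflows and the "query" errors out.
try:
    count_where(rows, lambda r: r["lastname"] == "Zamora"
                and probe(r["district"]) == 123.12)
except OverflowError:
    print("query failed: a Zamora exists")
```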
So for instance, this query fails:
select count(*)
from accounts
where lastname = 'Zamora' and
gender = 'Female' and
2^(10000.01 * cli_district_id) = 123.12
so we know that Zamora is a female. Or, this query fails:
select count(*)
from accounts
where lastname = 'Zamora' and
birthdate = '1996-11-17' and
2^(10000.01 * cli_district_id) = 123.12
and now we know Zamora's birthday. (And note that we can do a simple search to narrow this down; for example, this query fails:
select count(*)
from accounts
where lastname = 'Zamora' and
year(birthdate) = 1996 and
2^(10000.01 * cli_district_id) = 123.12
and we know Zamora's birth year. Then we find the month, then the day.)
@cristianberneanu @obrok How would you use NULLIF to prevent, for instance, the following query from overflowing? I don't see it. (Or, for that matter, any solution other than that of testing the dangerous expressions with min and max column values....)
select count(*)
from accounts
where lastname = 'Zamora' and
year(birthdate) = 1996 and
2^(10000.01 * cli_district_id) = 123.12
First of all, NULLIF doesn't work exactly how I thought it would and using CASE seems like a better fit.
Secondly, protecting against overflow will be harder than protecting against divide by 0 or taking the square root of a negative number, as it can happen in a lot more scenarios.
In this case, the expression 2^(10000.01 * cli_district_id) has to be offloaded as case when log(2, 1E+308) >= 10000.01 * cli_district_id then 2^(10000.01 * cli_district_id) else NULL end.
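That CASE guard can be modeled in Python for illustration (DBL_MAX is the largest IEEE double, matching the 1E+308 above; this sketch uses a strict < to stay clear of the boundary, since math.log2(DBL_MAX) rounds to exactly 1024.0):

```python
import math

DBL_MAX = 1.7976931348623157e308  # largest finite IEEE double

def safe_pow2(exponent):
    # Models: CASE WHEN exponent < log(2, DBL_MAX) THEN 2 ^ exponent ELSE NULL END
    if exponent < math.log2(DBL_MAX):
        return 2.0 ** exponent
    return None  # None stands in for SQL NULL

print(safe_pow2(10.0))           # 1024.0
print(safe_pow2(10000.01 * 33))  # None instead of an overflow error
```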
Unfortunately, this is not at all limited to ^ - basically any operation can overflow or underflow in some way. For example, these two queries:
select count(*)
from accounts
where lastname = 'Zamora' and
9223372036854775807 + cli_district_id = 0
select count(*)
from accounts
where lastname = 'foo' and
9223372036854775807 + cli_district_id = 0
also reveal the existence of a Zamora.
Some alternative tricks we can try:
- Force usage of numeric types in all mathematical expressions in order to increase the range of possible values.
- Create custom math functions in the database scripting language that catch all exceptions. Something like:
CREATE OR REPLACE FUNCTION safe_pow(a double precision, b double precision)
RETURNS double precision AS $$
BEGIN
RETURN a ^ b;
EXCEPTION
WHEN numeric_value_out_of_range THEN
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
> Create custom math functions in database scripting language that catch all exceptions. Something like:
Yeah, I think this approach has a much better chance of covering us fully. One downside is that we'd need to have write privileges for databases even when analyst tables are disabled.
Two questions though.
> Force usage of numeric types in all mathematical expressions in order to increase the range of possible values.
I don't see how this helps, because no matter how big we make the range, the attacker can still find a way to exceed it.
Regarding the custom math function, I gather the way it would work is that, once a custom function returns NULL, the whole expression would behave as though the original data were NULL? Something like that?
@cristianberneanu @obrok here's another question. To your knowledge, are there any expressions that can reliably cause a measurable delay in the computation? If so, then catching exceptions won't be good enough.
For instance, the following query (run on postgres, not the cloak) takes about 1 sec if the first two expressions are True, and 100 ms otherwise. This is not a valid query in the cloak, but maybe there are valid queries that have the same conditional delay behavior?
select count(*)
from accounts
where lastname = 'Zamora' and
birthdate = '1996-11-17' and
cli_district_id IN (select distinct cli_district_id FROM transactions)
Ah, I have one. The following query takes noticeably longer if the conditions of the first join are True:
select count(*)
from
(select uid
from accounts
where lastname = 'Zamora' and
birthdate = '1996-11-16')t1
join
(select distinct uid
from transactions)t2
on t1.uid = t2.uid
What does it do to performance?
I am not sure, but I expect it to be small, under 10%, way better than emulating in any case.
Are there DBs where we can't do this?
We won't be able to do this on mongo. But it seems that the mongo functions don't raise exceptions; they either return null, MAX_VALUE, or NaN.
> @obrok is right that there is some way to do it (with 5 operators) but it isn't obvious what that is.
This expression peaks around 50, is allowed and crashes on max value: 2^(1100 - abs(age - 50))
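A quick Python check confirms the shape of that expression. The constant is lowered from 1100 to 1040 here because, in an IEEE double, 2^(1100 - abs(age - 50)) would overflow across the entire plausible age range; the exact constant depends on the numeric type in use:

```python
# Constant lowered from the thread's 1100 to 1040 so that the peak, but not
# the endpoints, overflows an IEEE double.
peaked = lambda age: 2.0 ** (1040 - abs(age - 50))

for probe in (0, 109):  # the padded min/max probes both succeed
    peaked(probe)

try:
    peaked(50)
    print("no overflow at the peak")
except OverflowError:
    print("overflow at age 50: invisible to an endpoints-only check")
```

This is exactly why probing only the (padded) min and max column values is not sufficient.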
> I don't see how this helps, because no matter how big we make the range, the attacker can still find a way to exceed it.
The attacker will need bigger and bigger constants in order to overflow some of the functions.
> Regarding the custom math function, I gather the way it would work is that, once a custom function returns NULL, the whole expression would behave as though the original data were NULL? Something like that?
Yes. In database logic, a NULL in the input produces a NULL output (usually, the exception being some special, NULL-aware functions).
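For instance, with SQLite standing in for the backend (purely as an illustration of standard SQL semantics), NULL propagation through arithmetic and NULL-skipping aggregation look like this:

```python
import sqlite3

con = sqlite3.connect(":memory:")

# NULL propagates through ordinary arithmetic ...
print(con.execute("SELECT 1 + NULL").fetchone()[0])  # None

# ... while aggregates such as AVG simply skip NULL inputs.
avg = con.execute(
    "SELECT AVG(x) FROM (SELECT 2.0 AS x UNION ALL SELECT NULL)"
).fetchone()[0]
print(avg)  # 2.0
```

So a custom function that returns NULL on overflow quietly drops the offending row from aggregates instead of aborting the whole query.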
> To your knowledge, are there any expressions that can reliably cause a measurable delay in the computation? If so, then catching exceptions won't be good enough.
We don't (yet) allow functions that take a variable amount of time to run (like sleep) and neither do we allow subqueries in the WHERE clause. Not sure what happens in your second example ... maybe PostgreSQL executes the JOIN sequentially instead of concurrently?
> Not sure what happens in your second example ... maybe PostgreSQL executes the JOIN sequentially instead of concurrently?
That seems to be the case. The attack doesn't work if I swap the order of the joins.
Note, however, that this is not true for the overflow attack. Whether the overflowing condition is first or last, the attack still works. I presume that postgres is doing something in its optimizer/planner to make this happen, but I don't know. Though it makes sense: put the simplest condition first...
> Though it makes sense: put the simplest condition first...
It's probably more related to the condition using an index or not: a complex condition might not use a column index, even if one exists.
@obrok @cristianberneanu For timing attacks, I think what we need in general is a way to force the DB to execute everything. Any thoughts on this? (Even my version of the overflow attack from above would not work if we were forcing the DB to do everything, though in the case of overflow I don't think that point matters, because there are likely other ways of exploiting overflow.)
Just fyi, if my version of the overflow attack is modified as follows, then it fails. For once, math working for us!
select count(*)
from
(select case when lastname = 'Zamora' then 1 else 0 end as con1,
case when birthdate = '1996-11-16' then 1 else 0 end as con2,
case when 2^(1000.01 * cli_district_id) = 123.12 then 1 else 0 end as con3
from accounts)t
where (con1 * con2 * con3) = 1
(Though to be clear, @cristianberneanu's solution is certainly more comprehensive and better.)
For the timing attack, I tried the following fix, which forces each join expression to return at least one row (returning a NULL value in this case). Doesn't help, because postgres is apparently smart enough to recognize that it can ignore a NULL output.
select count(*)
from
(select NULL as uid
UNION
select uid
from accounts
where lastname = 'Zamora' and
birthdate = '1996-11-16')t1
join
(select NULL as uid
UNION
select distinct uid
from transactions)t2
on t1.uid = t2.uid
This following version does work. Much more complex however because it means that the cloak needs to know of values that will fail to meet the join condition.
select count(*)
from
(select 1829384756473 as uid
UNION
select uid
from accounts
where lastname = 'Zamora' and
birthdate = '1996-11-16')t1
join
(select 48592203948 as uid
UNION
select distinct uid
from transactions)t2
on t1.uid = t2.uid
> This following version does work. Much more complex however because it means that the cloak needs to know of values that will fail to meet the join condition.
Actually, not sure it is all that complex, because one could pick a large random number and with very high probability it will be unique. And if it isn't then only one row affected...
Getting back to the overflow problem, I don't see offhand why we can't implement a checker in the cloak. It would be a piece of software that searches for a peak in the math function output (a kind of binary search) and indicates whether the value hitting that peak causes an overflow. We run every math snippet through that and reject queries for which a reasonable value causes an overflow...
> It would be a piece of software that searches for a peak in the math function output (a kind of binary search)
It's not a binary search. Binary search will only work for monotonic functions.
> It's not a binary search. Binary search will only work for monotonic functions.
That's why I said "a kind of binary search" ;)
But anyway something that finds one or a few peaks....
"a kind of binary search"
IMO that's a misleading way to talk about it. A binary search only works in very specific conditions, most functions don't satisfy them.
ok fair enough.
These will have to be addressed in some form and will delay the upcoming release.
> That's why I said "a kind of binary search" ;) But anyway something that finds one or a few peaks....
This could be solved in a general way by computing the inflection points of the graph (the points where the first order derivative is zero) for the input range [min_value, max_value]. But that would require from us the capability to compute the derivative of an arbitrary expression and solve the resulting equation. This sounds complicated. There is also the case of expressions containing 2 or more columns. I think catching all exceptions in the database might be simpler.
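Short of symbolic derivatives, a brute-force numeric probe over the padded column range is a cheap approximation. This is only a sketch: it assumes a single-column expression and a sufficiently dense grid, and a pathologically narrow peak could still slip between samples:

```python
def overflows_somewhere(expr, lo, hi, samples=10_000):
    # Dense grid probe over [lo, hi]; unlike a binary search this assumes
    # nothing about monotonicity, but it can miss very narrow peaks.
    step = (hi - lo) / samples
    for i in range(samples + 1):
        try:
            expr(lo + i * step)
        except OverflowError:
            return True
    return False

peaked = lambda age: 2.0 ** (1040 - abs(age - 50))
print(overflows_somewhere(peaked, 0, 109))  # True: the interior peak is caught
```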
> These will have to be addressed in some form and will delay the upcoming release.
Which ones? The division by 0 and/or overflow issues? Or the timing attack issue? The latter will be much harder to solve.
Note: I moved this out of the current release and into a service release immediately following it. The reasoning being that it's unclear how much work is required for these fixes to be made, and we do not want to delay the current release indefinitely.
Regarding implementing user defined functions, @obrok writes:
> Yeah, I think this approach has a much better chance of covering us fully. One downside is that we'd need to have write privileges for databases even when analyst tables are disabled.
I think it's possible to create temporary functions? Maybe these can be created without write permissions?
This issue here now describes multiple distinct attacks:
As such it no longer works as an implementation issue. Going forward, please let's create fresh issues when finding other orthogonal problems!
I'll create separate issues for the overflow, div-by-zero, and timing attacks and close this one. It looks like our best bet at the moment is to create user-defined functions in the DB, which would likely address overflow as well as div by zero, but I am creating them as separate issues all the same in case we come up with different fixes.
The following query crashes (on the default dev database - data dependent):
Error:
I think this is a potential attack vector in the same vein as division by 0 in the database is - an attacker could deduce whether a specific value matches some condition or not by setting the 64 value above so that it only overflows when computed for that specific customer_id (or a larger one, but perhaps those can be excluded with some other condition). Ping @yoid2000