Document security expectations

koczkatamas commented 7 years ago

Currently I am not sure the Kaitai-generated code won't cause any security issues, so as a first step we should create a warning about this.

I have some ideas:

an attacker can create a specially crafted .ksy file which can execute more privileged commands than the user would expect
- more concrete example: if we support direct WebIDE links to user-created ksys, such a ksy may call an opaque class called Function which actually calls JavaScript's Function and executes the argument (this probably won't work as it would expect a string), or use some built-in property, e.g. constructor, to access a security-critical property (e.g. "".constructor.constructor("alert(1)")).
an attacker can cause buffer overflow or Denial of Service attacks by supplying a specially crafted payload to our generated parser
- this may happen by causing an integer overflow in an expression and then abusing some parsing logic
- or by allocating too large a buffer, which makes a server that uses Kaitai non-responsive in the long run (something like this: https://github.com/weidai11/cryptopp/issues/346)

I have not yet evaluated whether we are susceptible to the problems listed above. That's why we should warn our users that we do not recommend executing code generated from untrusted ksys, or running it on untrusted inputs, as a determined attacker may be able to cause harm.

Of course, later we should evaluate, to the best of our knowledge, whether these issues can actually arise, and fix them if we can.

KOLANICH commented 7 years ago

an attacker can create a specifically crafted .ksy file which can execute more privileged commands than the user would expect more concrete example: if we'll support direct WebIDE links to user created ksys, that ksy may call an opaque class called Function which actually calls Javascript's Function and executes the argument (this probably won't work as it would expect string), or use some built-in property eg. constructor to access security-critical property (eg. "".constructor.constructor("alert(1)")).

To mitigate this, we need to make sure that ksc-generated code cannot overwrite anything outside its scope and cannot call arbitrary functions. Functions used for processing must be registered ahead of time in a predefined place, and classes should be put into a namespace.
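
A minimal sketch of the "registered ahead of time" idea, in Python for illustration. The names `register_processor`, `call_processor` and `xor_ff` are hypothetical and not part of any Kaitai Struct runtime; the point is only that a name coming from a ksy is resolved against an explicit allowlist rather than against globals or built-ins.

```python
from typing import Callable, Dict

# Hypothetical registry: the only functions generated code may call by name.
_PROCESSORS: Dict[str, Callable[[bytes], bytes]] = {}


def register_processor(name: str) -> Callable:
    """Decorator that adds a function to the allowlist under `name`."""
    def decorator(func: Callable[[bytes], bytes]) -> Callable[[bytes], bytes]:
        _PROCESSORS[name] = func
        return func
    return decorator


def call_processor(name: str, data: bytes) -> bytes:
    """Resolve `name` in the allowlist only, never in globals or builtins."""
    try:
        func = _PROCESSORS[name]
    except KeyError:
        raise PermissionError(f"processor {name!r} is not registered") from None
    return func(data)


@register_processor("xor_ff")
def xor_ff(data: bytes) -> bytes:
    # Example processor: XOR every byte with 0xFF.
    return bytes(b ^ 0xFF for b in data)
```

With this shape, a ksy that names an unregistered function such as `eval` fails with `PermissionError` instead of reaching the host language's built-ins.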

an attacker can cause buffer-overflow or Denial of Service attacks by supplying specifically crafted payload to our generated parser

For now I'm not sure this is possible without a vulnerability in the runtime. KSC-generated code uses a stream-like API, so we need to make sure the KS runtimes check bounds.
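
To illustrate what "runtimes check bounds" means here, a minimal sketch of a bounds-checked stream. `BoundedStream` is a hypothetical class written for this example, not the API of an actual KS runtime.

```python
class BoundedStream:
    """Hypothetical minimal stream over an in-memory buffer."""

    def __init__(self, data: bytes) -> None:
        self._data = data
        self._pos = 0

    def remaining(self) -> int:
        """Number of bytes left between the cursor and the end."""
        return len(self._data) - self._pos

    def read_bytes(self, n: int) -> bytes:
        # Reject negative sizes and reads past the end instead of
        # silently returning a short or out-of-range slice.
        if n < 0:
            raise ValueError(f"negative read size: {n}")
        if n > self.remaining():
            raise EOFError(
                f"requested {n} bytes, only {self.remaining()} available"
            )
        chunk = self._data[self._pos:self._pos + n]
        self._pos += n
        return chunk

    def read_u1(self) -> int:
        """Read one unsigned byte through the same checked path."""
        return self.read_bytes(1)[0]
```

Every typed read goes through `read_bytes`, so there is a single place where the bounds check lives.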

this may happen by causing integer overflow in expression and then abusing some parsing logic

IMHO integer overflow may be possible, but I'm not sure it is exploitable if unsigned types are used for offsets and counters. For example, if we have a variable-length array with a terminator and the runtime stores its length somehow, the integer can be overflowed in two ways: by making the array large enough to overflow the length directly, or by making it large enough that later arithmetic on the length overflows or underflows. But ksy uses stream functions, which are assumed to check bounds, so this is not exploitable during parsing. The problem is that external code may assume that no overflow or underflow occurred and use the values without further checks. So we should detect overflows and throw exceptions. Some compilers have this feature; for others we'll need some asm or intrinsics.
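
A sketch of "detect overflows and throw exceptions" for 32-bit unsigned values. Python integers never wrap, so the range is checked explicitly here; in C or C++ the same role would be played by compiler intrinsics. The helper names are hypothetical.

```python
U32_MAX = 0xFFFFFFFF


def _require_u32(value: int, operation: str) -> int:
    """Raise OverflowError if `value` does not fit in an unsigned 32-bit int."""
    if not 0 <= value <= U32_MAX:
        raise OverflowError(f"u32 {operation} out of range: {value}")
    return value


def checked_add_u32(a: int, b: int) -> int:
    return _require_u32(a + b, "addition")


def checked_sub_u32(a: int, b: int) -> int:
    # A negative result is the underflow case.
    return _require_u32(a - b, "subtraction")


def checked_mul_u32(a: int, b: int) -> int:
    return _require_u32(a * b, "multiplication")
```

For example, an offset computed as `checked_mul_u32(count, item_size)` raises instead of wrapping around to a small value that external code might then trust.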

or allocating too big buffer which makes a server which uses Kaitai non-responsive in the long run (something like this: https://github.com/weidai11/cryptopp/issues/346)

lazy parsing + memory-mapped files #133 #65?
some API to set resource consumption limits
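
One possible shape for such a limits API, as a sketch only. `ParseLimits`, `AllocationBudget` and `ResourceLimitExceeded` are hypothetical names, and the default numbers are arbitrary; the idea is that a parser reserves a size taken from the input before allocating a buffer of that size.

```python
from dataclasses import dataclass


class ResourceLimitExceeded(Exception):
    """Raised when a parse would exceed its configured limits."""


@dataclass(frozen=True)
class ParseLimits:
    max_alloc_bytes: int = 16 * 1024 * 1024  # largest single buffer
    max_total_bytes: int = 64 * 1024 * 1024  # budget for the whole parse


class AllocationBudget:
    """Tracks how much memory one parse has asked for so far."""

    def __init__(self, limits: ParseLimits) -> None:
        self._limits = limits
        self._used = 0

    @property
    def used(self) -> int:
        return self._used

    def reserve(self, size: int) -> None:
        """Call before allocating `size` bytes declared by the input."""
        if size < 0:
            raise ValueError(f"negative size: {size}")
        if size > self._limits.max_alloc_bytes:
            raise ResourceLimitExceeded(f"single allocation of {size} bytes")
        if self._used + size > self._limits.max_total_bytes:
            raise ResourceLimitExceeded(
                f"total would reach {self._used + size} bytes"
            )
        self._used += size
```

A rejected reservation leaves the budget unchanged, so the caller can report the error without corrupting the accounting.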

Another problem is that the data can contain indexes, which can be out of bounds. KSC uses the stream API, but the app will use memory. So the KS runtime should provide:

some routines to check offsets and pointers,
some checked index types bound to collection instance to distinguish between checked and unchecked indexes,
some collection types accepting only checked indexes of the same instance.

Checked indexes should be auto-constructed from unchecked ones (with a check) if a preprocessor directive that enables this behavior is present. It should be possible to disable this because it has a performance cost: at early stages we don't want to worry about that, but at some point we'll want to eliminate these checks.
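
A rough sketch of checked indexes bound to a collection instance. `CheckedIndex` and `CheckedList` are hypothetical names; in a compiled language the distinction would be enforced by the type system, while here it is enforced at run time. The list is immutable after construction, so an index stays valid once checked.

```python
from typing import Generic, List, Sequence, TypeVar

T = TypeVar("T")


class CheckedIndex:
    """An index already validated against one specific collection."""

    __slots__ = ("owner", "value")

    def __init__(self, owner: "CheckedList", value: int) -> None:
        self.owner = owner
        self.value = value


class CheckedList(Generic[T]):
    """Collection that only accepts checked indexes created by itself."""

    def __init__(self, items: Sequence[T]) -> None:
        self._items: List[T] = list(items)

    def check(self, raw: int) -> CheckedIndex:
        """Turn an unchecked index from the data into a checked one."""
        if not 0 <= raw < len(self._items):
            raise IndexError(f"index {raw} out of range 0..{len(self._items) - 1}")
        return CheckedIndex(self, raw)

    def __getitem__(self, index: CheckedIndex) -> T:
        if not isinstance(index, CheckedIndex):
            raise TypeError("unchecked index; call check() first")
        if index.owner is not self:
            raise ValueError("index was checked against a different collection")
        return self._items[index.value]
```

Usage would look like `names[names.check(raw_index)]`, where `raw_index` comes straight from the parsed data.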

And I think we need some fuzzing to check for possible problems.
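
As a starting point, a very small mutation-based fuzz loop could look like the sketch below. It is not a replacement for a real fuzzer; `mutate` and `fuzz` are hypothetical helpers, and `parse` stands for any generated parser entry point that takes bytes.

```python
import random
from typing import Callable, List, Tuple, Type


def mutate(seed: bytes, rng: random.Random, max_flips: int = 4) -> bytes:
    """Return a copy of `seed` with a few random bytes overwritten."""
    data = bytearray(seed)
    if not data:
        return bytes(data)
    for _ in range(rng.randint(1, max_flips)):
        data[rng.randrange(len(data))] = rng.randrange(256)
    return bytes(data)


def fuzz(
    parse: Callable[[bytes], object],
    seed: bytes,
    expected_errors: Tuple[Type[BaseException], ...],
    iterations: int = 1000,
    rng_seed: int = 0,
) -> List[Tuple[bytes, BaseException]]:
    """Run `parse` on mutated inputs and collect unexpected exceptions."""
    rng = random.Random(rng_seed)
    crashes: List[Tuple[bytes, BaseException]] = []
    for _ in range(iterations):
        sample = mutate(seed, rng)
        try:
            parse(sample)
        except expected_errors:
            continue  # clean rejection of bad input is the desired outcome
        except Exception as exc:
            crashes.append((sample, exc))
    return crashes
```

Anything returned by `fuzz` is an input that made the parser fail in a way other than a clean, expected rejection, and is worth a closer look.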

GreyCat commented 7 years ago

an attacker can create a specifically crafted .ksy file which can execute more privileged commands than the user would expect

This is a totally separate issue, let's not mix them up. Regular end-users are not concerned by this: when one can craft a .ksy, compile it and run the resulting code, basically anything can happen, and I don't think it's a good idea to invent roadblocks here. ksc generates text, not some ready-made syntax tree, so it's not protected against injections of any kind, but that's totally ok with me. I strongly think that it's a bad idea to introduce any "security" features here, like deliberately disabling calls to external code, etc., as this is very hard to maintain and it is very hard to prove that it provides a proper level of isolation. If someone doesn't trust a ksy (and thus the generated code), it needs to be sandboxed by the environment (i.e. OS sandboxing, web worker sandboxing, whatever), period.

Regular users who generate parsers are concerned by this about as much as by "can gcc be used to compile arbitrary code?" Yes, it can, but that's not the user's security problem.

The only people who might be concerned are tool developers, i.e. us. And here we need to provide the normal level of security expected from web applications, i.e. protection against XSS, CSRF, and whatever else can be abused in a serverless web app.

I have not yet evaluated whether we are susceptible to the problems listed above. That's why we should warn our users that we do not recommend executing code generated from untrusted ksys, or running it on untrusted inputs, as a determined attacker may be able to cause harm.

C'mon, this is code, so isn't it common sense not to execute code (even generated code) that you don't trust in your valuable environment? Do we really need to go to these basics?

koczkatamas commented 7 years ago

This is indeed a different issue; maybe it was not a good idea to use the same GitHub issue for both.

And yes, you are right as long as we have a distinct code generation step in place and a manual "run this code" step.

Currently this is not true for the WebIDE, as it runs the ksy directly. I presume the situation is the same with ksv. I think users will blindly trust a ksy if they use it with the WebIDE or ksv, so I think the warning is justified. (It's like you don't expect from a Word document that the sender will be able to run code on your machine.)

GreyCat commented 7 years ago

It's like you don't expect from a Word document that the sender will be able to run code on your machine.

True, that's a valid argument. Then, I guess, we could add it in the form of an answer to a question like "how do I ensure that a random ksy I got off the net is not harmful?"