Discussion about the possibility of Heap-Spray in JerryScript:

OverDriveGain commented 5 years ago

Hello everyone,

Heap spraying is a widely used payload-delivery technique, and most JS engines nowadays integrate special mechanisms to mitigate it, such as Nozzle, BuBBle, etc. I recently came to the conclusion that JerryScript also has some kind of mitigation against heap spraying. Can you confirm this and tell me which mechanisms JerryScript uses to mitigate it?

If you can refer me to any documentation on this issue, I would also be very thankful.

Thank you so much.

akosthekiss commented 5 years ago

I recently came to the conclusion that JerryScript also has some kind of mitigation against heap spraying

Could you please elaborate a bit on what has led you to this conclusion?

OverDriveGain commented 5 years ago

I tried heap spraying in JerryScript and analyzed JerryScript's heap memory at run time. After inserting the hexadecimal representation of the shellcode NOP (which is 0x90), I noticed that JerryScript automatically added the value 0xc2 after each byte. In other words, I ran the following script: var example = unescape("%90%90%90") and the memory ended up looking like this: 0x90 0xc2 0x90 0xc2 0x90 0xc2

akosthekiss commented 5 years ago

What you are seeing in the memory is the UTF-8/CESU-8 encoding of U+0090.

OverDriveGain commented 5 years ago

UTF-8 and CESU-8 are different, so which encoding do you mean exactly? Unfortunately, I couldn't find any resource saying that the encoding of U+0090 is (0x90 0xc2) in either UTF-8 or CESU-8.

I am writing an encoded value into memory, so it should be written as it is, especially since I am using the unescape() function. And if your statement were true, then the following program: var example = unescape("%11%11%11") shouldn't produce this memory layout: 0x11 0x11 0x11

I thank you so much for your collaboration, and I would gladly take your reply as a confirmation of my conclusion.

akosthekiss commented 5 years ago

First, please refer to the following links

The Compatibility Encoding Scheme for UTF-16: 8-Bit (CESU-8) is a variant of UTF-8 that is described in Unicode Technical Report #26. A Unicode code point from the Basic Multilingual Plane (BMP), i.e. a code point in the range U+0000 to U+FFFF, is encoded in the same way as in UTF-8.

CESU-8 and UTF-8 are variable-length encodings:

code points: U+0000..U+007F encoded as 0xxxxxxx
code points: U+0080..U+07FF encoded as 110xxxxx, 10xxxxxx

I.e., for code points 0..127 CESU-8/UTF-8 is just like 7-bit ASCII. U+0011 gets encoded as 0x11.

As for U+0090: hex 90 is bin 10010000. Upper two bits (10) (plus padding zeros) go into 1st byte of encoding, lower six bits (010000) go into 2nd byte of encoding. This gives 110:00010, 10:010000 (colons added by me to emphasize boundary between fixed header bits and payload of encoding). In hex that gives 0xc2 0x90.
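To make the bit arithmetic above concrete, here is a minimal sketch in plain JavaScript. It covers only the two ranges listed above (U+0000..U+07FF), and `encodeUtf8Short` and `toHex` are names made up for this illustration, not JerryScript APIs or the engine's actual implementation.

```javascript
// Hypothetical helper: encode one code point in U+0000..U+07FF
// following the two bit patterns listed above.
function encodeUtf8Short(codePoint) {
  if (codePoint < 0 || codePoint > 0x7ff) {
    throw new RangeError("this sketch only handles U+0000..U+07FF");
  }
  if (codePoint <= 0x7f) {
    // 0xxxxxxx: identical to 7-bit ASCII
    return [codePoint];
  }
  // 110xxxxx: header bits 110 plus the upper five payload bits
  var first = 0xc0 | (codePoint >> 6);
  // 10xxxxxx: header bits 10 plus the lower six payload bits
  var second = 0x80 | (codePoint & 0x3f);
  return [first, second];
}

// Hypothetical helper: format a byte array as "0x.. 0x.." for display.
function toHex(bytes) {
  return bytes.map(function (b) {
    return "0x" + (b < 16 ? "0" : "") + b.toString(16);
  }).join(" ");
}

console.log(toHex(encodeUtf8Short(0x11))); // 0x11
console.log(toHex(encodeUtf8Short(0x90))); // 0xc2 0x90
```

The one-byte result for U+0011 and the two-byte result for U+0090 are both ordinary string encoding, so the extra 0xc2 byte does not by itself indicate a heap-spray mitigation.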

Please, take my reply, but I cannot confirm your conclusion.

OverDriveGain commented 5 years ago

Oh OK. Thank you very much again for your kind answer, it was very helpful. Your replies were also informative about the Unicode representation. I appreciate it. Kind regards

jerryscript-project / jerryscript

Discussion about the possibility of Heap-Spray in JerryScript: #2677