The Capsule Hash Trie Collections Library
BSD 2-Clause "Simplified" License

Synopsis

Capsule aims to become a full-fledged (immutable) collections library for Java 11+ that is built solely around persistent tries. The library is designed both for standalone use and for embedding in domain-specific languages. Capsule still has to undergo some incubation before it can ship as a well-rounded collections library. Nevertheless, the code is stable and performance is solid. Feel free to use it and let us know about your experiences!

Getting Started

Binary builds of Capsule are deployed in the usethesource repository. If you use Maven for dependency management, add this repository location to your pom.xml file:

<repositories>
  <repository>
    <id>usethesource</id>
    <url>https://releases.usethesource.io/maven/</url>
  </repository>
</repositories>

You also have to declare Capsule as a dependency.

To obtain the latest release for Java 11+, insert the following snippet in your pom.xml file:

<dependency>
  <groupId>io.usethesource</groupId>
  <artifactId>capsule</artifactId>
  <version>0.7.1</version>
</dependency>

To obtain the latest available version for Java 8, insert the following snippet in your pom.xml file:

<dependency>
  <groupId>io.usethesource</groupId>
  <artifactId>capsule</artifactId>
  <version>0.6.4</version>
</dependency>

Snippets for other build tools and dependency management systems may vary slightly.

Exploring Capsule

Build the library and spawn a Java shell to interactively explore Capsule, e.g.:

$ ./gradlew clean build
$ jshell --class-path ./build/libs/capsule-*-SNAPSHOT.jar

|  Welcome to JShell
|  For an introduction type: /help intro

jshell> var set = io.usethesource.capsule.Set.Immutable.of(1, 2);
set ==> {1, 2}

Background: Efficient Immutable Data Structures on the JVM

The standard libraries of recent Java Virtual Machine languages, such as Clojure or Scala, contain scalable and well-performing immutable collection data structures that are implemented as Hash-Array Mapped Tries (HAMTs). HAMTs already feature efficient lookup, insert, and delete operations. However, due to their tree-based nature, their memory footprints and the runtime performance of iteration and equality checking lag behind those of array-based counterparts.
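
To make the HAMT idea concrete, the following is a minimal sketch of how a trie node can locate an entry using a 32-bit bitmap and a population count. It is an illustration only and is not taken from Capsule's source: the class name HamtIndexing, the method names, and the choice of 5 hash bits per trie level are assumptions made for this example.

```java
// Illustrative sketch only; names and constants are assumptions, not Capsule's API.
final class HamtIndexing {
    // Assumed branching factor of 32: each trie level consumes 5 bits of the hash.
    static final int BIT_PARTITION_SIZE = 5;
    static final int BIT_PARTITION_MASK = 0b11111;

    // Extracts the 5-bit hash fragment that selects a logical slot at the given level.
    static int mask(int keyHash, int shift) {
        return (keyHash >>> shift) & BIT_PARTITION_MASK;
    }

    // Turns a logical slot number (0..31) into a single-bit position in the bitmap.
    static int bitpos(int mask) {
        return 1 << mask;
    }

    // Counts the occupied slots below bitpos, which is the index into the
    // node's compact array (only occupied slots are stored).
    static int index(int bitmap, int bitpos) {
        return Integer.bitCount(bitmap & (bitpos - 1));
    }

    public static void main(String[] args) {
        int bitmap = 0b1010;                // logical slots 1 and 3 are occupied
        int hash = (3 << 5) | 1;            // fragment 1 at level 0, fragment 3 at level 1
        int slot = mask(hash, BIT_PARTITION_SIZE); // fragment for level 1
        System.out.println(index(bitmap, bitpos(slot)));
    }
}
```

Because only occupied slots are stored, a node with two entries needs an array of length two rather than 32; the bitmap recovers the mapping from logical slot to physical index.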

We introduce CHAMP (Compressed Hash-Array Mapped Prefix-tree), an evolutionary improvement over HAMTs. The new design increases the overall performance of immutable sets and maps. Furthermore, the resulting general-purpose design increases cache locality and features a canonical representation.
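
As a rough illustration of where the cache-locality benefit can come from, the sketch below models a set node that keeps two bitmaps: one marking slots that hold inline elements and one marking slots that hold sub-nodes. Elements are stored from the front of the array and sub-nodes from the back. This layout follows the general description of CHAMP in the literature; the class ChampNodeSketch, its fields, and its contains method are hypothetical names for this example and do not reflect Capsule's actual classes. The sketch supports lookup only and does not handle full hash collisions.

```java
// Illustrative sketch only; not Capsule's implementation.
final class ChampNodeSketch {
    final int dataMap;      // bit set: the logical slot holds an inline element
    final int nodeMap;      // bit set: the logical slot holds a sub-node
    final Object[] content; // elements grouped at the front, sub-nodes at the back

    ChampNodeSketch(int dataMap, int nodeMap, Object... content) {
        this.dataMap = dataMap;
        this.nodeMap = nodeMap;
        this.content = content;
    }

    // Assumed: 5 hash bits per level, as in a 32-way branching trie.
    static int mask(int hash, int shift) {
        return (hash >>> shift) & 0b11111;
    }

    boolean contains(Object key, int hash, int shift) {
        int bitpos = 1 << mask(hash, shift);
        if ((dataMap & bitpos) != 0) {
            // Inline element: index counted from the front of the array.
            int i = Integer.bitCount(dataMap & (bitpos - 1));
            return content[i].equals(key);
        }
        if ((nodeMap & bitpos) != 0) {
            // Sub-node: index counted from the back of the array.
            int i = content.length - 1 - Integer.bitCount(nodeMap & (bitpos - 1));
            return ((ChampNodeSketch) content[i]).contains(key, hash, shift + 5);
        }
        return false; // neither bitmap has the bit set: key is absent
    }

    public static void main(String[] args) {
        // Keys 1 and 33 share hash fragment 1 at level 0, so they live in a sub-node.
        ChampNodeSketch sub = new ChampNodeSketch(0b11, 0, 1, 33);
        ChampNodeSketch root = new ChampNodeSketch(0b1100, 0b10, 2, 3, sub);
        System.out.println(root.contains(33, Integer.hashCode(33), 0));
        System.out.println(root.contains(4, Integer.hashCode(4), 0));
    }
}
```

Grouping all inline elements contiguously means iteration can scan them without inspecting the type of each slot, which is one plausible reading of the cache-locality claim above.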

References and Further Readings

Talks

Publications