nu11ptr / flexstr

A flexible, simple to use, immutable, clone-efficient String replacement for Rust
Apache License 2.0

flexstr

A flexible, simple to use, immutable, clone-efficient String replacement for Rust. It unifies literals, inlined, and heap allocated strings into a single type.

Overview

Rust is great, but its String type is optimized as a mutable string buffer, not for typical string use cases. Most string use cases don't modify their contents, often need to copy strings around as if they were cheap like integers, typically concatenate instead of modify, and often end up being cloned with identical contents. Additionally, String can't wrap a string literal without an additional allocation and copy, which forces a choice between efficiency and storing two different types.

I believe Rust needs a new string type to unify usage of both literals and allocated strings for typical string use cases. This crate includes a new string type that is optimized for those use cases, while retaining the usage simplicity of String.

Example

String constants are easily wrapped into the unified string type. String data is automatically inlined when possible; otherwise it is allocated on the heap.

See documentation or Usage section for more examples.

use flexstr::{local_str, LocalStr, ToLocalStr};

fn main() {
  // Use `local_str` macro to wrap literals as compile-time constants
  const STATIC_STR: LocalStr = local_str!("This will not allocate or copy");
  assert!(STATIC_STR.is_static());

  // Strings up to 22 bytes (on 64-bit) will be inlined automatically
  // (demo only, use macro or `from_static` for literals as above)
  let inline_str = "inlined".to_local_str();
  assert!(inline_str.is_inline());

  // When a string is too long to be wrapped/inlined, it will heap allocate
  // (demo only, use macro or `from_static` for literals as above)
  let rc_str = "This is too long to be inlined".to_local_str();
  assert!(rc_str.is_heap());
}

Installation

Optional features:

[dependencies.flexstr]
version = "0.9"
features = ["fast_format", "fp_convert", "int_convert", "serde"]

How Does It Work?

Internally, FlexStr uses a union with these variants:

The type automatically chooses the best storage and allows you to use them interchangeably as a single string type.
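
The idea of one type choosing between three kinds of storage can be sketched with the standard library alone. This is a conceptual model, not the crate's implementation: it uses a plain enum rather than a union, and `SketchStr`, `from_ref`, and `kind` are hypothetical names introduced for illustration. The 22-byte inline capacity is the figure quoted in the example above for 64-bit targets.

```rust
use std::rc::Rc;

// Inline capacity quoted in the README example for 64-bit targets.
const INLINE_CAP: usize = 22;

// Hypothetical stand-in for the three storage choices; not the crate's real type.
#[derive(Clone, Debug)]
enum SketchStr {
    Static(&'static str),
    Inline { len: u8, buf: [u8; INLINE_CAP] },
    Heap(Rc<str>),
}

impl SketchStr {
    // Wrap a literal: only the reference is stored, nothing is copied.
    fn from_static(s: &'static str) -> Self {
        SketchStr::Static(s)
    }

    // Copy a borrowed string: inline if it fits, otherwise heap allocate.
    fn from_ref(s: &str) -> Self {
        if s.len() <= INLINE_CAP {
            let mut buf = [0u8; INLINE_CAP];
            buf[..s.len()].copy_from_slice(s.as_bytes());
            SketchStr::Inline { len: s.len() as u8, buf }
        } else {
            SketchStr::Heap(Rc::from(s))
        }
    }

    // All three variants can be read as a plain `&str`.
    fn as_str(&self) -> &str {
        match self {
            SketchStr::Static(s) => *s,
            SketchStr::Inline { len, buf } => {
                // The bytes were copied from a valid `&str`, so this cannot fail.
                std::str::from_utf8(&buf[..*len as usize]).expect("valid UTF-8")
            }
            SketchStr::Heap(rc) => &**rc,
        }
    }

    // Names the storage in use, for demonstration only.
    fn kind(&self) -> &'static str {
        match self {
            SketchStr::Static(_) => "static",
            SketchStr::Inline { .. } => "inline",
            SketchStr::Heap(_) => "heap",
        }
    }
}

fn main() {
    let a = SketchStr::from_static("This will not allocate or copy");
    let b = SketchStr::from_ref("inlined");
    let c = SketchStr::from_ref("This is too long to be inlined");
    for s in [&a, &b, &c] {
        println!("{:>6}: {}", s.kind(), s.as_str());
    }
}
```

Callers only ever see `as_str`, which is what lets the three kinds of storage be used interchangeably.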

Features

Types

NOTE: Both types are identical in handling both literals and inline strings. The only difference occurs when a heap allocation is required.
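
The usage examples describe the heap case of `LocalStr` as a `str` wrapped in `Rc`, and that of `SharedStr` as a `str` wrapped in `Arc`. The practical difference between those two wrappers can be shown with the standard library alone. This sketch does not use the crate, and `assert_send_sync` and `len_on_other_thread` are hypothetical helper names.

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

// Compile-time check: only callable with types that are Send + Sync.
fn assert_send_sync<T: ?Sized + Send + Sync>() {}

// Moves an `Arc<str>` clone into another thread and returns its length.
fn len_on_other_thread(s: &Arc<str>) -> usize {
    let s = Arc::clone(s);
    thread::spawn(move || s.len()).join().expect("thread panicked")
}

fn main() {
    let local: Rc<str> = Rc::from("A bit too long to be inlined!!!");
    // Going from `Rc` to `Arc` copies the bytes into a new allocation.
    let shared: Arc<str> = Arc::from(&*local);

    assert_send_sync::<Arc<str>>();
    // assert_send_sync::<Rc<str>>(); // would not compile: `Rc` is not `Send`

    assert_eq!(len_on_other_thread(&shared), local.len());
}
```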

Usage

Hello World

use flexstr::local_str;

fn main() {
  // From literal - no copying or allocation
  let world = local_str!("world!");

  println!("Hello {world}");
}

Creation Scenarios

use flexstr::{local_str, LocalStr, IntoSharedStr, IntoLocalStr, ToLocalStr};

fn main() {
  // From literal - no runtime, all compile-time
  const literal: LocalStr = local_str!("literal");

  // From borrowed String - copied into inline string
  let owned = "inlined".to_string();
  let str_to_inlined = owned.to_local_str();

  // From borrowed String - copied into `str` wrapped in `Rc`
  let owned = "A bit too long to be inlined!!!".to_string();
  let str_to_wrapped = owned.to_local_str();

  // From String - copied into inline string (`String` storage released)
  let inlined = "inlined".to_string().into_local_str();

  // From String - `str` wrapped in `Rc` (`String` storage released)
  let counted = "A bit too long to be inlined!!!".to_string().into_local_str();

  // *** If you want a Send/Sync type you need `SharedStr` instead ***

  // From LocalStr wrapped literal - no copying or allocation
  let literal2 = literal.into_shared_str();

  // From LocalStr inlined string - no allocation
  let inlined = inlined.into_shared_str();

  // From LocalStr `Rc` wrapped `str` - copies into `str` wrapped in `Arc`
  let counted = counted.into_shared_str();
}

Passing FlexStr to Conditional Ownership Functions

Functions that only sometimes need to take ownership of a string argument have always been awkward in Rust, but they are easy with FlexStr because multiple ownership is cheap. By passing &LocalStr instead of &str, you retain the option of very fast multiple ownership.

use flexstr::{local_str, IntoLocalStr, LocalStr};

struct MyStruct {
  s: LocalStr
}

impl MyStruct {
  fn to_own_or_not_to_own(s: &LocalStr) -> Self {
    let s = if s == "own me" {
      // Since a wrapped literal, no copy or allocation
      s.clone()
    } else {
      // Wrapped literal - no copy or allocation
      local_str!("own me")
    };

    Self { s }
  }
}

fn main() {
  // Wrapped literals - compile time constant
  const S: LocalStr = local_str!("borrow me");
  const S2: LocalStr = local_str!("own me");

  let struct1 = MyStruct::to_own_or_not_to_own(&S);
  let struct2 = MyStruct::to_own_or_not_to_own(&S2);

  assert_eq!(S2, struct1.s);
  assert_eq!(S2, struct2.s);
}
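
Why the parameter type matters can be seen with a plain `Rc<str>` from the standard library. The following is a sketch of the general principle, not of the crate's internals, and `Holder`, `keep_from_str`, and `keep_from_rc` are hypothetical names.

```rust
use std::rc::Rc;

struct Holder {
    s: Rc<str>,
}

impl Holder {
    // Borrowed `&str`: keeping the string means allocating and copying.
    fn keep_from_str(s: &str) -> Self {
        Holder { s: Rc::from(s) }
    }

    // Borrowed `&Rc<str>`: keeping the string only bumps a reference count.
    fn keep_from_rc(s: &Rc<str>) -> Self {
        Holder { s: Rc::clone(s) }
    }
}

fn main() {
    let original: Rc<str> = Rc::from("a string that lives on the heap");

    let copied = Holder::keep_from_str(&original);
    let shared = Holder::keep_from_rc(&original);

    // The copy has its own buffer; the clone points at the original one.
    assert!(!Rc::ptr_eq(&original, &copied.s));
    assert!(Rc::ptr_eq(&original, &shared.s));
    assert_eq!(Rc::strong_count(&original), 2);
}
```

Once the argument has been narrowed to `&str`, the callee no longer has access to the reference count and can only copy.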

Make Your Own String Type

All you need to do is pick a storage type. The storage type must implement Deref<Target = str>, From<&str>, and Clone. Pretty much all smart pointers do this already.

NOTE:

Custom concrete types must specify a heap type whose size is exactly two machine words (16 bytes on 64-bit, 8 bytes on 32-bit). Any other size results in a runtime panic on string creation.

use flexstr::{FlexStrBase, Repeat, ToFlex};

type BoxStr = FlexStrBase<Box<str>>;

fn main() {
  // Any need for a heap string will now be allocated in a `Box` instead of `Rc`
  // However, the below uses static and inline storage...because we can!
  let my_str = BoxStr::from_static("cool!").repeat_n(3);
  assert_eq!(my_str, "cool!cool!cool!");
}
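
Whether a candidate storage type meets the trait bounds and the two-machine-word size rule described above can be checked with the standard library before plugging it in. This is a sketch that does not touch the crate, and `round_trip` and `is_two_words` are hypothetical helper names.

```rust
use std::mem::size_of;
use std::ops::Deref;
use std::rc::Rc;
use std::sync::Arc;

// Only compiles for types with the bounds listed above:
// Deref<Target = str>, From<&str>, and Clone.
fn round_trip<H>(s: &str) -> String
where
    H: Deref<Target = str> + for<'a> From<&'a str> + Clone,
{
    let heap = H::from(s);
    let copy = heap.clone();
    copy.deref().to_owned()
}

// True when `H` occupies exactly two machine words.
fn is_two_words<H>() -> bool {
    size_of::<H>() == 2 * size_of::<usize>()
}

fn main() {
    assert_eq!(round_trip::<Box<str>>("cool!"), "cool!");
    assert_eq!(round_trip::<Rc<str>>("cool!"), "cool!");
    assert_eq!(round_trip::<Arc<str>>("cool!"), "cool!");

    // Pointers to `str` are fat pointers: an address plus a length.
    assert!(is_two_words::<Box<str>>());
    assert!(is_two_words::<Rc<str>>());
    assert!(is_two_words::<Arc<str>>());

    // `String` meets the trait bounds, but it also stores a capacity,
    // so it is three words and fails the size rule.
    assert_eq!(round_trip::<String>("cool!"), "cool!");
    assert!(!is_two_words::<String>());
}
```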

Performance Characteristics

Benchmarks

In general, creating inline or static strings is fast, while creating heap strings is a tiny bit slower than creating a String. Clones are MUCH faster and don't allocate or copy. Other operations (repeat, additions, etc.) tend to have about the same performance, with some nuance depending on string size.
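
The clone claim can be explored informally with the standard library by timing `String` clones against `Rc<str>` clones. This is a rough sketch, not the crate's benchmark suite: `time_clones` is a hypothetical helper, the 1024-byte string and iteration count are arbitrary, and the printed timings will vary by machine and build profile.

```rust
use std::hint::black_box;
use std::rc::Rc;
use std::time::{Duration, Instant};

// Times `iters` calls of `f`, passing each result through `black_box`
// so the optimizer cannot drop the work.
fn time_clones<T>(iters: u32, f: impl Fn() -> T) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        black_box(f());
    }
    start.elapsed()
}

fn main() {
    let text = "x".repeat(1024);
    let owned: String = text.clone();
    let counted: Rc<str> = Rc::from(text.as_str());

    // Each `String` clone allocates and copies 1024 bytes.
    let string_time = time_clones(100_000, || owned.clone());
    // Each `Rc` clone only increments a reference count.
    let rc_time = time_clones(100_000, || Rc::clone(&counted));

    println!("String clone:  {string_time:?}");
    println!("Rc<str> clone: {rc_time:?}");
}
```

Run it with `--release` for numbers that are closer to real-world behavior.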

Full benchmarks

Downsides

There is no free lunch:

Status

This is currently beta quality and still needs testing. The API may well change, but semantic versioning will be followed.

License

This project is licensed optionally under either: