Restricting shared memory reads and writes to `unsafe` contexts

rbuckton commented 3 months ago

During the last Shared Structs call we discussed concerns raised about there being insufficient guardrails in place to prevent users from inadvertently introducing data races by interacting with shared data structures outside of some mechanism to synchronize access.

Non-thread-safe (e.g., "unsafe") Code Contexts

As a means to address these concerns, I propose that we limit read/write access to shared data to code running inside of a clearly labeled "unsafe" context, such as an unsafe {} block:

shared struct Counter {
  value = 0;
}

// normal JS code, outside of an "unsafe" context 
const ctr = new Counter(); // error? (TBD, allocates shared memory)
ctr.value = 1; // error (writes shared memory)
ctr.value; // error (reads shared memory)

// "unsafe" JS code
unsafe {
  const ctr = new Counter(); // ok
  ctr.value = 1; // ok
  ctr.value; // ok
}

How is `unsafe` a "guardrail"?

The unsafe keyword is a clear signal of intent that a developer is choosing to work with shared memory multithreaded code. The presence of an unsafe block is an indication to code reviewers that special care must be taken during review. It also is acts as a syntactic marker that future tooling (linters, type checkers, etc.) could use to identify data races.

The `unsafe {}` block and Block Semantics

An unsafe {} block is otherwise treated the same as a normal Block. Its only distinction is that it explicitly labels code within the block as potentially containing non-thread-safe (e.g., "unsafe") code. The general expectation is that any thread safety concerns should be addressed by the developer as control flow exits the unsafe block. For example, you could utilize using to synchronize access to a shared struct via a lock:

function incrementCounter(cnt, mutex) {
  unsafe {
    using lck = new UniqueLock(mutex);
    cnt.value++;
  }
}

Here, when the control enters the unsafe block, we allocate a lock against the provided mutex via a using declaration. As control exits the unsafe block, the lock tracked by using is released.

What does `unsafe` do?

The unsafe keyword is a syntactic marker that applies to lexically scoped reads and writes of the fields of a shared struct instance. Within an unsafe block, any lexically scoped accesses are permitted, even if they are nested within another function declared in the same block. This special lexical context shares some surface level similarities with the lexical scoping rules for private names, or the runtime semantics of "use strict".

Since unsafe is lexically scoped, it does not carry over to the invocation of functions declared outside of an unsafe context:

function increment(ctr) {
  ctr.value++; // error due to illegal read/write of `ctr.value` outside of `unsafe`
}
unsafe {
  const ctr = new Counter();
  increment(ctr);
}

Calling into, and out of, `unsafe` code

Thread-safe code may execute unsafe code without restriction, and unsafe code may do likewise. As unsafe already indicates a transition boundary between thread-safe and unsafe code, there is no need to declare all calling code unsafe as you might need to do for async/await. The unsafe keyword itself does not entail any implicit synchronization or coordination as that would be in opposition to our performance goals. Instead, the onus is on developers to be cognizant of thread safety concerns when they define an unsafe block. As such, a developer can choose the coordination mechanism that best suits the needs of their application, be that a Mutex, a lock-free concurrent deque, etc.

Other `unsafe` Contexts

It should be noted that this proposal is not limited to merely unsafe {}, but also proposes the addition of unsafe as a contextual keyword in a number of other contexts so as to improve the developer experience. As such I also propose the following as possible unsafe contexts:

Blocks — unsafe {} (as discussed above)
Functions — unsafe function f() {} (or function f() unsafe {}, etc.)
Arrow Functions — unsafe () => { }
Async & Generator Functions — unsafe async function f() {}, unsafe function* g() {}, etc.
Methods — unsafe m() {}
Getters/Setters — unsafe get value() { ... }
Constructors — unsafe constructor() {} (though a shared struct constructor might implicitly be unsafe)
Field Initializers — unsafe x = ...;
static blocks — unsafe static { }
class/struct/shared struct bodies — unsafe shared struct S { ... }
Variable initializers — unsafe const x = ... (to avoid needing an unsafe IIFE, splitting into a let, etc.)

There may also be other contexts we may wish to consider in the future as well.

Examples

Atomic values

/**
 * All public operations are guaranteed to be sequentially consistent across multiple threads/agents.
 * Roughly equivalent to `std::atomic<Integral>`, less assertions omitted for brevity
 */
export shared struct AtomicValue {
  #box;
  unsafe constructor(value) {
    this.#box = new Box();
    this.#box.value = value;
  }
  unsafe get value() {
    return Atomics.load(this.#box, "value");
  }
  unsafe set value(v) {
    Atomics.store(this.#box, "value", v);
  }
  unsafe exchange(value) {
    return Atomics.exchange(this.#box, "value", value);
  }
  unsafe compareExchange(expected, replacement) {
    return Atomics.compareExchange(this.#box, "value", expected, replacement);
  }
  unsafe add(value) {
    const box = this.#box;
    // use CAS and spin-waiting to remain lock-free and avoid contention
    let spinCounter = 0;
    let current = Atomics.load(box, "value");
    while (current !== Atomics.compareExchange(box, "value", (currentValue + value) | 0)) {
      Atomics.pause(spinCounter);
      spinCounter = (spinCounter + 1) | 0;
      current = Atomics.load(box, "value");
    }
    return current;
  }
  unsafe sub(value) {
    const box = this.#box;
    // use CAS and spin-waiting to remain lock-free and avoid contention
    let spinCounter = 0;
    let current = Atomics.load(box, "value");
    while (current !== Atomics.compareExchange(box, "value", (currentValue - value) | 0)) {
      Atomics.pause(spinCounter);
      spinCounter = (spinCounter + 1) | 0;
      current = Atomics.load(box, "value");
    }
    return current;
  }
  increment() { return this.add(1); } // no need for `unsafe`
  decrement() { return this.sub(1); } // no need for `unsafe`
}

// NOTE: Indirect access through `Box` is necessary for now as there is no way to
//       pass a private name to the `Atomics` functions.
shared struct Box { value; }

Worker Coordination

import { Worker, isMainThread, workerData } from "node:worker_threads";

shared struct Data {
  ready;
  processed;
}

function worker_thread() {
  const [m, cv, data] = workerData;

  unsafe {
    // take lock on mutex
    using lck = new Atomics.UniqueLock(m);

    // release lock and wait until `main()` sends data
    cv.wait(lck, () => data.ready === 1); // NOTE: `data.ready` is ok to read due to `unsafe` scope
    // we once again own the lock

    console.log("worker thread is processing data...");

    // hand control back to main()
    data.processed = 1;
    console.log("worker thread is done");

  } // lock is released

  // if `main()` is paused in a `cv.wait()` waiting for us to finish, wake them up
  cv.notify(1);
}

unsafe function main() {
  const m = new Atomics.Mutex();
  const cv = new Atomics.Condition();
  const data = new Data();

  const worker = new Worker(__filename, {
    workerData: [m, cv, data],
    stdout: true,
  });
  worker.stdout.pipe(process.stdout);

  // send data to worker
  {
    // pause until we take the lock. in the event of a race with `worker_thread()`, we will
    // gain control when the worker releases the lock in the call to `cv.wait()`.
    using void = new Atomics.UniqueLock(m);

    data.ready = 1; // ok because entire function is `unsafe`
    console.log("main is ready");

  } // lock is released

  // If `worker_thread()` is paused in `cv.wait()`, it will wake up and try to retake the lock.
  cv.notify(1);

  {
    // pause until we take the lock. in the event of a race with `worker_thread()`, we might not
    // regain control until the worker is done
    using lck = new Atomics.UniqueLock(m);

    // in the event the worker hasn't started yet, release the lock and wait for the worker to signal
    // completion
    cv.wait(lck, () => data.processed === 1);
    // we once again own the lock

    console.log("main is done");
  } // lock is released
}

if (isMainThread) {
  main();
}
else {
  worker_thread();
}

syg commented 3 months ago

I like the lexical semantics of this a lot.

Bikeshedding-wise, unsafe is a pretty broad descriptor. On the one hand, it seems reasonable to imagine lumping other features we deem unsafe into this safe/unsafe divide in the future. On the other hand, it is also reasonable to consider thread safety to be unique and have a syntactic marker narrowly scoped to allowing possibly thread unsafe code, in which case a different word than unsafe would better capture the intent.

iainireland commented 3 months ago

Writing down comments from the working session:

It's not clear that the addition of additional unsafe contexts (beyond the bare unsafe { ... } block syntax) is necessary. There seems to be general consensus that the block syntax is a good idea, but the majority of the participants in the call felt that additional syntax could wait until we had more developer feedback indicating that the user experience of function foo() { unsafe { ... } } was enough worse than unsafe function foo() { ... } to justify adding additional complexity.

Given that function colouring is an explicit non-goal of this proposal, I would further argue that unsafe function foo() {...} is potentially misleading. Existing modifiers to function declarations (async and *) affect the return type of the function in a way that matters to the caller.

This proposal is inspired by Rust's unsafe keyword. But note that in Rust, unsafe function foo() {} defines a function that can only be called from an unsafe context. That enables Rust programmers to precisely specify which code is responsible for maintaining safety invariants (although see this for an explanation of why Rust unsafety actually pollutes code out to the module boundary), but validating it at runtime would impose additional requirements on implementations.

For example, this version of unsafe functions enables idioms like unsafe function foo_alreadyHoldingLock() {...} that impose safety requirements on the caller.

As a very different language from Rust, it's unclear whether JS would benefit from a similar kind of function colouring. By restricting the initial proposal to only support unsafe blocks, we leave flexibility to decide based on user experience whether we want function colouring or not.

For these reasons, I think we should remove "Other unsafe contexts" from this proposal.

syg commented 1 week ago

This has also landed in the spec draft, so closing. Feel free to open up more specific issues regarding unsafe blocks.

tc39 / proposal-structs