share / sharedb

Realtime database backend based on Operational Transformation (OT)
Other
6.21k stars 449 forks source link

Feature suggestion: type accessors #611

Open alecgibson opened 1 year ago

alecgibson commented 1 year ago

Introduction

Mutating objects in sharedb is not necessarily the most intuitive thing to do, which is a shame for a library whose purpose is to mutate objects.

It often requires in-depth understanding of the types being used (by default json0) and how to construct ops for that type.

For users of TypeScript, op construction like this also removes type safety, and autocompletion suggestions.

For example, consider a simple json0 document:

{"foo": "bar"}

To mutate this, we need to consult the json0 docs, and then manually assemble an op:

[
  {
    "p": ["foo"],
    "od": "bar",
    "oi": "baz",
  }
]

As well as being cumbersome to write (and to read!), this can also be error-prone. If we're using TypeScript, we run in to these potentially avoidable errors:

Proposal

I'm proposing extension of the sharedb API to allow for a more fluent syntax, where consumers can (seemingly) mutate doc.data directly:

doc.data.foo = 'baz';

This update would be far easier for consumers to use, and would keep TypeScript checks.

It would also stop consumers from accidentally mutating the doc.data object (since this is now allowed!).

Implementation

For a fully-worked json0 example, please see below.

The basic concept is to use JavaScript Proxys to convert setter calls into ops.

Since conversion into ops is type-dependent, the implementation would belong in the types. We could potentially add an optional method to the type spec: accessor(), which returns a proxy object which may be "mutated" directly and instead submits ops.

If we want the type to be unaware of the Doc class (probably for the best), this may be done by having the accessor emit events that the Doc can subscribe to and then submit.

If we wanted to use super-modern JS features, we could potentially use for await...of and have the doc asynchronously iterate ops from the accessor.

Acknowledgement & error handling

Since consumers don't directly call submit() with this API, they also won't have an opportunity to register an error handler. For simple use cases, this may be "good enough" — ShareDB will still emit an error event if anything goes wrong, which can either be handled or leaved to let Node.js processes crash.

However, consumers may also want to wait for the op to be acknowledged by the server for some reason (eg UI updates; performing a dependent task; etc.). They may also want to handle errors on a per-op basis (especially if they want sensible stack traces).

Since setters aren't asynchronous, this would require addition of some mechanism to capture errors from these accessor updates.

One possible way is to just add something like:

class Doc {
  lastOpAcknowledged(callback) {
    if (!this.pendingOps.length) return callback();
    pendingOps[pendingOps.length - 1].callbacks.push(callback);
  }
}

Since ops will only be submitted in the next event loop tick, we can synchronously queue a number of changes, and then synchronously attach a callback:

doc.data.foo= 'baz';
doc.data.updated = Date.now();
doc.lastOpAcknowledged((error) => {
  // Handle errors, etc.
});

We could alternatively add some sort of helper method that will actively batch ops and attach a callback. Consumer code might look like:

doc.submitBatch(
  function() {
    this.data.foo = 'baz';
    this.data.updated = Date.now();
  },
  function(error) {
    // Handle errors, etc.
  },
);

Disadvantages

json0 example

Below is a basic example of what this might look like for json0.

The main complications come about from handling arrays (we manually override a number of array methods, taken from Vue's reactivity docs. There are also some corner cases around creating sparse arrays, which json0 doesn't support.

export function json0Accessor(doc) {
  const proxyCache = new WeakMap();
  const pathCache = new WeakMap();

  const ARRAY_OVERRIDES = {
    pop: (target) => function() {
      const [popped] = splice(target)(-1, 1);
      return popped;
    },
    push: (target) => function(...items) {
      const newLength = target.length + items.length;
      splice(target)(target.length, 0, ...items);
      return newLength;
    },
    unshift: (target) => function(...items) {
      const newLength = target.length + items.length;
      splice(target)(0, 0, ...items);
      return newLength;
    },
    shift: (target) => function() {
      const [shifted] = splice(target)(0, 1);
      return shifted;
    },
    splice,
    sort: () => () => {
      // Left as an exercise to the reader to implement using `lm`
      throw new Error("Unsupported method 'sort'");
    },
    reverse: () => () => {
      // Left as an exercise to the reader to implement using `lm`
      throw new Error("Unsupported method 'reverse'");
    },
  };

  return new Proxy(doc, {
    get(target, key) {
      if (key === 'data') return doc.data && deepProxy(doc.data);
      return target[key];
    },
  });

  function deepProxy(obj) {
    return new Proxy(obj, {
      get(target, key) {
        const value = target[key];

        if (Array.isArray(target) && key in ARRAY_OVERRIDES) {
          return ARRAY_OVERRIDES[key](target);
        }

        if (!value || typeof value !== 'object') return value;
        const parentPath = pathCache.get(target) || [];
        const path = parentPath.concat(normalizedKey(target, key));
        pathCache.set(value, path);
        if (proxyCache.has(value)) return proxyCache.get(value);
        const proxy = deepProxy(value);
        proxyCache.set(value, proxy);
        return proxy;
      },
      set(target, key, value) {
        key = normalizedKey(target, key);
        const p = pathCache.get(target).concat(key);
        let op = {p, od: target[key], oi: value};
        if (Array.isArray(target)) {
          // Setting length directly can create a sparse Array, which we can't do with json0
          if (key === 'length') throw new Error('Cannot set Array.length with writeableJSON');
          else if (typeof key !== 'number') throw new Error('Cannot set non-numeric keys on Arrays');
          // Setting keys outside our length also creates a sparse index
          if (key >= target.length) throw new Error('Cannot set index outside Array bounds');
          op = {p, ld: target[key], li: value};
        }
        submitOp(op);
        return true;
      },
      deleteProperty(target, key: PropertyKey) {
        // Using the delete keyword with an Array creates a sparse Array, which we can't do with json0
        if (Array.isArray(target)) throw new Error('Cannot use delete with arrays');
        key = normalizedKey(target, key);
        const p = pathCache.get(target).concat(key);
        submitOp({p, od: target[key]});
        return true;
      },
    });
  }

  function submitOp(op) {
    if (doc.type.uri !== 'http://sharejs.org/types/JSONv0') throw new Error(`Cannot with type '${doc.type.uri}'`);
    doc.submitOp(op);
  }

  function normalizedKey(target, key) {
    if (!Array.isArray(target)) return key;
    const normalized = +key.toString();
    return Number.isInteger(normalized) ? normalized : key;
  }

  function splice(target) {
    return function(start, deleteCount, ...items) {
      start = +start;
      deleteCount = arguments.length > 1 ? +deleteCount : target.length;
      if (start < 0) start = target.length + start;
      start = Math.min(start, target.length);
      start = Math.max(start, 0);

      const path = pathCache.get(target);
      const p = path.concat(start);
      const deleted = target.slice(start, start + deleteCount);
      for (const ld of deleted) submitOp({p, ld});
      let i = start;
      for (const item of items) {
        submitOp({p: path.concat(i), li: item});
        i++;
      }
      return deleted;
    };
  }
}
ericyhwang commented 1 year ago

From discussion today:

Qquanwei commented 1 year ago

I think for simple scenarios, proxy can avoid using json0 to construct op object, but for some cases, there will be problems. For example, if I have an array and I want to swap the position of 1,3, maybe use

tmp = data[3];
data[3] = data[1];
data[1] = tmp;

In this case we cannot identify whether to swap the order of the array or to assign values ​​to two elements.

alecgibson commented 1 year ago

@Qquanwei yes I agree; I've stated that in the "Disadvantages" section. I think for things like this, consumers would be expected to use the "traditional" submitOp() machinery they want specific semantics.