tinyplex / tinybase

The reactive data store for local‑first apps.
https://tinybase.org
MIT License
3.57k stars 69 forks source link

Clarity around getPersisted returning undefined #161

Open wattroll opened 1 month ago

wattroll commented 1 month ago

Describe the bug

Hi. Thank you for a great project. I am working on another persister and stumbled onto a small issue.

The type of createCustomPersister's getPersisted parameter permits returning Promise<undefined>, but during the load() the type of returned value is coerced (in setContentOrChanges) skipping the check for undefined which results in e.g. store.setContent failing like this:

Got TypeError: undefined is not iterable (cannot read property Symbol(Symbol.iterator))
    at setContent (...../tinybase/5.0.3/index.js:2934:22)
    at setContentOrChanges (...../tinybase/5.0.3/index.js:1223:28)
    .....
wattroll commented 1 month ago

Expected behavior

I think that persister.load() shouldn't throw when getPersisted returns undefined since returning undefined is permitted by a type and seems to me like a reasonable result of getPersisted when there was no content to load.

The documentation says:

getPersisted -- An asynchronous function which will fetch content from the persistence layer (or undefined if not present).

It's not entirely clear to me (based on documentation alone) whether the expected behavior upon returned undefined is that the store will be emptied akin to returning [{},{}] or just left unchanged. I suppose it's handy if returning undefined means "the data is not available, don't change anything" otherwise the only way to communicate this is by rejecting the promise which might be not ideal in cases when it's completely normal that i.e. remote persister isn't able to pull the remote data most of the time and there is no need to be logging an "oh no we are ofline" error regularly.

wattroll commented 1 month ago

I've also been thinking that since it's common for persisters to return result of JSON.parse<any> without putting it through any validation, it would make sense to have tests that ensure that createCustomPersister combined with setContent/applyChanges/setMergeableContent/applyMergeableChanges behaves reasonably (i.e. fails descriptively or quietly calls onIgnoredError) when getPersisted() returns unexpectedly shaped data. For instance in the remote persister it shouldn't be a big surprise that the loadUrl might suddenly start returning a valid JSON that doesn't fit the type of store's Content, e.g. { "error": "this aren't the jsons you are looking for" }.

jamesgpearce commented 1 month ago

Great analysis. I'm a bit surprised that undefined throws (the Store should ignore invalid content silently) but I'm sure you're right. Once I repro, I'll make sure that is the case, since yes, undefined kind of implies either the underlying file/store/source doesn't exist and the initialContent can be taken. Seems I need to dive back into this corner of the project!

jamesgpearce commented 1 month ago

I can't get it to throw, per se... it appears as an ignored error:

test.only('does not error on getPersister returning undefined', async () => {
  const store = createStore();
  store.setTables({t1: {r1: {c1: 1}}});
  const persister = createCustomPersister(
    store,
    async () => {
      console.log('undefined!!!');
      return undefined;
    },
    async () => {},
    () => 0,
    () => 0,
    console.warn,
  );
  await persister.load();
  expect(store.getTables()).toEqual({t1: {r1: {c1: 1}}});
});

Gives:

    console.warn
      TypeError: undefined is not iterable (cannot read property Symbol(Symbol.iterator))
          at setContent (/Users/james/code/github/tinyplex/tinybase/dist/index.js:2934:22)
          at setContentOrChanges (/Users/james/code/github/tinyplex/tinybase/dist/index.js:1223:27)
          at setContentOrChanges (/Users/james/code/github/tinyplex/tinybase/dist/index.js:1232:11)
          at run (/Users/james/code/github/tinyplex/tinybase/dist/index.js:1207:11)
          at schedule (/Users/james/code/github/tinyplex/tinybase/dist/index.js:1304:5)
          at Object.load (/Users/james/code/github/tinyplex/tinybase/dist/index.js:1230:7)
          at Object.<anonymous> (/Users/james/code/github/tinyplex/tinybase/test/unit/persisters/persisters.test.ts:351:3)
jamesgpearce commented 1 month ago

I can provoke similar behavior, however, by firing the persister listener with invalid content (because of the way the various parts of the persister are wrapped in try/catches):

test.only('autoLoad silent if persister listener returns invalid', async () => {
  let triggerListener = () => {};
  const store = createStore();
  store.setTables({t1: {r1: {c1: 1}}});
  const persister = createCustomPersister(
    store,
    async () => 1 as any,
    async () => {},
    (listener) => (triggerListener = listener),
    () => 0,
  );
  await persister.startAutoLoad();
  triggerListener(1 as any);
  expect(store.getTables()).toEqual({t1: {r1: {c1: 1}}});
});

Gives:

TypeError: number 1 is not iterable (cannot read property Symbol(Symbol.iterator))
    at setContent (/Users/james/code/github/tinyplex/tinybase/dist/index.js:1720:22)
    at setContentOrChanges (/Users/james/code/github/tinyplex/tinybase/dist/index.js:756:215)
    at /Users/james/code/github/tinyplex/tinybase/dist/index.js:785:11
    at Object.<anonymous> (/Users/james/code/github/tinyplex/tinybase/test/unit/persisters/persisters.test.ts:723:3

So will land both of theses test cases as well as more tolerance on the setContent method.

jamesgpearce commented 1 month ago

Tests in https://github.com/tinyplex/tinybase/commit/fbafcf19df29af69d73e1cf2a9eac407fc962010 and changes to the way things are handled in https://github.com/tinyplex/tinybase/commit/9f4d9eb211959acfa0787600dc197bbd577598df - both released in v5.0.5. Let me know if that fits the bill. If not, feel free to re-open with more suggestions!

wattroll commented 1 month ago

I can't get it to throw, per se... it appears as an ignored error:

You are totally right. I had a better look and confirmed that the actual throw of an error was coming from my own test harness (which throws the ignored errors when running my persister to expose potential issues in it). Sorry about not reproducing it properly. Apparently I was too quick to draw a wrong conclusion upon seeing an unintentional-looking "TypeError: undefined is not iterable" coupled with type coercion at its source.

wattroll commented 1 month ago

Tests in https://github.com/tinyplex/tinybase/commit/fbafcf19df29af69d73e1cf2a9eac407fc962010 and changes to the way things are handled in https://github.com/tinyplex/tinybase/commit/9f4d9eb211959acfa0787600dc197bbd577598df - both released in v5.0.5.

Looks great. Thanks! I believe on developer-end it's a lot nicer to see suggestive "Error: Content is not an array" in the log than a dreaded "TypeError: undefined is not iterable".

wattroll commented 1 month ago

I wanted to understand the new behavior properly to make sure my mental model is correct, if you have a moment for this, could you confirm that I got it right:

  1. If getContent returns invalid Content that is not an Array (e.g. undefined, null, { error: "Uuuupsie" }, 42):
    • validateContent returns false
      • store is left unchanged or set to initialContent (if available),
      • onIgnoredError is invoked with error message `Content is not an array ${content}`.
  2. If getContent returns invalid Content that is an Array (e.g. [], ["hey", "there"], [[],[]])
    • if validateTables return false
      • store's tables are left unchanged
      • InvalidCellListener is invoked as listener(store, undefined, undefined, undefined, [undefined])
    • if validateValues return false
      • store's values are left unchanged
      • InvalidValueListener is invoked as listener(store, undefined, [undefined])
wattroll commented 1 month ago

I don't have a strong opinion on what's the best behavior for when getContent returns a value that wouldn't have passed a type-checker for type Content = [Tables, Values]. However undefined is explicitly permitted by getContent's type signature and is documented to mean that content is not present. Because of that it doesn't seem right to me that it gets treated the same as other completely invalid values.

wattroll commented 1 month ago

Here is a concrete use-case when getPersisted wants to say "I've checked with remote source and our data is up-to-date, keep it as-it-is".

The current version of a remote persister keeps track of the last ETag and issues HEAD requests comparing old ETag with current (which is the best way to be compatible with every HTTP server that provides ETags). A variation of it could shoot the real request right-away with If-None-Match header containing the last ETag (which won't work well with servers that don't support that header, but otherwise will avoid the extra round-trip). Something like:

const getPersisted = async (): Promise<Content | undefined> => {
  const headers = lastEtag === null ? [] : { 'if-none-match': lastEtag };
  const response = await fetch(loadUrl, { headers });
  if (response.status != 304) {
    lastEtag = getETag(response);
    return jsonParse(await response.text());
  }
};

In this case it's really undesired for undefined to be reported as an error, as it indicates that everything is alright and we are in sync with the remote source. Still, using undefined as a sentinel for "leave the store unchanged" comes with it's own issues (especially in untyped javascript with returns inside condition branches), so I really don't want to be advocating for it's use. But it seems to me that it's useful for getPersisted to be able to communicate that somehow.

jamesgpearce commented 1 month ago

Yeah, that's a good point. Still some refinement to be done here (at least with making the semantics of undefined clearer). Your mental model on the whole is correct (except in 1, the validateContent is never reached... the persister checks it's an array for itself to decide whether to throw/log or not - since the Store is always silent by design)

wattroll commented 1 month ago

You are right, it seems that this is mostly about clarifying the meaning of undefined. When opening this issue I haven't yet realized how for example initialContent parameter ties into this. Now looking around the code and reading the rest of documentation it seems that my earlier intuition was wrong and undefined was never intended to be a sentinel value to mean "content is currently not available" or "content hasn't changed since last load".

So returning undefined is appropriate when persister is pointed at the resource that is known to contain no content. The purpose of returning undefined instead of throwing an error could be to indicate that initialContent can safely be used to initialize that resource (without a risk of overwriting the existing, but temporarily not available content). I suppose it also could be expected that a lawful persister pointed at a well-functioning resource will not return undefined after a successful setPersisted.

Is this correct more or less? I tried to think of some examples where getPersisted() could reasonably return undefined instead of throwing or returning empty content: