tinyplex / tinybase

The reactive data store for local‑first apps.
https://tinybase.org
MIT License
3.57k stars 69 forks source link

Add errors #6

Closed AHBruns closed 2 years ago

AHBruns commented 2 years ago

First off, thank you for a great library. Using a client-side relational database is, in my opinion, the clear future for large frontend applications. However, after trying Tinybase out there is a key feature that I find myself really missing: errors.

Tinybase has cell types and default values. That is, I can enforce my data never has the wrong type, and I can substitute a default value when applicable, but I can't throw an error when my app tries to add invalid data to my store. Everything fails silently. This is fine in some circumstances, but is very much not in others.

I'd like a way to say, this cell is required. This should throw something akin to a Postgres NotNullConstraint error. This also goes for foreign key constraints, check constraints, etc.

In my opinion there should be no requirement to use these constraints nor even a schema, but they should be available.

For context, my current work around is to set a unique default value for fields which I require which is used no-where else in my application. I then use listeners to watch for this value and, if they find it throw an error. This is very brittle. I'd much prefer a built-in solution.

AHBruns commented 2 years ago

This can also be done by not using a cell schema and instead enforcing a cell schema manually via listeners, but again, this is brittle and very verbose.

Actually, upon further trial and error this has all sorts of issues where the listener just never gets called. For instance, setting a row to an empty object will effectively do nothing, so no listeners will be called. As I see it, the best option in user space is to enforce all writes go through a custom interface which manually calls all relevant "constraint listeners" which is a bit annoying since it requires a good bit of code + the logic around determining what listeners to call is non-trivial.

jamesgpearce commented 2 years ago

I understand the use case perfectly and would like to figure out something here. I don't want to start throwing JS exceptions necessarily, but a lightweight signal back from the setter that something didn't happen as you might have expected... I get it. Let me think about this one?

AHBruns commented 2 years ago

Had some ideas bouncing around in my head, and thought I'd leave them here.

The query engine could be responsible for enforcing constraints which would be declared via schemas. This way, when using the SQL-esq query engine API, you'd get relational database-esq errors, but when using the low-level "raw" API, you'd access the "raw" store and circumvent any constraints. I'd further propose a "strict" mode for stores which limits them to the query engine API to ensure no constraints are violated during their lifetime. Maybe only allow constraints in strict mode too?

This comes down to "when I'm using my store like a database, I want it to act like a database" and "when I'm using my store like an object, I want it to act like an object". Obviously the underlying store is identical, but the mindset is different.

AHBruns commented 2 years ago

Also, instead of errors, you could always return a result type where the left value was a query's result, and the right value was a constraint violation which prevented the query from succeeding. Could even make use of fp-ts's Either for easy interop with those already using fp-ts.

jamesgpearce commented 2 years ago

My philosophy has been validate on write (on a premise that reading 'valid' data needs to be the fast path), so I'd like to see if we can do that here.

I can't really see a way to make this a 'peer' framework like the Metrics or Indexes objects - it'll have to be baked into the Store setters. That's fine I think... I am hoping it's just a few hundred bytes at most.

My best shower thoughts on this have been:

  1. The listener convention: an InvalidDataListener callback (registered with addInvalidDataListener) so you can do whatever you want when you know invalid data has been dropped. These will probably need to fire before the mutator listeners, and then again afterwards (in case the mutators themselves generate new invalid data). Should still be fast even with two new event phases though.
  2. An optional callback of some sort on the setters: eg .setTables('whoops', (err) => {...}). That seems a little more clumsy, especially for the setters that already have lots of parameters. It'll be nice to know something went wrong exactly at the time you tried to set it though.
  3. Both. The callback should be an InvalidDataListener
  4. ~Neither.~

Thoughts? I might try out 1 anyway and see what the overhead is.

jamesgpearce commented 2 years ago

@AHBruns, I'm going to push ahead with option 1.

It adds 282 bytes (<5%) to the prod .js and 82 bytes (3%) to the .js.gz - which seems like excellent value for money :)

The API will be:

type InvalidCellListener = (
  store: Store,
  tableId: Id,
  rowId: Id,
  cellId: Id,
  invalidCell: any,
) => void;

addInvalidCellListener(
  tableId: IdOrNull,
  rowId: IdOrNull,
  cellId: IdOrNull,
  listener: InvalidCellListener,
  mutator?: boolean,
): Id;

It's very similar to CellListener/addCellListener and acts exactly as you expect. I'll PR this one from the errors branch so you can kick the tires and check it works as you need before publishing.

AHBruns commented 2 years ago

@AHBruns, I'm going to push ahead with option 1.

It adds 282 bytes (<5%) to the prod .js and 82 bytes (3%) to the .js.gz - which seems like excellent value for money :)

The API will be:


type InvalidCellListener = (

  store: Store,

  tableId: Id,

  rowId: Id,

  cellId: Id,

  invalidCell: any,

) => void;

addInvalidCellListener(

  tableId: IdOrNull,

  rowId: IdOrNull,

  cellId: IdOrNull,

  listener: InvalidCellListener,

  mutator?: boolean,

): Id;

It's very similar to CellListener/addCellListener and acts exactly as you expect. I'll PR this one from the errors branch so you can kick the tires and check it works as you need before publishing.

Will try it out this evening. Glad to see such quick movement.

jamesgpearce commented 2 years ago

give me a few hours - I want to get a bunch of test cases in there

jamesgpearce commented 2 years ago

@AHBruns give 1.1.0-beta.0 a try and see if that works.

AHBruns commented 2 years ago

@AHBruns give 1.1.0-beta.0 a try and see if that works.

Going to bed now. Will check and report back tomorrow morning.

jamesgpearce commented 2 years ago

Note I changed the API slightly. The listener gets an array of failed values, in case you tried multiple times throughout a transaction. Docs and test cases are in there too in case you need to see how to use it.

AHBruns commented 2 years ago

So, it's good, but does seem to have issues with a few edge cases, for instance the following will not call the listener:

import { createStore } from "tinybase";

const store = createStore().setSchema({
  pets: {
    species: { type: "string" },
    sold: { type: "boolean", default: false },
  },
});

store.addTablesListener((store) => {
  console.log(store.getJson());
});

store.addInvalidCellListener(
  "pets",
  null,
  null,
  (store, tableId, rowId, cellId, invalidCells) => {
    console.log(tableId, rowId, cellId, invalidCells);
  }
);

store.addRow("pets", {});
AHBruns commented 2 years ago

It also doesn't handle "invalid" schema data how I'd expect. For instance, this

import { createStore } from "tinybase";

const store = createStore().setSchema({
  pets: {
    species: { type: "string" },
    sold: { type: "boolean", default: 1 },
  },
});

store.addTablesListener((store) => {
  console.log(store.getJson());
});

store.addInvalidCellListener(
  "pets",
  null,
  null,
  (store, tableId, rowId, cellId, invalidCells) => {
    console.log(tableId, rowId, cellId, invalidCells);
  }
);

store.addRow("pets", { species: "" });

Doesn't call the invalid listener, but also doesn't use the provided default because it's the wrong type. This is probably less important since this is only an issue when the schema API is misused. Still, I think an invalid cell listener should be triggered here.

AHBruns commented 2 years ago

I do think that it's a good basic API though. I would say that the interaction with transactions could be improved though. Right now they transaction doesn't care if an invalid cell listener is called. I'd expect, an invalid cell listener call to stop and rollback a transaction.

For instance, this:

import { createStore } from "tinybase";

const store = createStore().setSchema({
  pets: {
    species: { type: "string" },
    sold: { type: "boolean", default: true },
  },
});

store.addInvalidCellListener(
  "pets",
  null,
  null,
  (store, tableId, rowId, cellId, invalidCells) => {
    console.log(tableId, rowId, cellId, invalidCells);
  }
);

store.transaction(() => {
  store.addRow("pets", { species: "" });
  store.addRow("pets", { species: 1 });
  store.addRow("pets", { species: true });
});

console.log(store.getJson());

shouldn't result in this, imo.

alexbruns:tinybase-beta-testing/ $ node store.mjs                                                                                                                                                                                                                                         
pets undefined species [ 1, true ]
{"pets":{"0":{"species":"","sold":true},"1":{"sold":true},"2":{"sold":true}}}
jamesgpearce commented 2 years ago

Nice. I think both those edge cases should be made to work.

For the transactions part, I am open to having a parameter on the public .transaction method that makes it rollback if there's an issue. This may be a little costlier (and trickier to implement lol) so not sure that it should be true by default.

jamesgpearce commented 2 years ago
store.addRow("pets", {});

For this one, would you be ok if the InvalidCellListener got called with an "undefined" cellId? Your first example console output would return something like:

"pets", "undefined", "undefined", [undefined]

It's not perfect that undefined is a string here for cellId, but it already is for rowId when it's an .addRow that failed, because the validation occurs before a new rowId is assigned.

jamesgpearce commented 2 years ago

2f598741017a23a8c1d9295e13cb046861e48a5e covers the first edge case and adds a bunch of unit tests to demonstrate.

AHBruns commented 2 years ago

I think that'd be fine, though I'm not sure I follow why the cellId is "undefined" here. Presumably, even without a rowId, the listener is being called on a per-cell basis which means it must have a cellId. I would have expected:

"pets", "undefined", "species", [undefined]

or even better

"pets", undefined, "species", [undefined]

If this isn't possible, I think I'd prefer a more uniform solution with an invalidRow and invalidTable listener for these cases, but I think that brings up more edge cases about which listeners fire when. Also, a few more bytes to the bundle size.

brielov commented 2 years ago

Also, instead of errors, you could always return a result type where the left value was a query's result, and the right value was a constraint violation which prevented the query from succeeding. Could even make use of fp-ts's Either for easy interop with those already using fp-ts.

I have a library which is a port of rust's Result and Option type in < 800 bytes. It can be an option.

jamesgpearce commented 2 years ago
"pets", undefined, "species", [undefined]

Yeah that's what you'll get. About to crank another release so you can make sure I caught all the edge cases.

Bonus: the events also fire if you change the schema after data has been added, so you can see what effect changing the schema has had.

jamesgpearce commented 2 years ago

It also doesn't handle "invalid" schema data how I'd expect.

This won't work as you requested because I forgot that actually the schema itself gets normalized/corrected when it's invalid. So by the time it comes to validate cells it never knew there was ever a default of the wrong type.

jamesgpearce commented 2 years ago
store.addRow("pets", {});

For this one, just to clarify what happens:.

I'm proposing the fourth case here because you did want a row, with the defaults in it, and you didn't actually provide any invalid data in getting it.

Lots to document to make these combinations clear!

jamesgpearce commented 2 years ago

Take v1.1.0-beta.1 for a spin?

AHBruns commented 2 years ago

Will take a look tonight or tomorrow morning at the latest. Also, thinking about the 4 cases you outlined, I think, in the 4th case it should not throw only when all cells have a default value since these two should act identically, imo:

store.addRow("pets", {});

and

store.addRow("pets", { species: undefined, sold: undefined });

The later throws an invalid event which seems reasonably to me since undefined is not of type string. The former doesn't throw. The argument for them being different is that lack of presence in an object isn't the same and as being set to undefined in JS (which is dumb), but I think carrying that into the store's semantics would be a needlessly confusing design decision. Imo, undefined === missing.

jamesgpearce commented 2 years ago

I agree that explicitly setting undefined is a gnarly one, since:

Object.keys({b: undefined});
// ['b']

And then what to 'throw'? There are infinite fields you could have put in there as undefined in value ;) And the operation did work in some sense.

Maybe something like an error for all the non-defaulted cells you didn't set? There are other legitimate times you might set partial rows that you know will be defaulted. So is that heuristic only for the empty object case? This is the best I've got so far though.

I am a little wary of going down the route of invalid Row and invalid Table handling, for perf and size cost. And would they exactly solve this semantic challenge?

We may just have to take a stance at some point and document these cases well.

jamesgpearce commented 2 years ago

PS just to say it's super fun riffing on these design decisions with another human, in contrast to the base APIs where I had to be my own devil's advocate! :)

AHBruns commented 2 years ago

Sorry, for the delay. Been busy with work. Thinking more about the semantics I'd want from this, I think it comes down to the schema. I think any cell with an invalid cell listener attached to it should conform to these invariants. Anytime an invariant is broken, at least 1 invalid event should be fired:

Right now, I think the first is at issue because of this rule "If there was a schema with a default value in any cell of the pets table, it will not throw an invalid event". This implies that the validity of one cell is dependent on the others in its row which seems wrong to me. I think every cell should be validated separately and the determination of when to validate should come down to "assuming the table was entirely valid before this update, which cells could have been made invalid". This means, creating rows should validate all cells of that new row, as defined in the schema, and updating a row should validate only the cells being updated.

Anyways, wrt the new release, it seems to conform with all the rules you laid out and I wasn't able to find any unexpected behavior. So, in that regard, I think it's ready to go.

AHBruns commented 2 years ago

Also, I agree, it's fun to mull over design decisions with someone else, and I'd like to apologize for not being more helpful in terms of actual development. I've been busy with work, but I'd like to help with actual code when things settle down a bit. Feel free to ping me if you have any simple issues you'd like to get off your plate that would be good introductions to the codebase.

jamesgpearce commented 2 years ago

OK, I can see an efficient way to fire events when optional cells are missing in a schematized row. I would like to change the implication from 'invalid' in the API though - since not adding an optional cell is kind of... valid.

Any objections if I rename this to CellWarningListener? That takes the edge off the semantics and coves a multitude of reasons why you would want to be alerted.

jamesgpearce commented 2 years ago

Alright, this is where things ended up:


const store = createStore().setSchema({
  pets: {
    species: {type: 'string'},
    color: {type: 'string', default: 'unknown'},
  },
});

const listenerId = store.addInvalidCellListener(
  null,
  null,
  null,
  (store, tableId, rowId, cellId) => {
    console.log(
      `Invalid ${cellId} cell in ${rowId} row in ${tableId} table`,
    );
  },
);

store.setRow('sales', 'fido', {price: 5});
// -> 'Invalid price cell in fido row in sales table'
// The listener is called, because the sales Table is not in the schema

store.setRow('pets', 'fido', {color: 'brown'});
// -> 'Invalid species cell in fido row in pets table'
// The listener is called, because species is missing and not defaulted...
console.log(store.getRow('pets', 'fido'));
// -> {color: 'brown'}
// ...even though a Row was set

store.setRow('pets', 'felix', {species: true});
// -> 'Invalid species cell in felix row in pets table'
// The listener is called, because species is invalid...
console.log(store.getRow('pets', 'felix'));
// -> {color: 'unknown'}
// ...even though a Row was set with the default value

store.setRow('pets', 'rex', {species: 'dog'});
console.log(store.getRow('pets', 'rex'));
// -> {species: 'dog', color: 'unknown'}
// The listener is not called, because color is defaulted

store.delTables().setSchema({
  pets: {
    species: {type: 'string'},
    color: {type: 'string'},
  },
});

store.setRow('pets', 'cujo', {});
// -> 'Invalid species cell in cujo row in pets table'
// -> 'Invalid color cell in cujo row in pets table'
// -> 'Invalid undefined cell in cujo row in pets table'
// The listener is called multiple times, because neither Cell is defaulted
// and the Row as a whole is empty

store.delListener(listenerId);
jamesgpearce commented 2 years ago

v1.1.0 is out, with this support

jamesgpearce commented 2 years ago

@AHBruns take a look at https://github.com/tinyplex/tinybase/releases/tag/v1.2.0-beta.0 - it takes a doRollback callback in the transaction method, so you can choose to roll everything back if things did not go as the client wanted.

AHBruns commented 2 years ago

Taking a look