Nozbe / WatermelonDB

🍉 Reactive & asynchronous database for powerful React and React Native apps ⚡️
https://watermelondb.dev
MIT License
10.47k stars 587 forks source link

Override entity IDs #7

Closed sebastian-schlecht closed 5 years ago

sebastian-schlecht commented 6 years ago

Hey,

we are currently evaluating this for our app. Since we try to implement our own sync engine on top, I'd be curious if it's possible to specify entity IDs whenever you create one.

Right now, if I try to set an ID in the builder function, I get the following error:

Diagnostic error: Attempt to set new value on a property Model.prototype.id marked as @readonly

Great work by the way, this looks very promising!

radex commented 6 years ago

Yep, for sync you should use a lower-level API. I'll try to document this better sometime soon.

But in short, you can either use:

record._setRaw('id', myId)

But if you're pulling a record from the server, and you have a simple JSON-like object with the record's data, you can also do:

import { sanitizedRaw } from 'watermelondb/RawRecord'

collection.create(record => {
  record._raw = sanitizedRaw({ id: xxx, foo: x, bar: z }, collection.schema)
})
Kabangi commented 5 years ago

This record._setRaw('id', myId) doesn't seem to work.

I get the following error

screen shot 2018-10-25 at 7 23 09 pm

@radex Is there any change that is needed on schema or model level to make it work. Am using version 0.7.0-1

radex commented 5 years ago

right, this doesn't work for id. You can do record._raw.id = myId, but this might be unsafe — tell me what exactly you're trying to do

Kabangi commented 5 years ago

I want to use the same id am getting from my server as the record id. This will help me simplify detection of whether a record exists on the local cache using the withObservable HOC before attempting to pull it from the server.

In summary, I want to be able to use the same unique id from my server to perform findAndObserve

withObservables(['postId'], ({ postId, database }) => ({
  post: database.collections.get('posts').findAndObserve(postId)
}))
radex commented 5 years ago

so when exactly would you want to assign an ID to the record? In the sync code? Can you give me a larger snippet of code?

Kabangi commented 5 years ago

@radex

I want to assign it in sync code, i.e as soon as I receive the record from the server, I want to save the record locally and then have the HOCs picks it up using the observable. The need to use the same id is meant to make the code compatible with existing HOCs that uses the server id to query for a record. Note am migrating an existing code base to watermelon incrementally.

export function transformContactForStore(c, model: ContactModel) {
  model.id = c.id; //  I want to use the same id as the one am receiving from the server.
  model.sid = c.id;
  model.username = c.username;
  model.email = c.email;
  model.firstName = c.first_name;
  model.middleName = c.middle_name;
  model.lastName = c.last_name;
  model.gender = c.gender;
  model.dob = c.dob;
}
radex commented 5 years ago

@Kabangi have you figured it out? you need something like this:

const contactCollection = database.collections.get('contacts')
contactCollection.create(record => {
  record._raw = sanitizedRaw({ id: c.id, sid: c.id, username: c.username, .... }, contactCollection.schema)
})
Kabangi commented 5 years ago

Yes, ended up using ._raw.id=id, though am afraid in future there might be a breaking change in regard to this but can deal with it then

Kabangi commented 5 years ago

@radex There is a gotcha with this solution record._raw.id = myId. That is, the ids from the server on all the models that you have needs to be unique otherwise you endup having this kind of errors

Record ID was sent over the bridge, but it's not cached

This is mainly because watermelon uses a Map with the id as the key hence if you have different type of models with the same id they will collide on the cache layer

adam-aerobotics commented 5 years ago

@Kabangi I just noticed this issue myself! We're setting all the ids to our own database ids, but it seems a few that are relations have the same ids and are throwing that same error.

radex commented 5 years ago

correct, Watermelon assumes IDs are globally unique. But I don't think you should be getting "Record ID was sent over the bridge, but it's not cached" — since record caching is done per-collection.

adam-aerobotics commented 5 years ago

I'm looking into it, because it would be useful for me to be able to have the same ids across multiple models, and so far I've found that (at least for sqlite) the Native.query method seems to return a string of an object's id if it has previously returned another object with the same id, which then causes the cached error.

I would assume that this is due to how the caching works, so it only requests an object once, and caches based on the ids. Please do correct me if my assumption is wrong!

Would it perhaps be possible/feasible to change this caching behaviour to work with multiple models with the same ids, or do you want to leave the assumption that IDs are globally unique?

adam-aerobotics commented 5 years ago

Looking just at the iOS code, DatabaseDriver.swift seems to cache query records simply by storing their ids' in a set, I can't see anything indicating that the caching here happens per collection!

radex commented 5 years ago

Looking just at the iOS code, DatabaseDriver.swift seems to cache query records simply by storing their ids' in a set, I can't see anything indicating that the caching here happens per collection!

ah, correct. on JS side, caching Model objects happens per-collection, but native code assumes global IDs.

I don't think reusing IDs is a good practice. If you need a one-to-one relationship, you can just add a column pointing to another object.

It would be feasible to mark cached IDs per-table in native code, but it would require some work — I think it will be easier for you to tweak your schema a little not to reuse IDs. But let me know if there's a really good reason why you do

adam-aerobotics commented 5 years ago

@radex

Okay, thank you for the information! The only reason I was interested in doing it this way was to mirror our company db, which has sequentially numbered ids for each entry in a table, so duplicate ids across different tables. However, I agree it would be easiest to just modify the schemas to store the company db's id in a separate field, and allow WatermelonDB to generate its unique IDs 🙂

radex commented 5 years ago

However, I agree it would be easiest to just modify the schemas to store the company db's id in a separate field, and allow WatermelonDB to generate its unique IDs 🙂

you could store 11111 as table_name_11111 during sync to guarantee uniqueness

adam-aerobotics commented 5 years ago

@radex

I have just found a case where it would be useful to be able to specify the model ids directly: When there are associations. I don't believe there is a way to have an association without pointing to the parent or childs id field, is this correct? In this case it would be very useful to be able to specify our own ids, and allow duplicate ids across models, as we have quite a lot of models that relate to one another!

Perhaps I'll look into the native caching by model rather than the whole db.

radex commented 5 years ago

In this case it would be very useful to be able to specify our own ids, and allow duplicate ids across models, as we have quite a lot of models that relate to one another!

well again, can't you make your backend IDs unique by prepending them with table name?

PS. You could also fork watermelon and remove the "only send ID" optimization. you'd get warnings about not sending cached records, but that would work correctly, just a little slower

adam-aerobotics commented 5 years ago

That could work, I just feel that it's going to be a bit of a hassle prepending the table name to all primary keys and any foreign keys as well. I think for now I might just remove that optimization, just to speed up our development. But I'll look into how easy it would be to change the caching optimization to cache ids on a model basis, so it wouldn't globally cache ids.

Either way, thank you for an awesome library, still much happier using this than redux! And sorry for all the comments on this closed issue haha.

ahmadbaraka commented 5 years ago

I've submitted a PR to handle the comment by @adam-aerobotics @Kabangi and I will be using this on our project

Perhaps I'll look into the native caching by model rather than the whole db.

@radex Could you please review and let me know your feedback?

Kabangi commented 5 years ago

@ahmadbaraka Great work on this PR https://github.com/Nozbe/WatermelonDB/pull/175, indeed it does fully resolve the issue.

@adam-aerobotics If you still want to take advantage of the caching you could use this PR before @radex reviews it

To directly use it you might need this small script to install it on your project

git clone https://github.com/ahmadbaraka/WatermelonDB.git
cd WatermelonDB
yarn
yarn build

cd ../node_modules/@nozbe
rm -fr watermelondb
mv -f ../../WatermelonDB/dist watermelondb
rm -fr ../../WatermelonDB

Then add a postinstall command in your package.json pointing to the above script

@radex Do you think it would be worth to add the instructions above to the docs i.e how to consume a PR that is yet to be merged to the master

adam-aerobotics commented 5 years ago

@ahmadbaraka Thanks so much for the PR! This solves my problem exactly. It seems you work a lot faster than I do! 😄

@Kabangi Thank you for the script, I'll implement it in my project while I wait for the PR to be merged.

radex commented 5 years ago

@radex Do you think it would be worth to add the instructions above to the docs i.e how to consume a PR that is yet to be merged to the master

I believe it's written up in Contributing.md — can you verify the instructions there are also appropriate?

sebastian-schlecht commented 5 years ago

@radex we're re-iterating on our sync solution with WatermelonDB and consider using client-side IDs now. I stumbled over this thread again and thought - is there any reason not to make randomId configureable? We intend to use UUIDs here but I would not like to go through the hassle of using sanitizedRaw everywhere.

radex commented 5 years ago

is there any reason not to make randomId configureable?

Nope, I think it's reasonable to have some sort of an override. PRs appreciated :)

sebastian-schlecht commented 5 years ago

Alright - picking this one up as we will be needing it ;)

sebastian-schlecht commented 5 years ago

@radex maybe some input on design directions would be helpful here.

I thought it would make sense to add the id generator function to the table / app schema object.

radex commented 5 years ago

@sebastian-schlecht as far as i can tell, only sanitizedRaw() references the randomId() function. We could add a parameter to sanitizedRaw() taking an id generator but this is referenced in so many places I worry it would create a ton of noise just passing this function around…

So I would maybe consider doing it the "wrong" way (given that only a small handful of people would ever need this) — adding to utils/common/randomId an export that allows you to override the randomId generator globally. Icky, but… maybe?

Do a global search for randomId( and sanitizedRaw( and tell me what you think

sebastian-schlecht commented 5 years ago

@radex i thought about parameters too but indeed that would be a lot of work. I would agree with some way to override it (like the silence() ) for the database. I am however not 100% sure how the js module / require cache works - one would need to make sure that the override is used in all other files properly and not the (potentially) cached, original version of it?

On 19. Apr 2019, at 15:29, Radek Pietruszewski notifications@github.com wrote:

@sebastian-schlecht as far as i can tell, only sanitizedRaw() references the randomId() function. We could add a parameter to sanitizedRaw() taking an id generator but this is referenced in so many places I worry it would create a ton of noise just passing this function around…

So I would maybe consider doing it the "wrong" way (given that only a small handful of people would ever need this) — adding to utils/common/randomId an export that allows you to override the randomId generator globally. Icky, but… maybe?

Do a global search for randomId( and sanitizedRaw( and tell me what you think

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

Bessonov commented 5 years ago

I haven't look at source code closely, but maybe something like (pseudo code):

interface IdGenerator {
  (dirtyRaw: DirtyRaw, tableSchema: TableSchema): string
}

let idGenerator: IdGenerator = randomId

export const getIdGenerator = () => idGenerator

export const setIdGenerator = (newIdGenerator: IdGenerator) => idGenerator = newIdGenerator
sebastian-schlecht commented 5 years ago

PR is up.

Bessonov commented 5 years ago

@sebastian-schlecht hey, Sebastian, great, but it's only exchange one random generator by another random generator. Why not a more generic way with additional information? Then you can derive the id from table and db model. Furthermore, it could be possible to use server id. Or I'm missing something?

sebastian-schlecht commented 5 years ago

@Bessonov for our use-case we only need to have the random id to be a specific format, we don't need prefixes or anything similar right now. Regarding server-id's, I think I don't understand the connection here. The PR is currently only dealing with the client-side stuff. We are moving to have consistent IDs client and server-side to avoid mapping issues.

Bessonov commented 5 years ago

@sebastian-schlecht maybe I'm wrong, but here you have access to dirtyRaw and you can set id. So I assume, that this generator can just set server id:

setGenerator((dirtyRaw: DirtyRaw, tableSchema: TableSchema) => {
  if (tableSchema.name === 'tableWithServerId') {
    return dirtyRaw.myFieldWithIdWhichWasGeneratedOnServer
  }

  // original behavior for other tables
  return randomId()
})
sebastian-schlecht commented 5 years ago

@Bessonov I still think that I am missing things here - sorry 😄

So when you import raw records from the server (using Watermelon built-in sync) the IDs are taken from the server anyway. When you create stuff on device, there is a local ID to be generated (which the server would consume). Moreover, I am not that much into Watermelon to understand the ID field completely. Afaik, attempting to set the ID during record creation results in a constraint violation that IDs are read-only. See https://github.com/Nozbe/WatermelonDB/blob/08d8dd6df0e5f6734d4d367caff01f2639018da7/src/Model/index.js#L67

Bessonov commented 5 years ago

@sebastian-schlecht

Afaik, attempting to set the ID during record creation results in a constraint violation that IDs are read-only.

Ahmm... did you looked at my reference and code snippet? :) It is exactly how you can use generator to avoid the readonly problem.

dedene commented 5 years ago

@Bessonov FYI, the function call in https://github.com/Nozbe/WatermelonDB/blob/a5b9988e6e7724d3c39a45019829b2a459f7bca0/src/RawRecord/index.js#L84 does not pass dirtyRaw or tableSchema, so your snippet does not seem to work as these are always undefined. Or maybe I'm missing something?

Bessonov commented 5 years ago

@dedene yes, you missed, that it was a proposal to pass dirtyRaw and tableSchema to id generator, which allows to set arbitrary id based on this data. @sebastian-schlecht doesn't implement it in this way and @radex doesn't give a feedback on this.