p2panda / aquadoggo

Node for the p2panda network handling validation, storage, aggregation and replication
GNU Affero General Public License v3.0

Are there any chances to merge multiple schemas' data together? #590

Closed JasonkayZK closed 11 months ago

JasonkayZK commented 11 months ago

Hi there, I've recently been developing a local-first application, and I think p2panda could be a good choice.

But there is a problem here:

Each device creates its own schema for storing data, especially when it is offline during the first launch.

So when the devices come online, multiple schemas get synced, such as person_002030..., person_002031....

But they all have the same name, person, in all_schema_definition_v1.

So I'm wondering whether data from multiple schemas could be merged for querying? (Sorry, I'm not very familiar with GraphQL.)

JasonkayZK commented 11 months ago

Or could I name the schema myself so that all data syncs automatically? 😃

adzialocha commented 11 months ago

Hi @JasonkayZK!

> Each device creates its own schema for storing data, especially when it is offline during the first launch.

Hui! Is this a sentence from us? That's indeed a bit confusing, we should improve that.

> So when the devices come online, multiple schemas get synced, such as person_002030..., person_002031....
>
> But they all have the same name, person, in all_schema_definition_v1.
>
> So I'm wondering whether data from multiple schemas could be merged for querying? (Sorry, I'm not very familiar with GraphQL.)

If nodes want to collaborate on the same data (we call them "documents"), they also need to use the same schemas. To find out whether your schemas match, compare their "schema id", for example person_00201234...: both the name and the hash need to be identical. Clients can then create, read, update and delete documents following this schema through the GraphQL API.
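A rough sketch of that comparison, useful for spotting the "same name, different hash" situation described in the question. It assumes a schema id has the form `<name>_<hash>` (possibly with several `_`-joined hashes), where each hash is `0020` followed by 64 hex characters. The helper names `split_schema_id` and `diverging_schemas` are made up for this illustration and are not part of any p2panda library.

```python
import re
from collections import defaultdict

# Assumed layout: "<name>_<hash>", optionally with more "_"-joined hashes.
# Each hash is assumed to be "0020" followed by 64 lowercase hex characters.
SCHEMA_ID = re.compile(
    r"^(?P<name>.+?)_(?P<view>0020[0-9a-f]{64}(?:_0020[0-9a-f]{64})*)$"
)


def split_schema_id(schema_id):
    """Split a schema id into its name and its hash part."""
    match = SCHEMA_ID.match(schema_id)
    if match is None:
        raise ValueError(f"unexpected schema id format: {schema_id!r}")
    return match["name"], match["view"]


def diverging_schemas(schema_ids):
    """Return every name that appears with more than one hash."""
    views = defaultdict(set)
    for schema_id in schema_ids:
        name, view = split_schema_id(schema_id)
        views[name].add(view)
    return {name: sorted(v) for name, v in views.items() if len(v) > 1}
```

If `diverging_schemas` returns a non-empty dictionary for the schema ids known to a node, at least two devices have created their own variant of a schema with the same name, and their documents will not be treated as belonging to one schema.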

To install a schema on a node (it is a little bit like a database migration) there are different strategies:

  1. As soon as aquadoggo finds another node on the network with the needed schemas, it will automatically pull them over and register them. It is recommended to put the schema ids you need in the allow_schema_ids list (https://github.com/p2panda/aquadoggo/blob/main/aquadoggo_cli/config.toml#L30) so you don't end up installing every random schema flying around in the network.
  2. If there's no other node you can get the schema from (for example, if you're offline), you can also deploy that schema "manually" on the node. We have a tool for this (fishy deploy), see https://github.com/p2panda/fishy, which is also handy for designing schemas in general. This way you make sure to install exactly the same schema wherever you are.

adzialocha commented 11 months ago

> Or could I name the schema myself so that all data syncs automatically? 😃

Exactly! So what you would do is:

  1. Design a schema with fishy. The best way is probably to follow the tutorial in the README: https://github.com/p2panda/fishy#tutorial
  2. Start an aquadoggo node.
  3. Deploy the schema on the node with fishy deploy.

From this point on, your node will support this schema and clients can start using it.

JasonkayZK commented 11 months ago

Thanks for your explanation. ❤️

fishy seems to work as a command-line tool, but what I need is more like a library, so that it could run migrations when the Tauri app starts. 😢

JasonkayZK commented 11 months ago

I figured it out! As long as I use the same key pair to create the schema, the schema id is exactly the same, just like with fishy. But I think it would be a good idea to provide a library that does this the way fishy does.

JasonkayZK commented 11 months ago

I saw this issue, which is exactly what I want. Thanks again for your amazing work!

https://github.com/p2panda/fishy/issues/3

adzialocha commented 11 months ago

> Thanks for your explanation. ❤️
>
> fishy seems to work as a command-line tool, but what I need is more like a library, so that it could run migrations when the Tauri app starts. 😢

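We've implemented this logic in our Meli Android app here: https://github.com/p2panda/meli/blob/main/packages/app/lib/io/p2panda/schemas.dart#L40-L83. It is fairly easy and can be implemented in other languages if you need it for your app and don't want to wait for fishy to support it programmatically. The logic is:

  • Read the schema.lock TOML file
  • Iterate over all commits in the file and publish the entry and operation data as usual with the GraphQL API
  • In every iteration step, use nextArgs to check whether the entry already exists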

JasonkayZK commented 11 months ago

> > Thanks for your explanation. ❤️ fishy seems to work as a command-line tool, but what I need is more like a library, so that it could run migrations when the Tauri app starts. 😢
>
> We've implemented this logic in our Meli Android app here: https://github.com/p2panda/meli/blob/main/packages/app/lib/io/p2panda/schemas.dart#L40-L83. It is fairly easy and can be implemented in other languages if you need it for your app and don't want to wait for fishy to support it programmatically. The logic is:
>
>   • Read the schema.lock TOML file
>   • Iterate over all commits in the file and publish the entry and operation data as usual with the GraphQL API
>   • In every iteration step, use nextArgs to check whether the entry already exists

That's great, I'm working on it right now!

Thanks again for this cool protocol. 👍