p2panda / aquadoggo

Node for the p2panda network handling validation, storage, aggregation and replication

Configure blob and schema replication via dedicated config flags #564

Open sandreae opened 1 year ago

sandreae commented 1 year ago

System schemas sometimes require more specialised configuration logic than application schemas. Currently we include every schema we want to replicate in a single allow_schema_ids list, which mixes application and system schema ids and isn't always intuitive. Dedicated config flags for replicating system schemas could help bring clarity.

schema definition replication

Add a flag for whether we want to replicate schema definitions.

replicate_schema_definition = true

This would result in our TargetSet including both schema_definition_v1 and schema_field_definition_v1.
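As a rough sketch of how the flag could feed into the schema ids of a target set: the `ReplicationConfig` struct and `target_schema_ids` function below are hypothetical names for illustration, not aquadoggo's actual API. Only the flag name and the two system schema ids come from this proposal.

```rust
/// Hypothetical config struct; only the field names mirror the proposal.
#[derive(Debug, Clone, Default)]
pub struct ReplicationConfig {
    /// Application schema ids the node allows.
    pub allow_schema_ids: Vec<String>,
    /// Proposed flag: also replicate schema definitions.
    pub replicate_schema_definition: bool,
}

/// Collect the schema ids that would end up in the target set.
/// This is an assumed helper, not an existing aquadoggo function.
pub fn target_schema_ids(config: &ReplicationConfig) -> Vec<String> {
    let mut ids = config.allow_schema_ids.clone();

    if config.replicate_schema_definition {
        // Both system schemas are needed to reconstruct a schema.
        ids.push("schema_definition_v1".to_string());
        ids.push("schema_field_definition_v1".to_string());
    }

    // Sort and deduplicate so the result does not depend on config order.
    ids.sort();
    ids.dedup();
    ids
}
```

Sorting and deduplicating is a design choice in this sketch, so that listing a system schema both explicitly and via the flag yields the same result.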

blob replication

Globally configure (phase 1)

Add fields for configuring behaviour around blobs globally (would apply to all schema the node allows)

allow_schema_ids = ["photos_0020...."]
replicate_blob_meta = true
replicate_blob_data = false

The resulting TargetSet would contain blob_v1 but not blob_piece_v1 schema ids.
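A minimal sketch of that mapping, assuming the two flags are read into a plain struct. `BlobFlags` and `blob_schema_ids` are hypothetical names used only for illustration.

```rust
/// Hypothetical holder for the two proposed global flags.
#[derive(Debug, Clone, Copy, Default)]
pub struct BlobFlags {
    pub replicate_blob_meta: bool,
    pub replicate_blob_data: bool,
}

/// Return the blob-related system schema ids implied by the flags.
/// Assumed helper, not part of aquadoggo.
pub fn blob_schema_ids(flags: BlobFlags) -> Vec<&'static str> {
    let mut ids = Vec::new();
    if flags.replicate_blob_meta {
        // Blob metadata documents.
        ids.push("blob_v1");
    }
    if flags.replicate_blob_data {
        // The actual blob bytes, split into pieces.
        ids.push("blob_piece_v1");
    }
    ids
}
```

With `replicate_blob_meta = true` and `replicate_blob_data = false` this yields only `blob_v1`, matching the example above.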

Fine grained control per schema (phase 2)

Configure on a per-schema basis

[[allow_schema_ids]]
id = "user_profile_0020a01fe..."
blob_meta = true
blob_data = true

[[allow_schema_ids]]
id = "huge_video_files_0020a01fe..."
blob_meta = true
blob_data = false

There would need to be multiple resulting TargetSets each with a dedicated replication session. The target sets would each contain one "parent" (application) schema id and then blob_v1 and blob_piece_v1 depending on their particular configuration.
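The per-schema variant could look roughly like the following, with one set of schema ids per configured entry. `SchemaRule` and `target_sets` are hypothetical names, and each target set is represented as a plain `Vec<String>` rather than aquadoggo's real `TargetSet` type.

```rust
/// Hypothetical representation of one [[allow_schema_ids]] entry.
#[derive(Debug, Clone)]
pub struct SchemaRule {
    pub id: String,
    pub blob_meta: bool,
    pub blob_data: bool,
}

/// Build one list of schema ids per rule; each list would back its own
/// replication session. Assumed helper, not part of aquadoggo.
pub fn target_sets(rules: &[SchemaRule]) -> Vec<Vec<String>> {
    rules
        .iter()
        .map(|rule| {
            // The "parent" application schema always comes first.
            let mut set = vec![rule.id.clone()];
            if rule.blob_meta {
                set.push("blob_v1".to_string());
            }
            if rule.blob_data {
                set.push("blob_piece_v1".to_string());
            }
            set
        })
        .collect()
}
```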

Linked to https://github.com/p2panda/aquadoggo/issues/561: the blob_data field could later hold the on-device storage space allotted to that blob type ✊

adzialocha commented 1 year ago

What happens if I write:

blob_meta = false
blob_data = true

..? It's a stupid example, but maybe we could express the config not as booleans but as "modes"? Something like this:

blobs_replication = "only_related_meta" | "only_related" | "*"

sandreae commented 1 year ago

Oh yeh, I really like the idea of having "modes" 👍
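One way the "modes" idea could be modelled is an enum parsed from the config string, which makes the contradictory `blob_meta = false` / `blob_data = true` combination unrepresentable. The enum name, variant names and the `replicates_blob_data` helper are hypothetical, and the meaning attached to each mode is an assumption, since the thread does not spell it out.

```rust
use std::str::FromStr;

/// Hypothetical enum for the proposed `blobs_replication` modes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BlobsReplication {
    /// "only_related_meta": assumed to mean blob metadata only.
    OnlyRelatedMeta,
    /// "only_related": assumed to mean metadata and data of related blobs.
    OnlyRelated,
    /// "*": assumed to mean all blobs.
    All,
}

impl FromStr for BlobsReplication {
    type Err = String;

    fn from_str(value: &str) -> Result<Self, Self::Err> {
        match value {
            "only_related_meta" => Ok(Self::OnlyRelatedMeta),
            "only_related" => Ok(Self::OnlyRelated),
            "*" => Ok(Self::All),
            other => Err(format!("unknown blobs_replication mode: {other}")),
        }
    }
}

impl BlobsReplication {
    /// Whether blob pieces (the actual data) would be replicated,
    /// under the assumed interpretation of the modes.
    pub fn replicates_blob_data(self) -> bool {
        !matches!(self, Self::OnlyRelatedMeta)
    }
}
```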

sandreae commented 1 year ago

Thought being able to configure "max_blob_size" would also be really good. As you would if you had a server accepting uploads.

adzialocha commented 1 year ago

> Thought being able to configure "max_blob_size" would also be really good. As you would if you had a server accepting uploads.

Yeah! That's a nice addition, don't think they contradict each other