Update schema methods to not generate id based on schema values

TimoGlastra commented 2 years ago

As the identifiers can now be any URI, we should update the methods in the AnonCreds library to not generate the id values, but rather allow the user to generate the IDs themselves based on the AnonCreds method they're using.

[ ] Update the credx_create_schema ffi method based on the chosen approach
- remove origin_did as parameter
- remove the seq_no parameter from the schema creation (but keep it in the data model for now to not break cred def flow)
- based on the chosen approach, add an schema_id parameter or not
[ ] Update the create_schema issuer.rs method based on the choses approach
- remove origin_did parameter
- remove the seq_no parameter from the schema creation (but keep it in the data model for now to not break cred def flow)
- based on the chosen approach, add a schema_id parameter or not
- remove the generation of the schema_id based on the schema values
[ ] If the frist approach is taken, we need to update all validation logic and make the schema_id optional (but only in some cases). This requires some refactoring probably

One things to figure out is how we want to approach the generation of the ID, and whether it already needs to be present when we call the methods that create the objects. There's two approaches we can take:

Approach 1

Call the creation method (e.g. create_schema) without any identifier and then return the created object (the schema) without the id property. The schema is now created and the id property can be added later when the object is written to the ledger.

The advantage of this appraoch is that it allows the id generation process to be based on the contents of the object (schema), or it allows the id to be known after the object has been written to the ledger (if e.g. the ledger generates some identifier).

Approach 2

Call the creation method (e.g. create_schema) with the identifier and return the created object (the schema) with the id property.

The advantage of this appraoch is that is allows the anoncreds library to validate the identifier to be a valid URI / legacy indy identifier and we don't have a in-between representation of the model (all fields except the id).

TimoGlastra commented 2 years ago

@swcurran @blu3beri @victormartinez-work @dkulic Any thoughts on which approach would work best you think? I think approach 2 should be fine as you can almost always generate the identifier before you create the object, but curious to hear what you think.

For approach 1 the flow (in ACA-Py, AFJ) would be something like:

create schema without id property
call anoncreds method implementation to generate identifier
publish schema to ledger, get back schema object with id property

For approach 2 the flow would be something like:

call anoncreds method implementation to generate identifier
call create schema with id property
publish schema to ledger

I think the flow for approach 1 would be a bit cleaner as you don't have to go method specific -> generic -> method specific, but rather go generic -> method specific. However going with approach 2 makes the internal handling a lot easier as you don't have to make id an optional property on the schema model.

Edit: also tagging @andrewwhitehead

victormartinez-work commented 2 years ago

My suggestion would be to go for approach 2. I may have a high-level view, but I think would be good practice to apply a Builder pattern here, where atomically a scheme is created without an intermediate state, and I get back as a result an immutable object that later I'll publish to the ledger. I imagine something like that :

schema schema_object = schema.buildSchema(anoncreds.buildId(), param1(), param2(), etc..) ledger.publish(schema_object)

berendsliedrecht commented 2 years ago

One small, or big I am not too sure, benefit of the first approach is that we can easily reuse the anoncreds object to register it at different ledgers with different identifiers. If we take approach two we bind the id to the anoncreds object and therefore we cannot change it later, we have to regenerate it.

I am very unaware if this is a common use case, if it is it might be worth considering, but if it is not, I do not see a reason why we should not do Approach 2.

dkulic commented 2 years ago

Not sure what is the impact in case @blu3beri mentioned, but to me Approach 2 seams simpler and avoid confusion of having objects with and without an id.

swcurran commented 2 years ago

Approach 2 seems better to me, so that when you resolve the object you have a reference in it to where it was found. However, to resolve an object, you must already have its ID, so not really needed.

I assume we are thinking that whatever approach we use extends to all of the objects - CredDef, RevRegDef, RevRegEntry. In each, in theory, only the objects that reference another object must have the ID -- e.g. the CredDef must reference the ID of the schema, RevRegDef the CredDef, so in theory both work.

The identifier must be resolvable, so it's not like a database where you want content-less IDs that can be generated as the object is created. The IDs must be properly formed for resolution.

Sorry I'm not much help on this one...

TimoGlastra commented 2 years ago

Discussed WG 2022-11-07:

Id property is not needed in data model
With some methods the id is generated during publishing (to the ledger), so approach 2 wouldn't work in that case

Proposed solution: do not include id in models, create objects without identifier (as the data model doesn't require an id)

hyperledger / anoncreds-rs

Update schema methods to not generate id based on schema values #4

Approach 1

Approach 2