adobe / xdm

Experience Data Model
Creative Commons Attribution 4.0 International

Asset based XDMs are too content management centric. #217

Open cdegroot-adobe opened 6 years ago

cdegroot-adobe commented 6 years ago

What are the schemas that are affected by the issue

https://github.com/adobe/xdm/blob/master/schemas/assets/asset.schema.json
https://github.com/adobe/xdm/blob/master/schemas/assets/video.schema.json
https://github.com/adobe/xdm/blob/master/schemas/assets/image.schema.json

What are examples of products that are impacted by the issue:

AdCloud, Video Analytics

Video, Image

These all have inherited Asset required fields which do not easily translate to how AdCloud categorizes videos:

AssetID – this has a very specific format requirement not currently implemented in AdCloud or the ad industry. Can we be more flexible on this field?

eTag – information not currently available at event stream capture time.

format, lastModifiedDate, name, path, size, version – these are currently stored in our facts tables and not relevant at event processing time.

For consumption based processes the required fields are not as relevant and the information is not available in many cases.

We need to review whether it makes sense to keep some of these fields required and strongly structured, or whether we should use a different schema for the consumption aspects of assets.

cdegroot-adobe commented 6 years ago

@trieloff @kstreeter - please look at this issue; we need to address it for the use of the asset schemas in the digital marketing area.

lrosenthol commented 6 years ago

@cdegroot-adobe The asset schemas were all developed as part of our core "asset data model" coming from how the DMe side of the house as well as the AEM folks see assets. We did, however, have some representation from various XC teams who reviewed them.

On your specific issues:

I am happy to meet with your DMa teams to review these as necessary.

fmeschbe commented 6 years ago

@lrosenthol says:

AssetID is defined according to the rules of ACP.

I cannot find any such format reference in the ACP API spec. In fact there are references to something like http://ns.adobe.com/adobecloud/core/1.0/assetID (which obviously does not resolve), repo:assetID, and just always the same (presumably) SharedCloud based example (which IMHO is not a good assetID since it seems to encode some location in the ID).

In addition there is confusion about the repo:assetID format since, formally, it is just a simple string. Only the description says:

A unique identifier given to every addressable asset in a given repository.

The format is a GUID-based URN. The pattern to generate an Asset ID is urn:aaid:{system}:{id} - {format}:{namespace}:{system}:{id}

which is not very clear. In fact this description prescribes the Asset ID to be not just a URN but a URN in the aaid namespace, and then goes on to give a general format with seemingly duplicate fields. Oh, and RFC 4122 defines the URN to be of the form urn:uuid:<guid>. So our description is clearly not following RFC 4122. Maybe we should rather refer to RFC 8141 Uniform Resource Names (URNs)?
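
To make the mismatch concrete, here is a minimal Python sketch contrasting the RFC 4122 URN form with the urn:aaid:{system}:{id} shape quoted from the description. The regular expression, the helper name `looks_like_aaid`, and the system name `examplesystem` are assumptions made for illustration; the description does not define which characters are allowed in {system} or {id}.

```python
import re
import uuid

# RFC 4122 URN form, as produced by the standard library: urn:uuid:<guid>
rfc4122_urn = uuid.uuid4().urn

# Hypothetical shape check for "urn:aaid:{system}:{id}".
# The character classes are an assumption, not part of any Adobe spec.
AAID_SHAPE = re.compile(r"^urn:aaid:[^:]+:[^:]+$")


def looks_like_aaid(value: str) -> bool:
    """Return True if value has the urn:aaid:{system}:{id} shape."""
    return AAID_SHAPE.match(value) is not None


# A made-up identifier in the aaid shape ("examplesystem" is invented).
example_aaid = f"urn:aaid:examplesystem:{uuid.uuid4()}"

print(rfc4122_urn, looks_like_aaid(rfc4122_urn))
print(example_aaid, looks_like_aaid(example_aaid))
```

An identifier in the urn:uuid form does not match the aaid shape, which is why the description cannot claim both at once.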

Maybe we can just clarify how we want to look at this ?

How about this:

trieloff commented 6 years ago

I would split this into two issues:

  1. let's not make etag, format, lastModifiedDate, name, path, size, version required, so that in places where a reference to an asset is enough, only the assetID or @id can be used. This should address most of @cdegroot-adobe's concerns
  2. come up with a better format for assetID that addresses the suggestions @fmeschbe made
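
A rough Python sketch of what item 1 would mean for a consumer-side record. The field names come from this thread; the `STRICT_REQUIRED` list and both helper functions are illustrative assumptions, not copies of asset.schema.json or of any XDM tooling.

```python
# Fields named in this thread as currently required on an asset.
# Illustrative only; not copied from asset.schema.json.
STRICT_REQUIRED = [
    "assetID", "etag", "format", "lastModifiedDate",
    "name", "path", "size", "version",
]


def missing_fields(record: dict, required: list) -> list:
    """List the required fields that the record does not carry."""
    return [field for field in required if field not in record]


def is_valid_reference(record: dict) -> bool:
    """Relaxed rule sketched in item 1: an assetID or an @id is enough."""
    return "assetID" in record or "@id" in record


# An event-time record that only knows where the asset lives (made-up URL).
event_asset = {"@id": "https://example.com/assets/video-123"}

print(missing_fields(event_asset, STRICT_REQUIRED))
print(is_valid_reference(event_asset))
```

Under the strict list the event-time record fails on every field, while under the relaxed rule it passes with just an identifier.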

lrosenthol commented 6 years ago

I have to disagree on this one, @trieloff. If you want to create a new object called an assetReference - that's fine, I'd support that. But an asset actually exists and we know all of the info about it.

Concerning the format of assetID - that's one that we need to take up with @ogoldman, as he is the one who fought very strongly for that definition (and, to some extent, "owns" ACP).

lrosenthol commented 6 years ago

@trieloff but I do support splitting it into separate issues...

trieloff commented 6 years ago

If you want to create a new object called an assetReference - that's fine, I'd support that.

👍 me too.

Concerning the format of assetID - that's one that we need to take up with @ogoldman, as he is the one who fought very strongly for that definition (and, to some extent, "owns" ACP).

Then it's probably just a matter of updating the description, so that we don't claim to follow a standard we are knowingly violating.

kstreeter commented 6 years ago

Regarding "assetID", it is definitely inconsistent (within the context of XDM) that it is a URN versus a URI as used elsewhere. We use URIs because that is the "JSON-LD way".

I know that "assetID" and its format are driven by how we have implemented it in the past, and so I'm not suggesting we try to change it, but we might want to either allow it to be more flexible in the context of XDM (just a string) or have an adjacent field that would allow for a URI-based ID (an @id).

Of course, even if we do this, if someone wants to point to assets in the repo they will need to understand "assetID".

ogoldman commented 6 years ago

Here's my suggestion:

  1. For XDM, just define repo:assetID as a string. The value is required to be treated as opaque by clients, so suggesting an internal structure is both unnecessary and leads to bad behaviors. For XDM and API, it's only the semantics of this identifier that are important--when it is created, that it isn't re-used, and so on.

  2. Implementers should feel free to use some internal structure when assigning this value, as this is useful when troubleshooting. In our ACP implementations at Adobe, we happen to have agreed on a urn-based scheme that provides some hints as to which system generated the identifier, and in which region--in addition to a GUID. But again, these are purely implementation details, and need not be mentioned in XDM.
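
As a sketch of what "opaque" means for a client, the Python below only ever compares identifiers for equality or uses them as dictionary keys; it never splits or parses them. The `AssetID` type alias, the helper names, and the sample values are assumptions made for illustration.

```python
from collections import Counter
from typing import NewType

# A distinct type name documents intent; at runtime it is still a plain str.
AssetID = NewType("AssetID", str)


def same_asset(a: AssetID, b: AssetID) -> bool:
    """Clients compare identifiers for equality and nothing else."""
    return a == b


def count_events_by_asset(events: list) -> Counter:
    """Group events by identifier without inspecting its structure."""
    return Counter(AssetID(event["assetID"]) for event in events)


# Made-up values: one urn-style, one with no structure at all.
events = [
    {"assetID": "urn:aaid:examplesystem:0001"},
    {"assetID": "vendor-42"},
    {"assetID": "vendor-42"},
]
counts = count_events_by_asset(events)
print(counts)
```

Because nothing here depends on the identifier's internal layout, the urn-style value and the unstructured value are handled the same way.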

trieloff commented 6 years ago

Changing assetID to be a plain string seems to be an easy enough change.

@kstreeter: we already have an optional @id property for every asset.

Should we make assetID optional, given that the loose string prescription might lead to many empty strings anyway?

cdegroot-adobe commented 6 years ago

@trieloff @kstreeter - I would like to make a move on the Asset-Reference concept. We are preparing new schema proposals that refer to media (movies, audio clips, etc.), and they refer to an asset with most of the same properties but will not always have all the required properties of the Asset and related models. How should I proceed? It does not seem right to just create a mirror of the current schemas with the addition of *Reference.