Determine required profile (e.g. "profile":"data-resource")

markboots commented 6 years ago

ONA is expecting the resource element to contain "profile": "data-resource". I couldn't find that requirement in FLOIP, and according to https://frictionlessdata.io/specs/data-resource/ that attribute is recommended but not mandatory. If we're going to treat it as mandatory, we should state so explicitly in the FLOIP spec docs.

According to the Data Packages spec, profile is a recommended property, not required. If omitted, the default profile is 'data-resource':

https://frictionlessdata.io/specs/profiles/

Thus, we have a profile property that declares the profile for the descriptor for this Package or Resource. For the default Data Package and Data Resource descriptor, this SHOULD be present with a value of data-package/data-resource, but if not, the absence of a profile is equivalent to setting "profile": "data-package"/ "profile": "data-resource".

Custom profiles MUST have a profile property, where the value is a unique identifier for that profile. This unique identifier MUST be a string and can be in one of two forms. It can be an id from the official Data Package Schema Registry, or, a fully-qualified URL that points directly to a JSON Schema that can be used to validate the profile.
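To make the defaulting rule quoted above concrete, here is a minimal sketch of how a consumer could resolve the profile of a descriptor. The function names and the `kind` argument are hypothetical, chosen for illustration; they are not part of FLOIP or of any Data Packages library.

```python
# Default profile per descriptor kind, as stated in the quoted spec text.
DEFAULT_PROFILES = {"package": "data-package", "resource": "data-resource"}


def effective_profile(descriptor, kind):
    """Return the declared profile, or the default for `kind` if none is declared.

    `kind` is "package" or "resource" (a hypothetical convention for this sketch).
    """
    return descriptor.get("profile", DEFAULT_PROFILES[kind])


def uses_default_profile(descriptor, kind):
    """True when the descriptor resolves to the default profile for its kind."""
    return effective_profile(descriptor, kind) == DEFAULT_PROFILES[kind]
```

Under this reading, a resource without a profile and a resource with "profile": "data-resource" are treated the same, while "flow-results-package" is treated as a non-default profile.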

Do we require a custom Data Packages profile for Flow Results? Or can we use the default profile?

@ukanga @nditada ?

nditada commented 6 years ago

I don't see any need to make it REQUIRED.

ukanga commented 6 years ago

According to the references you have put up on this issue, it seems that if we use a custom profile, as we currently do with 'flow-results-package', then we have to define the flow-results-package schema. If we do not define one, then we MUST NOT include the 'flow-results-package' profile.

I can work on creating this schema; it is something I would like to work on, if my team allows it.

If push comes to shove, I can look into removing the profile from the descriptor before passing it through the package https://github.com/onaio/floip-py, which relies on https://github.com/frictionlessdata/datapackage-py
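A rough sketch of what such a pre-processing step might look like, assuming the descriptor is already loaded as a Python dict. The function name `strip_profiles` is hypothetical, and this is not the actual floip-py code. It drops every `profile` key, relying on the rule quoted earlier that an absent profile is equivalent to the default one.

```python
import copy


def strip_profiles(descriptor):
    """Return a copy of `descriptor` with package- and resource-level profiles removed.

    The input dict is left unchanged so the client's original payload
    can still be logged or echoed back.
    """
    cleaned = copy.deepcopy(descriptor)
    cleaned.pop("profile", None)  # package-level profile, if present
    for resource in cleaned.get("resources", []):
        resource.pop("profile", None)  # resource-level profile, if present
    return cleaned
```

The cleaned dict can then be handed to the downstream library in place of the original descriptor.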

ukanga commented 6 years ago

We will also need to define a custom profile schema for the data-resource. I am thinking of flow-results-data-resource, flow-results-resource, or something similar.

markboots commented 6 years ago

Way forward: three phases:

1) Change the profile definition "flow-results-package" to "data-package" in the Flow Results spec, since we can't have the profile included unless we define a validator schema for the profile. [Ona strips this in pre-processing even if a client sends it, for now].

2) Define a validator schema for the profile and post it to the FLOIP GitHub. Set the profile to the URL of the schema on GitHub.

3) Submit profile to Data Packages for acceptance as a named profile, then we can update the spec to use "flow-results-package" again.
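The three phases above could be summarised as the following sketch of which `profile` value a Flow Results package would carry in each phase. The function name and the `schema_url` parameter are hypothetical; the actual schema URL does not exist yet and is left to the caller.

```python
def profile_for_phase(phase, schema_url=None):
    """Return the package-level profile value proposed for each phase."""
    if phase == 1:
        # No validator schema yet, so fall back to the default profile.
        return "data-package"
    if phase == 2:
        # Custom profile identified by a fully-qualified URL to its JSON Schema.
        if not schema_url:
            raise ValueError("phase 2 needs the URL of the published schema")
        return schema_url
    if phase == 3:
        # Named profile, once accepted by Data Packages.
        return "flow-results-package"
    raise ValueError(f"unknown phase: {phase}")
```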

FLOIP / flow-results

Determine required profile (e.g. "profile":"data-resource") #31