openspending / fiscal-data-package

MOVED TO https://github.com/frictionlessdata/specs/issues?q=is%3Aopen+is%3Aissue+label%3A%22Fiscal+Data+Package%22

Standardised names for logical model representation of physical fields #100

Closed danfowler closed 8 years ago

danfowler commented 8 years ago

In a previous version of the specification, the logical model representation of the physical fields in the resource used standardized, meaningful names (e.g. id, title, and description). Currently, names are underspecified and left to the discretion of the packager of the dataset. Reverting to standardized, meaningful names would help consumers of the spec, who would then have a reasonable expectation of the meaning of the data being mapped from the physical resource. For instance, if a dimension has a single attribute, that attribute is its unique identifier and, by definition, the primary key of the dimension. This would be codified by attribute.id. Similarly, the title key could be expected to be the label for that attribute in the user interface of an app that consumes the spec. In addition, if the source dataset has a description as well, it could be assigned the description key (possibly for use as a hover tooltip).

Proposal

Given a "physical" model (CSV) like this:

YEAR  ECON4  ECON4-TITLE        AMOUNT
2015  100    PERSONAL SERVICES  1000

ECON4 and ECON4-TITLE SHOULD look like this in the mapping:

    "id": { 
      "source": "ECON4"
    }, 
    "title": {
      "source": "ECON4-TITLE"
    }

We re-introduce standardized, meaningful names (id, title, and description) for the logical model representation of physical fields in the resource and describe their semantics. We consider this a SHOULD.
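To illustrate what a consumer gains from the standardized names, here is a minimal sketch in Python. It assumes the physical resource is a comma-delimited CSV with the columns shown above; the helper `to_logical` and the constant names are hypothetical and are not part of the spec or of any library.

```python
import csv
import io

# Physical model: the example row from above, assumed to be comma-delimited.
PHYSICAL_CSV = (
    "YEAR,ECON4,ECON4-TITLE,AMOUNT\n"
    "2015,100,PERSONAL SERVICES,1000\n"
)

# Logical model: standardized attribute names mapped to physical columns.
ATTRIBUTES = {
    "id": {"source": "ECON4"},
    "title": {"source": "ECON4-TITLE"},
}


def to_logical(row, attributes):
    """Re-key one physical row by its logical attribute names (hypothetical helper)."""
    return {name: row[spec["source"]] for name, spec in attributes.items()}


rows = [
    to_logical(row, ATTRIBUTES)
    for row in csv.DictReader(io.StringIO(PHYSICAL_CSV))
]
print(rows)  # [{'id': '100', 'title': 'PERSONAL SERVICES'}]
```

With fixed names, a consuming app can read `id` and `title` without inspecting the physical column names at all.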

I see what could be 2 +1's below:

@pwalsh: https://github.com/openspending/fiscal-data-package/issues/96#issuecomment-162598473
@rgrp: https://github.com/openspending/fiscal-data-package/issues/96#issuecomment-162631757

Comments, amendments welcome. As they come in, I will revise this top post to reflect the final proposal.

pwalsh commented 8 years ago

So obviously I like this in general.

A few things:

  1. Be clear that this applies not to the mapping as a whole, but to the attributes of dimensions
  2. I don't think you can use id, because you can't claim it will always be a unique identifier, so it might be confusing unless we also start to differentiate explicitly between id and pk (primary key) (ref. https://github.com/openspending/fiscal-data-package/issues/96#issuecomment-162598473)

stevage commented 8 years ago

So, to be clear, the id block above is a dimension attribute? I.e.:

"mapping": {
  "dimensions": {
    "mydimension": {
      "attributes": {
        "id": {
          "source": "ECON4"
        },
        "title": {
          "source": "ECON4-TITLE"
        }
      }
    }
  }
}

Should the "title" attribute also have the labelfor property set to ECON4, or am I not really understanding? (Or is that change contrary to this one?)

pwalsh commented 8 years ago

Hey @stevage I believe that labelfor was introduced as a way to deal with part of the problem I wanted to address with a minimal amount of standard field names. TBH I'm not quite sure if that means this issue should be closed as resolved or not.

@rgrp @danfowler

rufuspollock commented 8 years ago

Personally I think we close as WONTFIX and leave this as something that could be a pattern or suggested by examples rather than something the spec requires or even recommends.

@danfowler I will leave it to you to close one way or the other.

danfowler commented 8 years ago

@rgrp I'm a little concerned about suggesting a pattern by example without speaking to that pattern in the text of the spec, even if it's only a light recommendation.

@stevage I will say that this particular proposal seems incompatible with the spec as it currently stands, given that we could have multiple code- and title-like attributes at the same level in a dimension. Also, as @pwalsh says, this is at least partially obviated by the existence of labelfor.
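A rough sketch of the incompatibility, in Python: once a dimension carries more than one code/title pair, a single fixed `id` or `title` key cannot name them all, whereas a `labelfor` reference can pair each label with its code. The attribute names, the `ECON3` columns, and the helper `label_pairs` are hypothetical, and the sketch assumes that `labelfor` holds the name of another attribute in the same dimension.

```python
# Hypothetical dimension with two code/title pairs at the same level.
# Fixed keys ("id", "title") could describe only one of these pairs.
attributes = {
    "econ4_code": {"source": "ECON4"},
    "econ4_title": {"source": "ECON4-TITLE", "labelfor": "econ4_code"},
    "econ3_code": {"source": "ECON3"},
    "econ3_title": {"source": "ECON3-TITLE", "labelfor": "econ3_code"},
}


def label_pairs(attributes):
    """Map each labelled attribute to the attribute that labels it (hypothetical helper)."""
    pairs = {}
    for name, spec in attributes.items():
        target = spec.get("labelfor")
        if target is None:
            continue
        if target not in attributes:
            # Assumption: labelfor must point at an attribute in the same dimension.
            raise KeyError(f"{name!r} is a label for unknown attribute {target!r}")
        pairs[target] = name
    return pairs


print(label_pairs(attributes))
# {'econ4_code': 'econ4_title', 'econ3_code': 'econ3_title'}
```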