astrofrog / numtraits

Sanity checking for numerical properties/traits
BSD 2-Clause "Simplified" License

provide path for to_json/from_json of units for front-end #12

Open bollwyvl opened 9 years ago

bollwyvl commented 9 years ago

Widgets are one of the marquee uses of traitlets that users are likely to encounter, either by way of interact or directly, as they provide one of the best ways to access the interactive Jupyter/IPython magic.

As such, for many science and engineering applications, having the units persist all the way to the front-end is enormously useful.

To talk to the front end, the entire meaning of a traitlet value at a given time must be communicated by serializing it to JSON. Compared with other serialization formats such as XML, JSON is poor at representing rich types. The reference implementation of an identifier for a Widget, for example, uses a magic string notation that points to an ephemeral id, IPY_MODEL_, which front ends can only manipulate very mechanically.

For units, serializing the number to a string would introduce a whole set of issues, as the pidgin language of every unit library is slightly different.

A solution would be to use JSON-LD (JSON for Linking Data) and leverage the extensive existing work on solving this non-trivial problem... or at least on representing it in a way that is not overly opinionated.

Consider:

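from astropy import units as u
from ipywidgets import Widget
from numtraits import NumericalTrait

class Sphere(Widget):
    # radius must be expressible in meters
    radius = NumericalTrait(convertible_to=u.m)

s = Sphere(radius=1.21)
print(s.radius)
# intended output: 1.21 meters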

In JSON, and in JSON-LD, a number is a number:

{"radius": 1.21}

But in JSON-LD, a number can also be an object with an @value:

{"radius": {"@value": 1.21}}

This level of indirection gives us a place to put other metadata about the value.

The simplest possible approach would be to continue to treat the value as a literal, and only introduce @type:

{"radius": {"@value": 1.21, "@type": "meter"}}

Hooray, we've picked a type. But we've made up our own name for it. Where do we look the values up? What about derived units, domain, preferred display units, etc.?
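The literal-plus-@type form above can be sketched in a few lines of Python. This is only an illustration using the standard library: the helper names to_typed_literal and from_typed_literal are made up here, and "meter" is the same home-grown type name used above, not a term from any vocabulary.

```python
import json


def to_typed_literal(value, unit_name):
    """Wrap a bare number as a JSON-LD value object carrying a unit label."""
    return {"@value": value, "@type": unit_name}


def from_typed_literal(node):
    """Accept a bare number or a value object; return (value, unit_name)."""
    if isinstance(node, dict):
        return node["@value"], node.get("@type")
    # A plain JSON number carries no unit information at all.
    return node, None


doc = {"radius": to_typed_literal(1.21, "meter")}
encoded = json.dumps(doc)
decoded = from_typed_literal(json.loads(encoded)["radius"])
print(encoded)
print(decoded)
```

Note that the reader has to handle both shapes, since a bare number is still valid input.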

Here's what it could look like by utilizing the UN/CEFACT codes, which have been notionally adopted by schema.org, a large driver of linked data adoption:

{
  "radius": {
    "@context": "http://schema.org/",
    "@type": "QuantitativeValue",
    "value": 1.21,
    "unitCode": "MTR"
  }
}
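A rough sketch of producing that structure from Python follows. The lookup table and the function name quantitative_value are assumptions made for illustration; only a handful of UN/CEFACT common codes are listed, and a real implementation would load the full code list rather than hard-code it.

```python
# Hypothetical, deliberately tiny table from unit symbols to UN/CEFACT common codes.
UNECE_CODES = {
    "m": "MTR",   # metre
    "cm": "CMT",  # centimetre
    "km": "KMT",  # kilometre
    "kg": "KGM",  # kilogram
    "s": "SEC",   # second
}


def quantitative_value(value, unit_symbol):
    """Build a schema.org QuantitativeValue node for a number and a unit symbol."""
    try:
        code = UNECE_CODES[unit_symbol]
    except KeyError:
        raise ValueError(f"no UN/CEFACT code known for unit {unit_symbol!r}")
    return {
        "@context": "http://schema.org/",
        "@type": "QuantitativeValue",
        "value": value,
        "unitCode": code,
    }


node = {"radius": quantitative_value(1.21, "m")}
```

Failing loudly on an unknown unit seems safer than silently emitting a made-up code, but that is a design choice rather than anything the vocabulary requires.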

While we have said more explicitly (i.e. not unilaterally) that this value is of a type, and are playing by the rules of a standards body, this is somewhat unsatisfying:

To solve some of these issues, adoption of the QUDT vocabularies would provide a more robust conceptual model:

{
  "radius": {
    "@context": [
      "http://schema.org/",
      {
        "ex": "http://example.com#",
        "radius": "ex:radius",
        "unit": "http://qudt.org/1.1/vocab/unit#"
      }
    ],
    "@type": "QuantitativeValue",
    "value": 1.21,
    "unitCode": "unit:Meter"
  }
}

This makes the unitCode a "thing, not a string", which can itself be traced back to a robust set of models. QUDT can also support vectors of exponentiated dimension types, etc., and comes with a very large library, written by and used within an organization with a seriously multi-scale perspective (NASA).
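To make the "thing" concrete: the unit: prefix in the context lets a consumer turn unit:Meter into a full URI. The function below is a simplified sketch of that prefix expansion, written for this example only. A full JSON-LD processor applies more rules (for instance, whether a property's values are treated as IRIs at all), so this is not a substitute for one.

```python
def expand_compact_iri(term, prefixes):
    """Expand 'prefix:suffix' using a prefix table; return other terms unchanged."""
    prefix, sep, suffix = term.partition(":")
    # Leave absolute URIs such as 'http://...' alone, and anything with an unknown prefix.
    if sep and prefix in prefixes and not suffix.startswith("//"):
        return prefixes[prefix] + suffix
    return term


# Prefixes copied from the context in the example above.
prefixes = {
    "ex": "http://example.com#",
    "unit": "http://qudt.org/1.1/vocab/unit#",
}

expanded = expand_compact_iri("unit:Meter", prefixes)
print(expanded)
```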

Data Shapes

A whole other story. JSON-LD is pretty bad at labeling columns of arrays, and indeed URIs can't start with numerals. Some approach for listing columns and their types would be necessary.

Implementation

TBD... probably something like ipywidgets' widget_serialization, i.e. a numtraits.widget_serialization that exposes to_json and from_json functions, which consult the canonical data format and call the appropriate things in the upstream unit library (e.g. astropy, pint).
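One possible shape for this, as a sketch only: a dict of two functions that each take the value and the owning widget, which is how ipywidgets' own widget_serialization is laid out. Everything else here is an assumption for illustration. The Quantity class is a stand-in so the example runs without astropy or pint, the symbol table is hypothetical and holds a single entry, and nothing like this exists in numtraits today.

```python
from typing import NamedTuple


class Quantity(NamedTuple):
    """Stand-in for a unit-library quantity (astropy, pint); hypothetical."""
    value: float
    unit: str


QUDT_UNIT = "http://qudt.org/1.1/vocab/unit#"

# Hypothetical two-way table between unit symbols and QUDT local names.
SYMBOL_TO_QUDT = {"m": "Meter"}
QUDT_TO_SYMBOL = {name: symbol for symbol, name in SYMBOL_TO_QUDT.items()}


def to_json(quantity, widget=None):
    """Serialize a quantity to the QuantitativeValue form used in this issue."""
    return {
        "@context": ["http://schema.org/", {"unit": QUDT_UNIT}],
        "@type": "QuantitativeValue",
        "value": quantity.value,
        "unitCode": "unit:" + SYMBOL_TO_QUDT[quantity.unit],
    }


def from_json(node, widget=None):
    """Rebuild a quantity from the serialized form."""
    local_name = node["unitCode"].split(":", 1)[1]
    return Quantity(node["value"], QUDT_TO_SYMBOL[local_name])


widget_serialization = {"to_json": to_json, "from_json": from_json}

radius = Quantity(1.21, "m")
round_tripped = from_json(to_json(radius))
```

In a real implementation the stand-in class would be replaced by calls into the upstream unit library, and the table by the canonical QUDT lists.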

Dependencies

Generating and interpreting JSON-LD requires no additional libraries. A JSON Schema library (which already ships with Jupyter) would be enough to make serialization reasonably robust, even if it couldn't do full type-checking of the resources.
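As a rough idea of the level of checking meant here, the sketch below parses a payload with the standard json module and does a hand-rolled structural check. It is a stand-in for JSON Schema validation, not an example of it, and the function name check_quantitative_value and the specific rules are assumptions for illustration.

```python
import json
from numbers import Real


def check_quantitative_value(node):
    """Return a list of problems; an empty list means the node looks well formed."""
    if not isinstance(node, dict):
        return ["expected a JSON object"]
    problems = []
    if node.get("@type") != "QuantitativeValue":
        problems.append("@type must be 'QuantitativeValue'")
    value = node.get("value")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, Real):
        problems.append("value must be a number")
    if not isinstance(node.get("unitCode"), str):
        problems.append("unitCode must be a string")
    return problems


good = json.loads('{"@type": "QuantitativeValue", "value": 1.21, "unitCode": "MTR"}')
bad = json.loads('{"@type": "QuantitativeValue", "value": "1.21 m"}')

good_problems = check_quantitative_value(good)
bad_problems = check_quantitative_value(bad)
```

This checks structure only. It says nothing about whether the unit code actually exists in the chosen vocabulary, which is the "full type-checking" part left out above.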

The canonical lists are available for download as XML or Turtle, and these could be converted to canonical JSON.

Front-end

Out of scope for this issue, but... in the near term, a set of base widgets (sliders, text boxes, etc.) which didn't simply fall over would be a good start.

As for serious implementations of the front-end parsing side, several quantity libraries exist, including math.js and quantities.js. None is as flexible as any of the Python implementations, but this could be an excellent driver for the creation of such a library, built around a canonical representation format.

Related:

westurner commented 7 years ago

See:

Data Shapes