relaton / relaton-data-nist


`edition` contains a string, must be an Edition instance #4

Closed by strogonoff 2 years ago

strogonoff commented 2 years ago

Relaton-py expects an Edition instance (per the LutaML spec).

However, we are given a string. Example:

https://github.com/relaton/relaton-data-nist/blob/8d32de7b7fad7a3536ab2caab56b1d4afc70247a/data/NBS_RPT_3484.yaml#L57-L59

(This breaks data import by BibXML service.)

opoudjis commented 2 years ago

Edition in the LutaML spec is a mandatory string describing the edition, plus an optional number giving its numeric value.

The latter was intended for cases where we were preserving the prose description of a new edition, e.g. "Revised and augmented edition", and we wanted a number to indicate that, in reality, this is edition #2.

If a string is provided, no problem: just do not provide the numeric value. By default we don't; no grammar change is needed or desired.

There is an issue here with XML vs JSON. This was conceived of as an XML grammar, which has a notion of values and secondary attributes for elements, so the attributes are ignorable. If you take the JSON view, all of a sudden it's a string turning into a tuple, `{value: "Revised and augmented edition", numeric: 2}`. True. But in everything Andrej has been doing, he's just been putting a number into the value slot, and everything's been OK.
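To make the XML vs JSON difference concrete, here is a rough Python sketch. The element name `edition` and the attribute name `numeric` are assumptions chosen to match the tuple above, not necessarily what the Relaton grammar uses.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical XML serialization: the element text is the primary value,
# and "numeric" is an optional attribute that a reader may ignore.
xml_source = '<edition numeric="2">Revised and augmented edition</edition>'
element = ET.fromstring(xml_source)

# XML view: a consumer that only wants the string reads the element text.
xml_value = element.text

# JSON view: the same information becomes an object with up to two keys.
json_view = {"value": element.text}
if "numeric" in element.attrib:
    json_view["numeric"] = int(element.attrib["numeric"])

print(xml_value)
print(json.dumps(json_view))
```

In the XML view the attribute can simply be skipped, while in the JSON view a consumer that expected a bare string now has to handle an object.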

In general, there is a lot of type coercion needed to deal with these, and I think @andrew2net does some of that already in processing Hashes as input. If a string is presented as the edition, it needs to be coerced into an Edition instance. This happens ALL THE TIME in Relaton, because the XML model (the attributes are optional and associated with a primary string) really is how humans think of these values. @strogonoff, consult with @andrew2net on what is involved here; he should already have processing defaults in place.
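A minimal sketch of the kind of coercion described here, in Python. The `Edition` dataclass, its `value` and `numeric` field names, and the `coerce_edition` helper are all hypothetical stand-ins for illustration; they are not relaton-py's actual classes or API.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass
class Edition:
    """Hypothetical stand-in for an Edition model (not relaton-py's class)."""
    value: str                     # mandatory prose description of the edition
    numeric: Optional[int] = None  # optional numeric value


def coerce_edition(raw: Any) -> Optional[Edition]:
    """Turn whatever the data source supplied into an Edition (or None)."""
    if raw is None:
        return None
    if isinstance(raw, Edition):
        return raw
    if isinstance(raw, (str, int)):
        # A bare string or number goes into the value slot; numeric stays unset.
        return Edition(value=str(raw))
    if isinstance(raw, Mapping):
        numeric = raw.get("numeric")
        return Edition(
            value=str(raw["value"]),
            numeric=None if numeric is None else int(numeric),
        )
    raise TypeError(f"Cannot coerce {type(raw).__name__} into an Edition")


print(coerce_edition("Revised and augmented edition"))
print(coerce_edition({"value": "Revised and augmented edition", "numeric": 2}))
```

A bare number is deliberately stored as the string value rather than as `numeric`, following the "number into the value slot" practice mentioned above; whether that is the right default is a design choice for the importer.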

andrew2net commented 2 years ago

fixed in -v 1.12.0