Closed naoki-kokaze closed 5 years ago
We've now prepared the new element <unit> as discussed in #1461 so that will be available with the next release. Notes:
As agreed by Council, the new element <unit> takes the @unit attribute (instead of using @norm as originally proposed). The @unit attribute serves the same purpose we've discussed here for @norm and it's coming from att.measurement so, as Council agreed for that ticket, it's better suited for an element associated with measurements than @norm coming from att.lexicographic. <unit> is also a member of att.typed and att.global.
Next steps for this ticket:
1) Following @naoki-kokaze's proposed specs, we'd need to add @factor to <unit> as a special attribute (not a member of an existing attribute class). Are we agreed on this?
2) We need to create spec files for <unitDef> and <unitDecl>. My question for our group is this: Can we reconsider whether we need <unitDef> to be a member of att.lexicographic now that we've gone with @unit on the <unit> element?
@peterstadler marked this ticket as a "release blocker" because we do want to implement it soon, but I wonder if we want to look at the new <unit> element first and review these questions before proceeding to release?
We're also going to need to do some intensive writing for the Guidelines to introduce <unitDef> and <unitDecl>, and I think we should adapt @naoki-kokaze's use-case from the 19th-century log-book of the Japanese steam ship company. There's lots to write here and a serious multi-cultural perspective to be described--important material for the Guidelines, but we may not have time to get all this in for this July release of the Guidelines (for which contributions are needed within the day). I think what I can do is simply add @factor to <unit> for now, and start working on new material in a branch which I'll ask us to review together.
Sorry! As I review the examples, I see that was @duncdrum 's use-case (the logbook from the Japanese steam ship company).
I've just created a measurement branch on this repo to push new material for the specs and Guidelines connected to this ticket.
@ebeshero right, since this is @naoki-kokaze's baby I'd say he gets to call dibs on what examples should go into the Guidelines. @jamescummings @naoki-kokaze catties 🍺 are collectible at the annual conference in Tokyo
@naoki-kokaze @duncdrum @jamescummings @sydb @emylonas I've reset the milestone for this to the release after this one, because we have some writing and testing to do yet (it's too soon for the release of next week).
However, I've also started a measurements branch here on the TEI GitHub repo, where I've set up the specs files for <unitDecl> and <unitDef> and added the @factor to the new <unit> element. I've also added a new section to the Guidelines TEI Header chapter under the encodingDesc for the new <unitDecl> element, and I've added some information there to get us started. Here's what I've done so far: https://github.com/TEIC/TEI/commit/c9efa4928051fdef9b8542ccde0d69473a03ba84
I think it will need more and better writing, and contributions from more of us, but I think that will be good to work on as we're heading to Tokyo in September. I'd like to step aside from this for now and ask others to jump in and work on revising and expanding this--in particular to work up some examples!
I have not looked at the issue carefully, but I doubt that it makes sense to put any of these elements into att.lexicographic. (That doesn’t mean they don’t need @norm, but even if they do, they probably don’t need all the others.)
@ebeshero Thank you for the arrangement towards the next release.
I agree with @sydb ‘s opinion about att.lexicographic, because the reason why I put the element
Just for the record, I think we do need a generic mechanism such as @norm for coded segments (to differentiate them from full natural language segments), and I am not completely at ease with having too many ad hoc ones to cover more or less the same usage pattern. Time pressure has probably prevented a debate on this (and I am so glad we have <unit>!), but we should re-open the discussion on this attribute at some point.
@laurentromary When we reviewed this together in Council it made sense in the measurement context to work with att.measurement, and use @unit for the purpose of a normalized value for the <unit> element. As I began work on this ticket, I wondered if <unitDef> might be a different case, if it’s here we might want the lexicographic toolbox?
@naoki-kokaze and all: For the Open Council meeting this morning at the Tokyo TEI 2018 conference, we've prepared some slides to introduce TEI Council, and to summarize our work so far on this issue: Take a look here: http://bit.ly/TEI-tc
Naoki, the part about summarizing the ticket work so far is really designed to follow up on anything you'd like to say to introduce the need for better encoding of measurements in the TEI! Council as a whole needs a briefing on all the work we've done so it's easy for us to see what to do next, so that's what I tried to do here with the last three slides.
Hi @ebeshero . Lovely slides! The only thing that struck me was "Say you've found something wrong with the TEI", which suggests that all tickets are bug reports; some of course (including @naoki-kokaze 's ticket) are feature requests / enhancements.
Thanks @martindholmes ! Yeah, that's an artifact from the old slideset we presented in Vienna...I just changed it with the word "change"! :-)
Thank you, Elisa! I have prepared some slides to share the gist of my proposal. https://docs.google.com/presentation/d/1koR84Q0AsHHdYd_kfQNSyqS34kWq7DajvT8p_iqvis8
But there is an omission about att.lexicographic, so let's review that point later!
@naoki-kokaze I believe where you use <place key="#england"/>, you really want to do <placeName ref="#england"/>.
(place is the container for the place-related metadata that you want to point at. It is usually placeName that does the pointing. It could be ptr or something else more general, but placeName makes it clear what you are pointing at.)
Summary of New Actions decided in Council F2F meeting in Tokyo 2018:
- Change @factor to @formula and give it its own attribute class because this will likely have broader use.
- Create a new <conversion> element, and give it special @fromUnit and @toUnit. (These new attributes do not belong in a class, and are understood to belong specifically to the new <conversion> element.) The new <conversion> element will behave differently depending on whether it's positioned inside or outside the <unitDef>.
<conversion fromUnit="#unitDef_A" toUnit="#unitDef_B" formula="XPath"/>
(If inside the <unit> element, don't allow @fromUnit)
@ebeshero Thank you for minuting the discussions at the Open Session on 10th Sep! I'd like to offer example markup based on the discussions. I would like all of you to check it and share any comments or feedback.
<encodingDesc>
<unitDecl>
<unitDef xml:id="keel" type="weight">
<label>keel</label>
<placeName ref="#england"/>
<conversion fromUnit="#chalder" toUnit="#keel" formula="20" from="1421" to="1676"/>
<conversion fromUnit="#chalder" toUnit="#keel" formula="16" from="1676" to="1824"/>
<desc>Keel was a unit measuring weight of coal. It had been equal to 20 chalders from 1421 to 1676, and it was made to be equivalent to 16 chalders from 1676 to 1824.</desc>
</unitDef>
<unitDef xml:id="chalder" type="weight">
<label>chalder</label>
<placeName ref="#england"/>
<conversion fromUnit="#bushel" toUnit="#chalder" formula="32" from="1421" to="1676"/>
<conversion fromUnit="#bushel" toUnit="#chalder" formula="36" from="1676" to="1824"/>
<desc>Chalder was a unit measuring weight of coal. It had been equal to 32 bushels from 1421 to 1676, and it was made to be equivalent to 36 bushels from 1676 to 1824.</desc>
</unitDef>
<unitDef xml:id="bushel" type="weight">
<label>bushel</label>
<placeName ref="#england"/>
<desc>Bushel was a unit measuring weight of coal.</desc>
</unitDef>
</unitDecl>
</encodingDesc>
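One way a processor might pick the right <conversion> for a dated source is sketched below in Python. This is only an illustration under stated assumptions: the function name find_formula, the trimmed and namespace-free copy of the sample, and the half-open reading of @from and @to (so that 1676 matches only the later entry) are all hypothetical and not anything Council has decided.

```python
import xml.etree.ElementTree as ET

# Trimmed, namespace-free copy of the example above (only the "keel" definition).
SAMPLE = """
<unitDecl>
  <unitDef xml:id="keel" type="weight">
    <label>keel</label>
    <conversion fromUnit="#chalder" toUnit="#keel" formula="20" from="1421" to="1676"/>
    <conversion fromUnit="#chalder" toUnit="#keel" formula="16" from="1676" to="1824"/>
  </unitDef>
</unitDecl>
"""


def find_formula(unit_decl, from_unit, to_unit, year):
    """Return the @formula of the conversion valid in `year`, or None."""
    for conv in unit_decl.iter("conversion"):
        if conv.get("fromUnit") != from_unit or conv.get("toUnit") != to_unit:
            continue
        # Assumption: @from is inclusive and @to is exclusive,
        # so the boundary year 1676 matches only the second entry.
        if int(conv.get("from")) <= year < int(conv.get("to")):
            return conv.get("formula")
    return None


unit_decl = ET.fromstring(SAMPLE)
formula_1500 = find_formula(unit_decl, "#chalder", "#keel", 1500)
formula_1700 = find_formula(unit_decl, "#chalder", "#keel", 1700)
```

The sketch only selects a formula; what the formula means, and in which direction it applies, is exactly what still needs to be documented.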
@naoki-kokaze This looks great. Recalling our previous discussion, in this particular example, you might omit @fromUnit because the context <unitDef> element provides that information, but we may decide that's just confusing.
Also, if the @formula is XPath, the expressions would be "* 32" and "* 36" (with the operator for multiplication). But we need to describe exactly how to implement this; it might be clearer to use conventional variable names like this:
$fromUnit * 32
Council discussion: @sydb : We need to make really clear that the @formula takes a value in @fromUnit and converts it to @toUnit (and not the other way around). (And be really clear to disambiguate @from and @to (which are dates) from @fromUnit and @toUnit.) We should present this as a template for a function, rather than the function itself. Also make clear that values for unit conversion should be drawn from the @quantity on <unit>.
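To make the direction point concrete, here is a minimal Python sketch of a conversion treated as a template for a function. Everything in it is hypothetical: the Conversion class, make_converter, and the choice of direction (chalders to bushels, multiplying by 32, following the prose description that a chalder equalled 32 bushels from 1421 to 1676). It is meant only to show a one-way function that rejects input in the wrong unit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conversion:
    from_unit: str  # plays the role of @fromUnit
    to_unit: str    # plays the role of @toUnit
    factor: float   # simplest possible @formula: a plain multiplier


def make_converter(conv):
    """Turn a Conversion into a one-way function on (quantity, unit)."""
    def convert(quantity, unit):
        # The conversion only runs from from_unit to to_unit, never backwards.
        if unit != conv.from_unit:
            raise ValueError(f"expected a quantity in {conv.from_unit}, got {unit}")
        return quantity * conv.factor, conv.to_unit
    return convert


# Assumed direction: 1 chalder = 32 bushels, so chalders -> bushels multiplies by 32.
chalder_to_bushel = make_converter(Conversion("#chalder", "#bushel", 32))
result = chalder_to_bushel(2, "#chalder")
```

The quantity passed in corresponds to the @quantity on <unit>; the dates in @from and @to play no part in the arithmetic.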
From Council teleconference of a few minutes ago.
- <unitDef> element defines units for use with <unit> (and possibly <measure> or other things that are in att.measurement);
- <unitDef> to automatically convert stuff, you have to encode the information this way, not that way”;
- @quantity to hold the quantity of the unit in question (i.e., the from unit);
- "@quantity" in the value of @formula to indicate where the processor should plug in the number of whatevers to be converted.
May be a good idea; may be a bad idea. Please comment.
@sydb Since @quantity (meaning the number of fromUnits to be converted) is found on <unit>, not on <formula> (where the formula resides), I think it might be a bit confusing to use @quantity in the formula itself; I think "$quantity" might be better.
I just created a job on our Jenkins server for the measurement branch: https://jenkins-paderborn.tei-c.org/job/TEIP5-branch-measurement/
Currently it's building, but when it's finished you can review the changes (made in the measurement branch) to the Guidelines directly at https://jenkins-paderborn.tei-c.org/job/TEIP5-branch-measurement/lastSuccessfulBuild/artifact/P5/release/doc/tei-p5-doc/en/html/index.html
My Jinks has one too. Its build has been broken since I set it up, though.
Note to all involved here: I'm at last returning to work on implementing Council's decisions on this ticket, hopefully in time for our next release in July! I rebased the measurement branch to be sure it's up to date with dev at this moment. TEI-Jenkins is testing the branch and there's a longstanding issue with something generating duplicate files. I'm going to see if I can fix that first of all, and then continue working on @formula and the conversion markup we developed here.
/var/jenkins_home/workspace/TEIP5-branch-measurement/P5/antbuildweb.xml:32:
Fatal error during transformation using /var/jenkins_home/workspace/TEIP5-branch-measurement/P5/Utilities/guidelines.xsl:
Cannot write more than one result document to the same URI: file:/var/jenkins_home/workspace/TEIP5-branch-measurement/P5/Guidelines-web/en/html/ref-model.labelLike.html;
SystemID: file:/var/jenkins_home/jobs/Stylesheets-dev/lastSuccessful/archive/dist/xml/tei/stylesheet/html/html_oddprocessing.xsl; Line#: 140; Column#: 170
In the last pair of commits I've been working on implementing the changes Council agreed on last fall and I'm watching Jenkins to see if our branch passes the right tests. Here's what I've done:
- <unitDecl> and <unitDef> elements.
- <egXML> for the elementSpecs and in the unitDecl section of the HD-Header chapter.
- att.formula for @formula.
- <conversion> element, and gave it membership in att.formula as well as att.datable and att.global.
- @fromUnit and @toUnit attributes to <conversion>.
- @from and @to (from att.datable) to units of measurement for conversion.
I'm going to need some help reviewing all the documentation and the details! And...I'm happy to report that our branch is not breaking the build--it's just behaving the same way the dev branch is doing now! I'm going to issue a pull request and a formal request for review.
For all following this ticket: We've been having extensive discussion on our branch's pull request (https://github.com/TEIC/TEI/pull/1892) as we'd like to try to complete the specs for <unitDecl> and <unitDef> and associated elements and attributes (<conversion> and att.formula). I'm going to summarize what we're working on right now so we have it here in the right place.
@formula ought to be able to express a mathematical formula, like, say, the conversion between Fahrenheit and Celsius. Currently our examples are basically multiplication factors, in line with the original proposal. The @formula tells you what factor to multiply the value in @fromUnit by to get to the value in @toUnit. But we should be able to express combinations of addition, subtraction, multiplication, and division, and to express that clearly we'd like to make the datatype of @formula be teidata.xpath.
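As a rough illustration of what "combinations of addition, subtraction, multiplication, and division" could mean for a processor, here is a small Python sketch that evaluates such a formula. It is not an XPath engine: it assumes the $fromUnit variable name floated earlier in this thread, handles only the four arithmetic operators and parentheses, and simply rewrites XPath's div operator as "/". The function name apply_formula and the Fahrenheit formula string are made up for the example.

```python
import ast
import operator

# Only the four arithmetic operators mentioned above are supported.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node):
    """Walk a parsed expression, allowing numbers and arithmetic only."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
        return _BINARY[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_evaluate(node.operand)
    raise ValueError("unsupported construct in formula")


def apply_formula(formula, quantity):
    """Substitute `quantity` for $fromUnit and evaluate the arithmetic."""
    # XPath spells division as "div"; map it to "/" for Python's parser.
    expression = formula.replace("$fromUnit", repr(float(quantity)))
    expression = expression.replace(" div ", " / ")
    return _evaluate(ast.parse(expression, mode="eval"))


# Hypothetical Fahrenheit -> Celsius formula, written XPath-style.
celsius = apply_formula("($fromUnit - 32) * 5 div 9", 212)
```

A plain multiplication factor such as "$fromUnit * 32" goes through the same function, so the simple cases from the original proposal would not need special treatment.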
We need to add an @unitRef attribute to att.measurement so it can be used on <unit> and point to exactly one value, defined in a <unitDef> up in the <teiHeader>. There's some question about how @unitRef would relate to @unit and whether these should be mutually exclusive. @sydb @martindholmes: does that about capture it?
I updated the summary comment above to reflect that we'd probably want to add @unitRef to att.measurement.
@sydb @ebeshero This all looks good to me, except that I think we want to see @unitRef available on <measure> as well as <unit>, don't we? It will get that automatically if the att is in att.measurement, of course. There are cases where there's nothing in the text to tag with a <unit> element, but you still want to capture what unit is being used in the measurement expression.
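For anyone wondering what following a @unitRef would involve in practice, here is a minimal Python sketch. It assumes the pointer is a local fragment such as "#keel" that matches the xml:id of a <unitDef>; the sample document is invented, omits the TEI namespace for brevity, and resolve_unit_ref is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Invented, namespace-free sample: a <measure> with no <unit> child.
SAMPLE = """
<TEI>
  <teiHeader>
    <unitDecl>
      <unitDef xml:id="keel" type="weight"><label>keel</label></unitDef>
    </unitDecl>
  </teiHeader>
  <text>
    <measure unitRef="#keel" quantity="3">three keels of coal</measure>
  </text>
</TEI>
"""


def resolve_unit_ref(root, element):
    """Follow a local @unitRef pointer such as "#keel" to its <unitDef>."""
    target = element.get("unitRef", "").lstrip("#")
    for unit_def in root.iter("unitDef"):
        if unit_def.get(XML_ID) == target:
            return unit_def
    return None


root = ET.fromstring(SAMPLE)
measure = root.find("text/measure")
unit_def = resolve_unit_ref(root, measure)
label = unit_def.findtext("label")
```

This covers the case described above, where nothing in the text can be tagged with <unit> but the measurement still needs to name its unit.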
In the measurement branch, I believe I have now implemented everything we've been discussing, and it's passing the build tests. Take a look at the measurement branch pull request now: https://github.com/TEIC/TEI/pull/1892
@naoki-kokaze can you post your source for the keel/chalder example and the Japanese examples? I can add that to the bibliography page for the Guidelines.
@ebeshero Thank you very much for all of your efforts to develop the new tag set!
The source for English measurement is: Zupko, Ronald Edward. 1977. British Weights & Measures: A History from Antiquity to the Seventeenth Century. Madison: University of Wisconsin Press, pp. 141–151. And the one for Japanese measurement is: (In Japanese) 大隅亜希子. 1996. “律令制下における権衡普及の実態: 海産物の貢納単位を中心として.” 史論 49. pp. 22–44. Available from http://id.nii.ac.jp/1632/00015761/. (Translated in English) Osumi, Akiko. 1996. “On the Popularization of Weights under the Ritsuryo Regime: Focusing on the Units for the Aquatic Products as Tributes.” Shiron (Historica) 49. pp. 22–44.
If you need any help, please let me know!
@naoki-kokaze I've just added your sources to the BIB and examples. Thank you! Perhaps we're now ready to close this ticket? I'm waiting for someone else on Council to merge the pull request (probably should not be me.)
I'm Naoki who gave a poster presentation at the TEI 2017 at UVic, on how to mark up UNITS that were not based on the metric system. It should be important for us to discuss measurement in a broad sense, because the problem on measurement implicates and represents cultural diversity. A Wikipedia article may help us to understand the importance of discussing cultural uniqueness through measurement.
Also, some TEI members have already discussed how useful the <unit> element would be. Please see https://github.com/TEIC/TEI/issues/1461
First of all, I would like to share my revised version of poster which omits the image of the historical source due to the license. TEI_2017_poster_for_github.pdf
The problem is that: how should it be marked up?
銅二千五百十六斤十両二分四銖
which means 'Copper whose weight is 2516Kin, 10Ryo, 2Bu and 4Shu'. It might be complicated to see what's happening, but you might understand it by considering some examples of British measurement, like 12yd, 2ft and 10in, which were also not based on the metric system.
Based on the discussion within the conference, there might be at least two solutions.
(i) Using only one <measure> element:
<measure commodity="銅" n="2516/10/2/4" unit="斤/両/分/銖" />
In the current scheme, we can't use @quantity to store multiple values. We could, however, use any delimiter instead of a slash.
(ii) Using a <measure> element to nest the other <measure> elements. ...Sorry, I have to get on the plane. Please see the poster file for the second possible solution!
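For solution (i), any software reading the markup would have to split the delimited values and pair them up by position. A minimal Python sketch of that step follows; split_compound is a hypothetical helper, and the slash delimiter is just the one used in the example.

```python
def split_compound(n, unit, delimiter="/"):
    """Pair each quantity in `n` with the unit at the same position in `unit`."""
    quantities = n.split(delimiter)
    units = unit.split(delimiter)
    # The two attributes only make sense together if they have the same length.
    if len(quantities) != len(units):
        raise ValueError("n and unit must have the same number of parts")
    return [(int(q), u) for q, u in zip(quantities, units)]


parts = split_compound("2516/10/2/4", "斤/両/分/銖")
```

The pairing depends entirely on both attributes listing their parts in the same order, which is one reason the nested approach in (ii) may be easier to validate.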