Open ronaldtse opened 3 years ago
For example:
"ISO/IEC/IEEE 24765:2010, Systems and software engineering – Vocabulary, 3.234 (2), modified by omission of "type of" or other relevant words and addition of Notes 1 and 2”
Using the semantic approach I don’t think we need to maintain the document title since it is obtainable from the referred document itself.
We should create the following structure to accommodate this:
Sought clarification from IEC.
From IEC:
It does not matter now, but [...] would use different terms for what I have put in blue.
We will have to change the terms of "document", "clause", "relationship" and "modification" later on.
On the contrary, ISO 10241-1:2011 contains some examples of source references with titles, although I agree that for a standard one would not normally include the title, as described in ISO 10241-1:2011, 6.8. But this is only a recommendation: “The indication of the source should be in coded form and a link or reference to a standard bibliographic description provided.”
Meanwhile, since the rules do not prohibit the inclusion of a title, we should not either.
Maybe you should allow for a short form (without a title) and a long form (with a title) of an xref, where the short form is the default?
So we should allow for a short form (without title) and also a long form (with title, even in the case of a standard), depending on user preference.
Technically, we should allow entering references in ISO 690 format because ISO 10241-1 accepts only the ISO 690 bibliographic format...
This needs to be dealt with in the concept-model and Glossarist.
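As a rough illustration of the short-form/long-form idea above, here is a minimal Python sketch. The `SourceRef` class, its `title` field and the `long_form` flag are hypothetical names chosen for this example; they are not part of the concept model or Glossarist.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourceRef:
    """Hypothetical container for one authoritative source reference."""
    ref: str                      # document identifier, e.g. "ISO/IEC Guide 51:1999"
    clause: Optional[str] = None  # clause number within the referenced document
    title: Optional[str] = None   # document title (assumed field, optional)


def render_source(src: SourceRef, long_form: bool = False) -> str:
    """Render a reference; the short form (no title) is the default."""
    parts = [src.ref]
    # The title is only shown when the long form is requested and a title exists.
    if long_form and src.title:
        parts.append(src.title)
    if src.clause:
        parts.append(src.clause)
    return ", ".join(parts)


src = SourceRef(
    ref="ISO/IEC Guide 51:1999",
    clause="3.5",
    title="Safety aspects – Guidelines for their inclusion in standards",
)
print(render_source(src))                  # short form
print(render_source(src, long_form=True))  # long form
```

Keeping the title as a separate optional field means the choice between the two forms is made at rendering time rather than in the stored data.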
By the way, are you aware of the following rule:
This rule applies when a SOURCE only applies to a single language term. We will need to deal with it in Glossarist.
@skalee this means we need to retain the original title in these entries in the resulting data file and rendering. Can you help with this? Thanks.
@ronaldtse The question is how to detect that title. Anything between <i> and </i>, perhaps? Or anything that is neither ref, clause, nor modification comment? The latter may be polluted with some additional text.
Please also note that we already do that to some degree, as the original field contains the unparsed SOURCE column value. For example:
    eng:
      id: 351-57-01
      authoritative_source:
      - ref: ISO/IEC Guide 51:1999
        clause: '3.5'
        link: https://www.iso.org/standard/32893.html
        relationship:
          type: identical
        original: ISO/IEC Guide 51:1999, <i>Safety aspects – Guidelines for their inclusion
          in standards</i>, 3.5
@skalee <i> occurs in more places in the SOURCE than just titles (it is also used for symbols inside the "modified" note), but if we just select every <i>...</i> whose content is longer than 3 characters, it should work.
I don't know how many document titles are provided without being enclosed in <i>, though.
Let's take a narrow approach that we only extract document titles, not the "anything that is not x or y" approach. Thanks!
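A minimal Python sketch of the narrow heuristic described above: collect every <i>...</i> span and keep only those whose content is longer than 3 characters. The function name `extract_title_candidates` and the constant `MIN_TITLE_LENGTH` are made up for this example, and titles that are not wrapped in <i> would be missed.

```python
import re
from typing import List

# Non-greedy match of anything between <i> and </i>, including wrapped lines.
ITALIC_RE = re.compile(r"<i>(.*?)</i>", re.DOTALL)

# Threshold from the discussion: italic spans this short are assumed to be symbols.
MIN_TITLE_LENGTH = 3


def extract_title_candidates(source: str) -> List[str]:
    """Return italic spans long enough to plausibly be document titles."""
    candidates = []
    for match in ITALIC_RE.finditer(source):
        # Collapse line wraps and repeated whitespace into single spaces.
        text = " ".join(match.group(1).split())
        if len(text) > MIN_TITLE_LENGTH:
            candidates.append(text)
    return candidates


original = (
    "ISO/IEC Guide 51:1999, <i>Safety aspects – Guidelines for their inclusion\n"
    "in standards</i>, 3.5"
)
print(extract_title_candidates(original))
```

Short italic spans, such as a single symbol inside a "modified" note, fall below the threshold and are skipped.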
@skalee how's this issue going? To move forward, let's find the list of entries matching "original: {docidentifier}, ... , {clause number}" and find out what the ... is. Then we can extract the title, probably as a ref_title: field.
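The "{docidentifier}, ... , {clause number}" idea can be sketched in Python as follows, assuming the already-parsed ref and clause values are available alongside the original string. The function name `extract_ref_title` is hypothetical, and only ref_title comes from the comment above.

```python
import re
from typing import Optional


def extract_ref_title(original: str, ref: str, clause: str) -> Optional[str]:
    """Return whatever sits between the document identifier and the clause number."""
    pattern = re.compile(
        re.escape(ref)
        + r"\s*,\s*(.+?)\s*,\s*"
        + re.escape(clause)
        # Do not accept the clause as a prefix of a longer one (3.5 vs 3.51 or 3.5.1).
        + r"(?![\w.])",
        re.DOTALL,
    )
    match = pattern.search(original)
    if match is None:
        return None  # nothing between ref and clause, or the pattern does not apply
    # Drop the italic markup and collapse line wraps into single spaces.
    middle = re.sub(r"</?i>", "", match.group(1))
    return " ".join(middle.split()) or None


original = (
    "ISO/IEC Guide 51:1999, <i>Safety aspects – Guidelines for their inclusion\n"
    "in standards</i>, 3.5"
)
ref_title = extract_ref_title(original, ref="ISO/IEC Guide 51:1999", clause="3.5")
print(ref_title)
```

Entries where the function returns None are the ones with no title between the identifier and the clause; listing the non-None results would show what the ... actually contains in practice.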
I don't think we're supposed to have document titles inside the SOURCE field.