Closed SJagodzinski closed 2 years ago
Tag Library Text:
Summary: A required child element of <control> that identifies the institution or service responsible for the EAC-CPF instance.
May contain: <agencyCode> (0..1), <agencyName> (1..n), <descriptiveNote> (0..1), <otherAgencyCode> (0..n)
May occur within: <control>
Description and Usage:
Information about the institution or service responsible for the creation, maintenance, and/or dissemination of the EAC-CPF instance. <maintenanceAgency> must include a child <agencyName> to provide the name of the institution or service. It is recommended to include the optional <agencyCode> and/or <otherAgencyCode> children to unambiguously identify the institution or service. Any general information about the institution in relation to the EAC-CPF instance may be given in <descriptiveNote>.
Attributes:
@audience - optional (values limited to: external, internal)
@countryCode - optional
@id - optional
@languageOfElement - optional
@scriptOfElement - optional
Attribute Usage: Use @countryCode to indicate a unique code for the country of the maintenance agency.
Availability: Required, not repeatable
@fordmadox : To avoid an empty mandatory wrapper element in <control>, I suggest making one of the elements <agencyCode> or <agencyName> mandatory by using xs:choice.
Attention: <agencyCode> must not be repeatable, as we use <otherAgencyCode> for that purpose, whereas <agencyName> is repeatable for multilingualism.
Isn't <agencyName> mandatory already? Or is the suggested change mainly about having either <agencyName> or <agencyCode> as mandatory sub-element? Just to clarify so that I can keep track accordingly for EAD.
<agencyName> is currently required in both EAC and EAD3, so <maintenanceAgency> should never be an empty element right now. If the suggestion is to make it possible to supply an <agencyCode> in place of the <agencyName>, however, can you add a new issue for that, @SJagodzinski?
Schema team, schema tests:
I've tested the draft XSD and RNG schemas for EAC-CPF 2.0 with regard to "maintenanceAgency" and can confirm:
- <maintenanceAgency> itself is required in order to have a valid EAC-CPF 2.0 file in both XSD and RNG
- <maintenanceAgency> itself cannot be repeated
- <maintenanceAgency> allows for the optional attributes @audience, @id, @countryCode, @languageOfElement, @scriptOfElement, @valueURI, @vocabularySource, and @vocabularySourceURI
- @countryCode, though I am not sure if this is supposed to be done via Schematron. See the dedicated @countryCode issue for more (https://github.com/SAA-SDT/eac-cpf-schema/issues/159).
- <maintenanceAgency> also allows for the inclusion of attributes from other namespaces - I've tested with some XLink attributes, which validated fine in XSD and RNG
With regard to the sub-elements of <maintenanceAgency>, however, the two schema variants are using different definitions at the moment:
RNG:
- <agencyName> is required and can be repeated
- <agencyCode> is optional and cannot be repeated
- <otherAgencyCode> is optional and can be repeated
- <descriptiveNote> is optional and cannot be repeated
XSD:
- <agencyName> is optional and can be repeated
- <agencyCode> is optional and can be repeated
- <otherAgencyCode> is optional and can be repeated
- <descriptiveNote> is optional and cannot be repeated
The source file for the "control" module (https://github.com/SAA-SDT/eac-cpf-schema/blob/development/source/modules/control.rng) seems to be fine, i.e. following the first definition, so I'm not sure where the bug gets in with regard to the XSD. Maybe something in the build (https://github.com/SAA-SDT/eac-cpf-schema/tree/development/build)? For now, I have created a pull request to fix the XSD in order for it to match the RNG: https://github.com/SAA-SDT/eac-cpf-schema/pull/164
Retested and can now confirm that the occurrence of sub-elements in <maintenanceAgency> is now correct in both XSD and RNG, i.e.
- <agencyName> is required and can be repeated
- <agencyCode> is optional and cannot be repeated
- <otherAgencyCode> is optional and can be repeated
- <descriptiveNote> is optional and cannot be repeated
Schema tests have now been successful in all aspects.
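For readers who want to reproduce the occurrence check outside a schema validator, here is a minimal Python sketch of the cardinalities confirmed in this re-test. It is not part of the EAC-CPF tooling: the function name `cardinality_errors` and the `CARDINALITY` table are made up for illustration, and the sketch only counts children, it does not look at order or content.

```python
import xml.etree.ElementTree as ET

# Occurrence limits as reported in the re-test: (min, max); None means unbounded.
CARDINALITY = {
    "agencyName": (1, None),
    "agencyCode": (0, 1),
    "otherAgencyCode": (0, None),
    "descriptiveNote": (0, 1),
}


def local_name(tag):
    """Strip a '{namespace}' prefix so the check works with or without a namespace."""
    return tag.rsplit("}", 1)[-1]


def cardinality_errors(xml_text):
    """Return a list of human-readable violations for one <maintenanceAgency>."""
    agency = ET.fromstring(xml_text)
    counts = {name: 0 for name in CARDINALITY}
    errors = []
    for child in agency:
        name = local_name(child.tag)
        if name in counts:
            counts[name] += 1
        else:
            errors.append(f"unexpected child <{name}>")
    for name, (low, high) in CARDINALITY.items():
        if counts[name] < low:
            errors.append(f"<{name}> occurs {counts[name]} time(s), minimum is {low}")
        if high is not None and counts[name] > high:
            errors.append(f"<{name}> occurs {counts[name]} time(s), maximum is {high}")
    return errors
```

An empty list means the element satisfies the counts above; these counts reflect the state of the schema at this point in the thread only.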
Pending question for EAC-CPF team regarding the sequence of sub-elements, which currently prescribes the optional <agencyCode> and <otherAgencyCode> to appear before the required <agencyName>.
I think we agreed to follow @fordmadox's proposal to avoid a given order for child elements in EAS.
Cardinality change as a result of the latest change in #88.
I suggest providing a choice having either <agencyName> or <agencyCode> available. If only one of the elements is available, it has to have content.
@fordmadox - this will need redoing in the schema. Please let me know when you've had time to implement the change, and I'll re-test.
@SJagodzinski @kerstarno : if I understand correctly, the current proposal is:
- maintenanceAgency is a required element that cannot repeat.
- it must contain either an agencyCode element OR an agencyName element.
- those two elements should always contain some text when they are present.
Is that right?
As an aside: is there any need to keep an element like "otherAgencyCode", or couldn't we just allow "agencyCode" to repeat? I feel the same way about the recordId/otherRecordId distinction, but I'm especially interested in this right now since we are modeling a choice between a non-repeatable element and a repeatable element, which I think just makes things confusing for users. Why no otherAgencyName element, for instance? (which would be a horrible addition, so I'm not suggesting such an element :smile:)
Yes to the first two aspects, but if I understood @SJagodzinski correctly, the idea would be to only require content if only one of the elements is available, i.e. the following would be valid without requiring content in <agencyCode>:
<maintenanceAgency>
  <agencyCode/>
  <agencyName>Archives of Wonderland</agencyName>
</maintenanceAgency>
while one would need to have content in <agencyCode> if it were used by itself, i.e.
<maintenanceAgency>
  <agencyCode>DE-1958</agencyCode>
</maintenanceAgency>
As for the aside:
For <agencyCode>, it is currently recommended that this follows the standard defined in @repositoryEncoding, ideally ISO 15511. This is to ensure that an EAS instance ideally includes a globally unique identifier of the institution responsible for its creation. <otherAgencyCode> can have any format, just as <agencyName> can have any format, which is why these are repeatable, even if it might only be a minority of cases that actually make use of repeating the <agencyName> or adding an <otherAgencyCode>. Though for the latter, I know of various cases in the context of Archives Portal Europe where it is a welcome option to also provide some national identifiers for the institutions along with the ISO 15511 one.
Such a construction can work in RNG, but I don't believe there is any way to do that in XSD. See: https://www.w3.org/TR/xmlschema-1/#cos-element-consistent
Also, I'd say that's a good rule in general, despite the flexibility of RNG. In other words, we should not define an element with the same name and give it a different content model (e.g. in one instance it requires text, and in another it does not).
Not sure what you mean: where would we have an element with the same name used with different content models in different contexts?
In your example above, if you're saying that "agencyCode" can be empty in the first example, but that it must have a text node in the second example, then I believe that would violate the "Element Declarations Consistent" constraint in XSD.
Ah, ok. Now I see.
Well, this was mainly my interpretation of @SJagodzinski's suggestion. Personally, I'd be fine with saying: "whenever you use either <agencyCode> or <agencyName>, they cannot be empty, and you have to at least use one of them."
I.e.
<maintenanceAgency>
  <agencyCode>WL-111</agencyCode>
  <agencyName>Archives of Wonderland</agencyName>
</maintenanceAgency>
or
<maintenanceAgency>
  <agencyCode>WL-111</agencyCode>
</maintenanceAgency>
or
<maintenanceAgency>
  <agencyName>Archives of Wonderland</agencyName>
</maintenanceAgency>
plus, if we would go with the <part> approach for <agencyName>,
<maintenanceAgency>
  <agencyName>-</agencyName>
</maintenanceAgency>
I think that's the best we can do with the XSD, if we want to enforce that one is present and non-empty. We could instead have a rule in the Schematron, but it would amount to the same thing, I suspect.
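Anyhow, for testing, @SJagodzinski and @kerstarno, I've updated the base schemas so that:
- maintenanceAgency is required again;
- since it is required and not optional, the order has been moved up (and it occurs before maintenanceHistory, just like in the current EAC);
- the maintenanceAgency element _must_ contain at least one agencyName OR one agencyCode element;
- agencyName and agencyCode are now defined as non-empty elements.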
If this approach is agreeable, then I can update the conversion process and re-generate the EAC 2 sample files: https://github.com/SAA-SDT/eac1-to-eac2-conversion/tree/main/sample-files/output (these files are all invalid now, but very easy to re-generate them if this approach works)
If this approach is not agreeable, just let me know what we should explore next. But we cannot allow the same element to be empty in some instances but be forced to have a text node in another instance, at least not in the XSD schema, although we could add such a rule to the Schematron if that approach is preferred. That said, I think that such a variable approach would be more confusing than the other options.
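To make the difference between the two rules in this exchange concrete, here is a small Python sketch. It is an illustration only, not the schema or Schematron code: `strict_rule` and `lenient_rule` are hypothetical names. The first models "at least one of agencyCode/agencyName, and neither may be empty when present" (what the XSD can express); the second models "at least one of them has content, the other may be empty" (the variable approach that would need something like a Schematron rule).

```python
import xml.etree.ElementTree as ET

KEY_CHILDREN = ("agencyCode", "agencyName")


def _key_children(xml_text):
    """Return the agencyCode/agencyName children of one <maintenanceAgency>."""
    agency = ET.fromstring(xml_text)
    return [c for c in agency if c.tag.rsplit("}", 1)[-1] in KEY_CHILDREN]


def _has_content(element):
    """True if the element holds at least one non-whitespace character."""
    return bool("".join(element.itertext()).strip())


def strict_rule(xml_text):
    """At least one key child is present, and every key child present is non-empty."""
    children = _key_children(xml_text)
    return bool(children) and all(_has_content(c) for c in children)


def lenient_rule(xml_text):
    """At least one key child has content; any other key child may be empty."""
    return any(_has_content(c) for c in _key_children(xml_text))
```

Under these assumptions, the sample with an empty `<agencyCode/>` next to a filled `<agencyName>` passes `lenient_rule` but fails `strict_rule`, which is exactly the case the two proposals disagree on.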
Re-tested:
- <maintenanceAgency> is required again
- <maintenanceAgency> has indeed been moved up and is now expected before <maintenanceHistory>
- <maintenanceAgency> requires at least either <agencyCode> or <agencyName> to be present
- <agencyCode> and <agencyName> cannot be empty and require at least one non-whitespace character
- <agencyCode> always has to be the first sub-element of <maintenanceAgency>
- For <agencyName>, however, the sequence has gotten more flexible again:
  - Without <agencyCode>, one <agencyName> has to come first, i.e. before <otherAgencyCode> and <descriptiveNote>; any repeated <agencyName>-s can then also appear after <otherAgencyCode>, though
  - With <agencyCode>, <agencyName> has to come after <agencyCode>, but it can again be used before AND after <otherAgencyCode> at the same time, i.e. the following would be possible:
<maintenanceAgency>
  <agencyCode>WL-111</agencyCode>
  <agencyName>Archives of Wonderland</agencyName>
  <otherAgencyCode status="authorized">111</otherAgencyCode>
  <otherAgencyCode>AOW</otherAgencyCode>
  <agencyName>Archives du pays des merveilles</agencyName>
  <agencyName>Archiv des Wunderlandes</agencyName>
  <descriptiveNote>
    <p>[Some further text...]</p>
  </descriptiveNote>
</maintenanceAgency>
If that's ok, the test against the updated schema is successful.
The test results above apply to both schemas, RNG and XSD.
(@fordmadox and @SJagodzinski I'll update #87 and #88 accordingly, once the above is confirmed.)
Regarding that last point: we can easily enforce the order so that agencyName and otherAgencyCode cannot be mixed. So, just to be clear, that choice is not made for us by the XSD-serialization process. I added it that way since that is how it was defined in the previous draft branch approach (and since that type of flexibility is still possible in the XSD), but it's just as easy to have a strict order for all 4 possible child elements of maintenanceAgency if that is desired. I'll make a new push now for that option.
Just let me know which of the following two options is preferred, and I will keep whichever commit that is in the base directory.
- Allow agencyName and otherAgencyCode to be interleaved. Example: https://github.com/SAA-SDT/eac-cpf-schema/tree/9e8c636632dac7d8016ba784fa4bf3b8e4bf9c5a/xml-schemas/eac-cpf
- Require a strict order for all child elements of maintenanceAgency. Example: https://github.com/SAA-SDT/eac-cpf-schema/tree/006f02026763bce62fbf1b50f12d273271cf58a3/xml-schemas/eac-cpf
@fordmadox
[...] I've updated the base schemas so that:
* maintenanceAgency is required again;
Right
* since it is required and not optional, the order has been moved up (and it occurs before maintenanceHistory, just like in the current EAC);
Right
* the maintenanceAgency element _must_ contain at least one agencyName OR one agencyCode element.
Right
* agencyName and agencyCode are now defined as non-empty elements.
Well... I don't think this approach is user-friendly, but given the XSD limitation, of course I agree. My idea was indeed to force content only if there is just one of the elements. E.g. it would be fine to have an agency code but an empty agency name.
Here's a third approach to test: https://github.com/SAA-SDT/eac-cpf-schema/tree/93f6ed6401c9fa85fa8fed5bb440cec47a224761/xml-schemas/eac-cpf. (update: the link I originally posted here, https://github.com/SAA-SDT/eac-cpf-schema/tree/006f02026763bce62fbf1b50f12d273271cf58a3/xml-schemas/eac-cpf, was the wrong one; when I copied it, my clipboard hadn't updated!)
Both agencyCode (1..1) and agencyName (1..n) are required, but like any XML element required by default, they can be empty (so, no problems with the migration process). That way, a template will include both elements by default, and users won't have to go through the process of choosing one or the other just to get a valid file (although that process is still required for multipleIdentities vs. cpfDescription). Then, we can add something to the Schematron that will check to make sure that at least one of those elements has some content.
Apologies, but I'm setting this back to "Schema" as we currently have three different versions of the schema and need to confirm if they work as intended and which will be the final one:
- <agencyCode> or <agencyName> to be present, but require both to have at least one non-whitespace character
- <agencyCode> has to appear before <agencyName>
- <agencyName> and <otherAgencyCode> to be interleaved
- <agencyCode> AND <agencyName> must both be present, but can be left empty
(@fordmadox - I've assumed that option 3 is what's currently in the development branch, as the link posted in the previous comment is the same link as for option 2, plus the last update for the development branch seems to match). Also, the above only applies to the RNG schema of option 3. The XSD schema of option 3 is exactly as option 2. (updated following the update in the previous comment)
Not 100% sure, but I'm assuming that option 3 (with the addition of a Schematron rule to check that at least one of <agencyCode> or <agencyName> has content) might be closest to what @SJagodzinski would like to see.
Should that eventually be the chosen option, I just would like to point out that - instead of either making a required, but empty element optional or making its requirement clearer by making content mandatory (as we now have in <recordId> and <part>) - we would actually be adding yet another required element that can be left empty.
For option 4, how about this:
There are, of course, many more options, but we should settle on one before/by tomorrow so that we can ready everything in time for the Call for Comments.
Option 4 can be tested here: https://github.com/SAA-SDT/eac-cpf-schema/tree/0faf1b2a8e2019e21d0c1d82ba7e3cb5ffdbd3e8/xml-schemas/eac-cpf. (note the commit hash in the URI, but it can also be tested via the standard 'development' branch URL currently). The only thing that cannot be tested here is the Schematron part, but outside of that, I believe this option should align with what @SJagodzinski described previously.
All that said, I think that it would be easiest if we continued to say that agencyName was required (as in EAC 1), and allowed it to be empty. We could then add a Schematron rule to ensure that either an agencyName or an agencyCode was available that had text. That way, we could still make agencyName first in the list, since it would always be required (even if empty), and agencyCode would be optional.
If only we would have thought of the Schematron option during the last Schema Team meeting when we started this conversation about whether it made sense to require elements which then can be left empty and before posting the Schema Team's suggestion here on GitHub. ;-)
With all these Schematron checks, just to confirm: will it still be possible to use the EAC-CPF 2.0 schema without Schematron?
Also, would we reconsider the decision to require content via the Schema for <recordId> and <part> and do these via Schematron as well?
Using the XSD Schema without the Schematron is no problem, but all validation handled in the Schematron won't be run.
Moving this check to be only in Schematron might then mean it won't be checked if you haven't implemented the use of Schematron, since with the XSD you need to add that; in the RNG it is enforced in the schema itself.
Re-tested option 4 (https://github.com/SAA-SDT/eac-cpf-schema/tree/0faf1b2a8e2019e21d0c1d82ba7e3cb5ffdbd3e8/xml-schemas/eac-cpf) and can confirm:
- <maintenanceAgency> is required and is expected before <maintenanceHistory>
- <maintenanceAgency> requires either <agencyCode> or <agencyName>
- <agencyCode> and <agencyName> can be left empty, i.e. <maintenanceAgency><agencyCode/></maintenanceAgency> or <maintenanceAgency><agencyName/></maintenanceAgency> is allowed
- the order of sub-elements in <maintenanceAgency> is <agencyCode>, <agencyName>, <otherAgencyCode>, <descriptiveNote>
The above applies to both schemas, RNG and XSD.
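As a rough way to picture the option 4 content model outside the schemas, the following Python sketch encodes one possible reading of the re-test results as a regular expression over the child element names. Several points are assumptions rather than facts from the schema files: that the order is strict, that `<agencyCode>` and `<descriptiveNote>` stay non-repeatable, and that `<agencyName>` and `<otherAgencyCode>` stay repeatable. The name `matches_option_4` is made up for this example, and content (empty or not) is deliberately ignored, since option 4 leaves that to Schematron.

```python
import re
import xml.etree.ElementTree as ET

# One letter per child element, so the content model can be written as a regex.
LETTER = {
    "agencyCode": "C",
    "agencyName": "N",
    "otherAgencyCode": "O",
    "descriptiveNote": "D",
}

# Assumed model: optional agencyCode, repeatable agencyName, repeatable
# otherAgencyCode, optional descriptiveNote, in exactly that order.
ORDER = re.compile(r"C?N*O*D?")


def matches_option_4(xml_text):
    """Check child order and the 'agencyCode or agencyName' requirement only."""
    agency = ET.fromstring(xml_text)
    names = [child.tag.rsplit("}", 1)[-1] for child in agency]
    if any(name not in LETTER for name in names):
        return False
    sequence = "".join(LETTER[name] for name in names)
    has_code_or_name = "C" in sequence or "N" in sequence
    return has_code_or_name and ORDER.fullmatch(sequence) is not None
```

With this reading, `<maintenanceAgency><agencyCode/></maintenanceAgency>` is accepted, while a sample that interleaves `<agencyName>` and `<otherAgencyCode>` (as allowed in the earlier, more flexible draft) is rejected.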
Thanks, Karin. I was mainly wondering about the requirements with regard to elements or attributes not being empty. These checks, at least the ones that we are talking about in the currently remaining issues, are defined neither in the XSD nor in the RNG schema per se. If that's ok, that's ok. Just making sure.
In both XSD and RNG it is possible to set rules making sure the element, when used, is not empty. This can also be done in the Schematron.
I was mainly referring to what is or is not present in the most recent version of the development schemas, not the general possibilities :-)
@SJagodzinski we still have four possible solutions in the schema for this one at the moment. Could you please confirm whether option 4 (https://github.com/SAA-SDT/eac-cpf-schema/issues/86#issuecomment-767611723) is the preferred way to go? Thanks very much in advance.
Option 4 has been confirmed during the EAC Team meeting on 5 February.
Thereby, this element is ready schema-wise.
Maintenance Agency
@audience
@countryCode
@languageOfElement
@scriptOfElement
@valueURI
@vocabularySource
@vocabularySourceURI
Creator of issue
Related issues / documents
<conventionDeclaration>: add sub-elements #67
EAD3 Reconciliation
Additional EAD 3 attributes
@altrender - Optional
@audience - Optional (values limited to: external, internal)
@countrycode - Optional
@encodinganalog - Optional
@lang - Optional
@script - Optional
Context
The institution or service responsible for the creation, maintenance, and/or dissemination of the EAC-CPF instance.
May contain: <agencyCode>, <agencyName>, <descriptiveNote>, <otherAgencyCode>
May occur within: <control>
Attributes: @xml:id - Optional
Availability: Mandatory, Non-repeatable
Solution documentation
Rephrasing Summary, Description and Usage and Attribute usage needed?
May contain: <agencyCode>, <agencyName>, <descriptiveNote>, <otherAgencyCode>
May occur within: <control>
Attributes:
@audience - optional (values limited to: external, internal)
@countryCode - optional
@id - optional
@languageOfElement - optional
@scriptOfElement - optional
@valueURI - optional
@vocabularySource - optional
@vocabularySourceURI - optional
Availability: Required, not repeatable
Example encoding