SAA-SDT / eac-cpf-schema

https://eac.staatsbibliothek-berlin.de/

@xlink:href #247

Closed SJagodzinski closed 2 years ago

SJagodzinski commented 3 years ago

Xlink: HREF

Agreed in EAC-CPF and in EAD team to remove this attribute to simplify the schema, cf Berlin meeting:

Use @linkrole, @linktitle, and @href, from former XLink attributes and drop @actuate, @arcrole, @show.

Creator of issue

  1. Silke Jagodzinski
  2. TS-EAS: EAC-CPF subgroup
  3. silkejagodzinski@gmail.com

Related issues / documents

EAD3 Reconciliation

hypertext Reference

Summary: The locator for a remote resource in a link. When linking to an external file, href takes the form of a Uniform Resource Identifier (URI). If the value is not in the form of a URI, the locator is assumed to be within the document that contains the linking element.

Data Type: token

Context

Summary: Contains a URI, possibly relative, pointing to the related resource.

Description and Usage: The address for a remote resource. The xlink:href takes the form of a Uniform Resource Identifier (URI). While it is permissible to use a relative URI, an absolute URI is recommended.

Data Type: anyURI
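To illustrate the relative versus absolute distinction drawn in the descriptions above, here is a minimal Python sketch. The function name `classify_href` and the sample values are hypothetical and not part of either tag library; the sketch only checks for a URI scheme and does not perform schema validation.

```python
from urllib.parse import urlsplit


def classify_href(value: str) -> str:
    """Return 'absolute', 'relative', or 'empty' for an href value.

    Illustrative assumption: a value counts as absolute when it carries
    a URI scheme (for example "https" or "urn"); everything else is
    treated as a relative reference.
    """
    value = value.strip()
    if not value:
        return "empty"
    parts = urlsplit(value)
    return "absolute" if parts.scheme else "relative"


if __name__ == "__main__":
    # Hypothetical sample values, not taken from any real finding aid.
    for sample in ("https://example.org/record/1", "../images/portrait.jpg", "#section2"):
        print(sample, "->", classify_href(sample))
```

A check like this could flag relative values for review where the recommendation to prefer absolute URIs is being followed.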

Solution documentation:

Rephrasing Summary and Description and Usage needed

May occur within: <contactLine>, <reference>, <representation>, <source>

Data Type: anyURI

ailie-s commented 3 years ago

What is the data type for @href? EAD3 uses token and EAC-CPF uses anyURI

kerstarno commented 3 years ago

Had a quick look through the EAD3 repository to see if there was anything that provided background to the use of token for @href in EAD3, but couldn't find anything.

The description of this issue here says: "change data type from anyURI to token", while the solution documentation says "data type: anyURI".

Can't remember if we specifically spoke about this, but it might be that in this case actually sticking with anyURI (and changing this accordingly for EAD during the revision) could make more sense.

SJagodzinski commented 3 years ago

As we adopted @href from EAD, we are going to use token.

Might be changed during the Call for Comments period, if necessary.

fordmadox commented 3 years ago

@SJagodzinski I would prefer to keep those defined as "anyURI" instead, and not follow what EAD3 did.

This keeps us in line with other standards like TEI, XHTML, etc.

fordmadox commented 3 years ago

@tcatapano: Do you recall why EAD3 went with "token" when defining @href? (e.g. https://github.com/SAA-SDT/EAD3/blob/814ff500b9e866962a98e9d68f876dd9ad2c9988/redesign/ead_revised_defs.rng#L2733-L2737).

SJagodzinski commented 3 years ago

@fordmadox: Due to alignment with EAD3. I assume that the EAD revision team had a reason for choosing this data type.

We can discuss these details during the Call for Comments period, but to finalise the draft for the Call for Comments, I am trying to freeze the current status. I don't think there is time or need to discuss the attribute data types at this moment.

fordmadox commented 3 years ago

Let's keep the @href datatype as anyURI for the time being, then. I've no clue why its datatype should differ from that of the other attributes, like valueURI, etc. (I have already changed localType to 'token', though, as requested, which aligns things with EAD3 and, I expect, should not impact the migration process.)

ailie-s commented 3 years ago

Tag Library Text:

Summary: The address for a remote resource. @href takes the form of a Uniform Resource Identifier (URI). Data type: anyURI

kerstarno commented 3 years ago

Tested as part of Schema Team's schema testing:

The above applies to both schemas, RNG and XSD.

The attribute is ready.

tcatapano commented 3 years ago

@tcatapano: Do you recall why EAD3 went with "token" when defining @href? (e.g. https://github.com/SAA-SDT/EAD3/blob/814ff500b9e866962a98e9d68f876dd9ad2c9988/redesign/ead_revised_defs.rng#L2733-L2737).

@fordmadox: IIRC -- and @rockivist could confirm or refute -- the rationale was for the general schema to err on the side of permissiveness while allowing stricter validation through subsetting. In the case of @href I think there was some concern that not all URLs that actually exist in the wild will pass anyURI validation (especially if copied and pasted without subsequent URL encoding). For example, one comes across DOI URLs overloaded with metadata and offending characters from time to time. See: https://doi.org/10.3161/1733-5329(2007)9[161:LTDSDO]2.0.CO;2 (see also other articles for this journal: https://bioone.org/journals/acta-chiropterologica/volume-9/issue-1)

I did run a test against 79K+ @href values from <dao>s extracted from the ArchiveGrid corpus circa 2014 and none failed XSD validation as anyURI. So whatever the rationale, in practice it seems like using the anyURI datatype is very unlikely to cause many problems.

Hope this helps.
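The concern about "offending characters" can be sketched in Python as follows. This is an illustration only: it applies a character-level reading of RFC 3986 (square brackets are flagged because that RFC reserves them for IP-literal hosts, which this sketch ignores), and it is not the XSD anyURI check, which the test described above suggests is more lenient in practice. The function names are hypothetical.

```python
import re
from urllib.parse import quote

# Characters this sketch accepts in an href: RFC 3986 unreserved characters,
# sub-delimiters, the delimiters ":", "@", "/", "?", "#", and "%" for
# existing percent-escapes. Square brackets are deliberately left out.
_ALLOWED = re.compile(r"[A-Za-z0-9\-._~!$&'()*+,;=:@/?#%]")
_SAFE = "-._~!$&'()*+,;=:@/?#%"


def offending_characters(href: str) -> list[str]:
    """Return the sorted, de-duplicated characters outside the accepted set."""
    return sorted({ch for ch in href if not _ALLOWED.fullmatch(ch)})


def percent_encode_offending(href: str) -> str:
    """Percent-encode only the characters outside the accepted set."""
    return quote(href, safe=_SAFE)


if __name__ == "__main__":
    doi = "https://doi.org/10.3161/1733-5329(2007)9[161:LTDSDO]2.0.CO;2"
    print(offending_characters(doi))
    print(percent_encode_offending(doi))
```

Run against the DOI link cited above, the sketch flags only the two square brackets and leaves the rest of the value untouched when encoding.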

rockivist commented 3 years ago

@tcatapano When @fordmadox asked me about this I couldn't recall the details, but now that you say it, I'm certain that your recollection of the justification for using token rather than anyURI for href in EAD3 is correct.

FWIW I think changing to anyURI would be fine.