SAA-SDT / EAD3

https://www.loc.gov/ead/index.html
Creative Commons Zero v1.0 Universal

Change <parallelphysdescset> to <physdescset> #450

Closed rockivist closed 9 years ago

rockivist commented 10 years ago

Following up on the recent discussion of changes to <parallelphysdescset> on the TS-EAD list, I propose the following changes (chapeau to @fordmadox, @kerstarno, @MicheleCombs, et al):

1) Rename <parallelphysdescset> as <physdescset>

A more generic approach to creating sets of physdesc statements will alleviate my concerns about carefully denoting which statements are parallel to which. And I don't think we lose much in terms of expressiveness. In practice most sets will be parallel statements. This is analogous to <daoset> in which we declined to make any semantic assertions beyond "these DAOs are a related set". Going back to DACS 2.5.7, I think I was wrong to fixate on the parallel concept in multiple statements of extent. It's the multiple part that we most need to accommodate. The simplest way to achieve that is with a generic <physdescset>.

2) Add @coverage as an optional attribute on <physdescset>

As suggested by Kerstin, I think this will be useful when a physdescset combines multiple part statements. Having @coverage on physdescset will allow one to make it clear if multiple parts add up to the whole of the unit being described or not.

3) Add @localtype as an optional attribute on <physdescset>

Since <physdescset> is a more generic grouping mechanism than <parallelphysdescset>, I think it would be correct to add @localtype so that users can specify the type of set. <physdescset localtype="parallel"> for instance.

4) Allow <physdescset> to contain as children two or more of <physdescstructured> or <physdesc>.

If we adopt a more generic <physdescset>, it will make sense to have that set accept as children both our structured and our unstructured physdesc elements.

MicheleCombs commented 10 years ago

Clarification on point 4: Can contain physdescstructured OR physdesc, but not both?

tcatapano commented 10 years ago

Proposed changes implemented in branch: https://github.com/SAA-SDT/EAD-Revision/tree/parallelphysdescset

rockivist commented 10 years ago

@MicheleCombs Both would be fine. Two or more of (physdescstructured or physdesc).
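The content model agreed here (two or more children, each either <physdescstructured> or <physdesc>) can be sketched as a small check. This is only an illustration of the proposal as worded in this thread, not the schema itself: the function name `check_physdescset` and the sample values are hypothetical, and the XML namespace that real EAD3 files would carry is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Children the proposal allows inside <physdescset> (point 4 above).
ALLOWED_CHILDREN = {"physdescstructured", "physdesc"}


def check_physdescset(elem):
    """Return a list of problems for one <physdescset> under the proposed model."""
    problems = []
    children = list(elem)
    if len(children) < 2:
        problems.append("needs two or more children")
    for child in children:
        if child.tag not in ALLOWED_CHILDREN:
            problems.append(f"unexpected child <{child.tag}>")
    return problems


# Hypothetical sample mixing a structured and an unstructured statement.
sample = ET.fromstring("""
<physdescset>
  <physdescstructured coverage="whole" physdescstructuredtype="carrier">
    <quantity>3</quantity>
    <unittype>boxes</unittype>
  </physdescstructured>
  <physdesc>Includes photographs and audiocassettes</physdesc>
</physdescset>
""")

print(check_physdescset(sample))
```

An empty list means the set satisfies the "two or more of (physdescstructured or physdesc)" rule; a set with a single child would be reported as too short.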

kerstarno-zz commented 10 years ago

Checking with Terry's implementation in branch, point 4 would however still allow for having physdesc and/or physdescstructured as direct subelements of did, right? And I - theoretically - could also have these repeated without grouping them in physdescset, although that surely would be recommended via the Tag Library, correct?

ruthtillman commented 10 years ago

Yes, the code allows for them directly in did.

rockivist commented 10 years ago

@kerstarno Yes, both <physdescstructured> and <physdesc> would remain as optional and repeatable children of <did>. The changes I outlined would only modify <parallelphysdescset>.

kerstarno-zz commented 10 years ago

@rockivist and @ruthtillman - thanks for the confirmation.

tcatapano commented 10 years ago

I've attempted to model the example(s) given in DACS 2.5.8 in the file samples/physdescset

https://github.com/SAA-SDT/EAD-Revision/blob/parallelphysdescset/samples/physdescset.xml

I could easily be misunderstanding things -- especially what exactly "parallel" and "coverage" refer to -- but it looks to me that physdescset is unnecessary to express parallelism in many cases (e.g. when each physdescstruct in a did has coverage="whole") and that in some cases (when there are multiple physdescsets in a did) it is necessary to nest physdescset in order to express that one or more statements in a parallel physdescset are together parallel to another statement.

tcatapano commented 10 years ago

sorry, the examples are from DACS 2.5.7

kshepher commented 10 years ago

I feel like I am way down in the weeds now, but is it not technologically possible to prohibit multiple repeated physdesc and/or physdescstructured elements directly within the did? We can tell people not to in the Tag Library all we want, but if it's possible it will happen. One of our goals had been to stop having multiple ways for people to do the same thing, was it not?

I'm sure there's a good reason that this needs to be this way and I just need to be reminded of it...

rockivist commented 10 years ago

@tcatapano Thanks for encoding all of the DACS extents. You are right that when @coverage="whole" the set is unnecessary to convey parallelism. It's only necessary when @coverage="part".
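One possible reading of this rule, sketched in code: a set only adds information about parallelism when at least one of its structured children has coverage="part". The helper name `set_needed_for_parallelism` and the sample quantities are assumptions made for illustration, and namespaces are again left out.

```python
import xml.etree.ElementTree as ET


def set_needed_for_parallelism(physdescset):
    """True when at least one direct <physdescstructured> child is a part statement.

    Statements with coverage="whole" all describe the entire unit, so they are
    parallel to each other with or without a wrapping set.
    """
    return any(
        child.get("coverage") == "part"
        for child in physdescset.findall("physdescstructured")
    )


# Hypothetical sets: the first describes the whole unit twice,
# the second describes the same part of the unit in two ways.
whole_set = ET.fromstring("""
<physdescset>
  <physdescstructured coverage="whole" physdescstructuredtype="spaceoccupied">
    <quantity>12</quantity><unittype>linear feet</unittype>
  </physdescstructured>
  <physdescstructured coverage="whole" physdescstructuredtype="carrier">
    <quantity>36</quantity><unittype>boxes</unittype>
  </physdescstructured>
</physdescset>
""")

part_set = ET.fromstring("""
<physdescset>
  <physdescstructured coverage="part" physdescstructuredtype="spaceoccupied">
    <quantity>2</quantity><unittype>linear feet</unittype>
  </physdescstructured>
  <physdescstructured coverage="part" physdescstructuredtype="carrier">
    <quantity>6</quantity><unittype>boxes</unittype>
  </physdescstructured>
</physdescset>
""")

print(set_needed_for_parallelism(whole_set))
print(set_needed_for_parallelism(part_set))
```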

I'm reluctant to allow <physdescset> to recurse. A layer too far in terms of complexity, and only necessary when two or more part statements are parallel to one or more part statements (as on lines 159-173 of your example). So we can come up with a hypothetical reason why it might be necessary - that doesn't justify the added complexity especially given the relative obscurity of the application. That's why I agreed with the suggestion to drop the "parallel" from "parallelphysdescset". A generic non-recursing set option can cover most parallel applications, and doesn't presume that a parallel set is the only reason why you would want a set.

So: do we need a physdescset? Yes, I think so, at least for parallel part statements, but possibly for other applications I won't presume to predict. Do we need to allow it to recurse? I think that's adding one layer too many, in spite of the obscure use case we can come up with.
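If the set is kept non-recursive, a file that nests one <physdescset> directly inside another would fall outside the model. A rough sketch of how such nesting could be spotted follows; the function name `nested_physdescsets` is hypothetical, the samples are invented, and real EAD3 namespaces are not handled.

```python
import xml.etree.ElementTree as ET


def nested_physdescsets(root):
    """Return every <physdescset> that is a direct child of another <physdescset>."""
    return [
        inner
        for outer in root.iter("physdescset")
        for inner in outer.findall("physdescset")
    ]


# A flat set, which the non-recursive model would accept.
flat = ET.fromstring("""
<did>
  <physdescset>
    <physdesc>first statement</physdesc>
    <physdesc>second statement</physdesc>
  </physdescset>
</did>
""")

# A nested set, which the non-recursive model would reject.
nested = ET.fromstring("""
<did>
  <physdescset>
    <physdesc>first statement</physdesc>
    <physdescset>
      <physdesc>second statement</physdesc>
      <physdesc>third statement</physdesc>
    </physdescset>
  </physdescset>
</did>
""")

print(len(nested_physdescsets(flat)), len(nested_physdescsets(nested)))
```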

rockivist commented 10 years ago

@kshepher I'm not sure what part of the thread you are responding to, but there are a few reasons why physdescstructured, physdesc, and physdescset must be repeatable. First is our requirement that elements be repeatable to support multilingual description. Second, the model for physdescstructured accommodates multiple types of physdesc statements - spaceoccupied, carrier, or materialtype - so we need to allow more than one.

kshepher commented 10 years ago

@rockivist, I'm responding to the point @kerstarno made:

"Checking with Terry's implementation in branch, point 4 would however still allow for having physdesc and/or physdescstructured as direct subelements of did, right? And I - theoretically - could also have these repeated without grouping them in physdescset, although that surely would be recommended via the Tag Library, correct?"

My concern was being able to repeat physdescstructured and physdesc either within the did or within physdescset. Your point about multilingual finding aids explains why it's necessary to make them repeatable in the did, even if it does open up multiple ways of doing the same thing. I will make sure the Tag Library reflects the rationale.

kriskiesling commented 10 years ago

A couple of the examples trouble me. I think combining <physdescstructured> and <physdesc> as in lines 48-52 muddies the distinction between using one physical description element or the other and will add yet another layer of complexity on the stylesheet to render <physdesc> as a stand-alone physical description as well as accommodating it as an add-on to <physdescstructured>. I would much rather see (and I can't quite believe I'm saying this) a <physdescstructurednote> (or something similar) added to <physdescstructured> to deal with this type of physical description statement.

Also, the examples on lines 117-125 and in 194-204. Aren't computer files a material type? Given that we all will be or already are describing computer files, I'd hate to see them consigned to an 'otherphysdescstructuredtype' attribute and a 'generic-unit', which is meaningless. It seems to me the relationship is the same as boxes to linear feet.

I think we get into trouble with the <physdescset> examples. The diary one particularly troubles me (lines 185-193). Is the diary a 'carrier' or a 'materialtype'? Neither the diary nor the pages are a 'part'. The diary is the whole and the pages are also the whole. This is an area where decades of bad practice and DACS let us down. A far clearer statement would be "1 diary consisting of 352 pages." A <physdesc> with such a straightforward statement would be a far superior way of encoding this example and making it intelligible to users. Why do we need to parse this out with lots of encoding?

Will there be other possible values for @localtype besides 'parallel'? Or does one simply not use the attribute if a parallel statement is not being made? If we're going to have parallel statements of extent, there needs to be a clear whole/part relationship.

MicheleCombs commented 10 years ago

@kriskiesling None of the element names came through in your post so I'm not sure I get all of what you're saying, but I totally agree with this: "such a straightforward statement would be a far superior way of encoding this example and making it intelligible to users. Why do we need to parse this out with lots of encoding?"

tcatapano commented 10 years ago

The examples are mainly to demonstrate the encoding structures; please don't get too sidetracked by the physdescstructuredtype attribute values. I was trying my best to supply values so that the samples would validate. These would be important to get right in published examples, and it is certainly an issue that variant interpretations and usage will impede machine processing.

With regard to the use of both physdescstruct and physdesc in the same <did>, this markup can be interpreted as two statements about the parent component:

it is in three boxes

it includes photographs and audiocassettes

and the statement "it is in three boxes" can be parsed out so that it also enables arithmetic operations, for example counting the total number of boxes (i.e., c's with child physdescstruct elements whose unittype is 'boxes' or 'box') in all described components, or simply entry into database fields named, say, "count" and "unit type". And that's pretty much it. If anyone would like the markup to convey more or less information, please make a feature request.
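The box-counting idea can be sketched roughly as follows. This is a minimal illustration rather than anything from the branch: the function name `count_boxes`, the sample components, and their titles are invented, the full element name physdescstructured is used, and namespaces are ignored.

```python
import xml.etree.ElementTree as ET

# Unit type spellings treated as "boxes" for the purpose of the count.
BOX_UNITS = {"box", "boxes"}


def count_boxes(root):
    """Sum <quantity> over every <physdescstructured> whose <unittype> is a box."""
    total = 0
    for statement in root.iter("physdescstructured"):
        unittype = (statement.findtext("unittype") or "").strip().lower()
        quantity = (statement.findtext("quantity") or "").strip()
        # Skip statements that are not boxes or have a non-integer quantity.
        if unittype in BOX_UNITS and quantity.isdigit():
            total += int(quantity)
    return total


# Hypothetical components; the unstructured <physdesc> is simply ignored.
sample = ET.fromstring("""
<dsc>
  <c>
    <did>
      <unittitle>Correspondence</unittitle>
      <physdescstructured coverage="whole" physdescstructuredtype="carrier">
        <quantity>3</quantity><unittype>boxes</unittype>
      </physdescstructured>
      <physdesc>Includes photographs and audiocassettes</physdesc>
    </did>
  </c>
  <c>
    <did>
      <unittitle>Diaries</unittitle>
      <physdescstructured coverage="whole" physdescstructuredtype="carrier">
        <quantity>1</quantity><unittype>box</unittype>
      </physdescstructured>
    </did>
  </c>
</dsc>
""")

print(count_boxes(sample))  # 4
```

The same two values, quantity and unit type, are what would be copied into "count" and "unit type" database fields.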

The example of the diary statement inside of the physdescset points out a problem I had as well. Does the coverage apply to the thing being described in the current physdescstructset or to the current component? In this case the markup does seem wrong, and we should probably clarify in the documentation that the scope of the coverage attribute is the current parent unit.

Finally, I also agree that localtype is too broad. I would recommend simply a parallel= yes | no attribute.

kriskiesling commented 10 years ago

Hi Michelle,

Sorry about the elements. I was advocating a simple statement in <physdesc> for the diary example, i.e. 1 diary consisting of 352 pages, which is not, of course, what the DACS example says.

K

tcatapano commented 10 years ago

@rockivist : happy to not permit the recursion of <physdescset> if the use case is deemed outré

tcatapano commented 10 years ago

@rockivist: can I push to develop and close the issue?

tcatapano commented 10 years ago

I've added a file demonstrating possible encodings of DACS 2.5.6-8 using the current version of the schema in the develop branch.

https://github.com/SAA-SDT/EAD-Revision/blob/parallelphysdescset/samples/parallelphysdescset.xml

I'm not convinced that there is a need to add anything to the current schema to resolve ambiguities of relationships between <physdescstructured> and <parallelphysdescset>. My thinking is tediously detailed at:

https://www.evernote.com/shard/s211/sh/5f259a52-cbee-49b2-bf43-ccef08a0facd/96a045be14e92795abc85b6e7ffacd81

rockivist commented 10 years ago

There is more to say later, but first a few clarifications:

@kriskiesling <physdescstructured> does have an optional <descriptivenote>, so there would be no need for a <physdescstructurednote> - it's already in the schema. We should be sure to account for that in our Tag Library examples.

Also: in all cases @localtype will be undefined and have no enumerated values.

More soon.

rockivist commented 9 years ago

Superseded by #454. Closing