Closed by seanaery 5 years ago
@seanaery are we tracking this concept of the top level not being a "collection" in any other tickets?
While I agree it's not in line with the EAD specification, a lot of the ArcLight design depends on this construct. So I'm curious whether we have done any design work to indicate what other changes would be necessary to accommodate this.
@mejackreed Great question. Besides this ticket we do have a couple of others logged now: #776 & #778, but there might be many more lurking. I'm getting a sense that there's a lot of logic built into ArcLight right now that assumes the top level is a "collection", and pulling on the thread of that assumption might unravel quite a bit or introduce some major new complexities.
I'd like to explore this indexing approach and see what happens, since it could fix all these issues with a minimal amount of work:
Add "collection" to all the top-level level_ssm field no matter what the actual @level is. Keep the original real @level value too for faceting. Maybe have to do something similar with level_ssi.
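A minimal sketch of that idea, in plain Python rather than ArcLight's actual Traject indexing configuration. The function name index_levels and the is_top_level argument are hypothetical, and the raw @level codes are used as-is (the real index may store display labels instead):

```python
def index_levels(raw_level, is_top_level):
    """Return illustrative level_ssm values for one document.

    Assumption: top-level documents always get "collection" first,
    followed by the real @level value when it differs.
    """
    values = []
    if is_top_level:
        values.append("collection")
    if raw_level and raw_level not in values:
        # Keep the real @level so it is still available for faceting.
        values.append(raw_level)
    return values


# A top-level <archdesc level="recordgrp"> would be indexed with both values.
doc = {"level_ssm": index_levels("recordgrp", is_top_level=True)}
print(doc)
```

With this shape, any existing logic that looks for "collection" in level_ssm would still match a record group, while the real level stays alongside it.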
@mejackreed See #776 and #778, for example.
These are cases where the top-level (<archdesc/>) level attribute is set to recordgrp ("record group"). The level attribute has no semantic constraints across top-level vs. component descriptions; it takes its value from an enumerated list. By far the most common top-level value for this attribute is likely collection, but it's true that any of these levels is possible. See, e.g., Bron, Proffitt, and Washburn (2013):
Sounds good 👍 .
@seanaery
Add "collection" to all the top-level level_ssm field no matter what the actual @level is. Keep the original real @level value too for faceting. Maybe have to do something similar with level_ssi.
Per #778 I'd be in favor of changing the logic to have a different way to specify what's the top-level. Does the change in our indexing strategy allow us to do that easily? For example, can we identify what is/is not a child document?
Thanks for this feedback @anarchivist -- those stats and that C4L article are really helpful. It's probably possible to do what you suggest. It would be easy to add a field in the indexing process that clearly distinguishes top-level vs. component. That said, I think we need to investigate a bit more to get a clearer sense of what it'd take to rework all the logic in the app that currently hinges on level so that it uses a different field instead. It could be fairly complicated, but I'm not sure.
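As a rough illustration of the "separate field" option, here is a sketch in plain Python (not ArcLight's real indexer). The field name is_top_level and the helper build_doc are made up for this example, and it assumes a document counts as top-level when it has no parent:

```python
def build_doc(doc_id, raw_level, parent_id=None):
    """Build an illustrative index document.

    Assumption: a document with no parent is the top-level description;
    "is_top_level" is a hypothetical field, not an existing ArcLight one.
    """
    return {
        "id": doc_id,
        "level_ssm": [raw_level],           # real @level, untouched
        "is_top_level": parent_id is None,  # hypothetical flag
    }


docs = [
    build_doc("abc123", "recordgrp"),
    build_doc("abc123_c01", "series", parent_id="abc123"),
]

# App logic could then filter on the flag instead of on level == "collection".
top_level_ids = [d["id"] for d in docs if d["is_top_level"]]
print(top_level_ids)
```

The trade-off is the one described above: the index change is small, but every place in the app that currently checks level would need to be found and pointed at the new field.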
One design challenge, regardless of whether we 1) index a top-level recordgrp or fonds, etc. as if it were a collection; 2) replace any level-based logic to use a new "top-level / not" field; 3) do something else...
Is the label "Collection" still appropriate in the UI even if it stretches the definition beyond the archival definition of the term? E.g., would the "Group by Collection" button still use the term "Collection"? Would the "Collections" link in the primary nav "Repositories | Collections" still say "Collections"?
Is the label "Collection" still appropriate in the UI even if it stretches the definition beyond the archival definition of the term? E.g., would the "Group by Collection" button still use the term "Collection"? Would the "Collections" link in the primary nav "Repositories | Collections" still say "Collections"?
I think that's probably fine. My guess is that localization will allow the replacement of "Collections" with another word rather easily. If not (e.g., if this relates somehow to the logic changes), I'm confident that we can either reuse "Collections" as a string or find a reasonable replacement.
Example in ArcLight Demo: https://arclight-demo.projectblacklight.org/?utf8=%E2%9C%93&group=true&search_field=all_fields&q=wagaw
In this case, the EAD file has <archdesc level="recordgrp">
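To make the case concrete, here is a small Python sketch that reads the top-level @level from an EAD-like fragment. The fragment is a stripped-down, un-namespaced stand-in (not a schema-valid EAD file), and top_level_value is a hypothetical helper:

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only: a real EAD file has much more structure
# and may declare an XML namespace, which this sketch does not handle.
EAD_SNIPPET = """<ead>
  <archdesc level="recordgrp">
    <dsc>
      <c01 level="series"/>
    </dsc>
  </archdesc>
</ead>"""


def top_level_value(ead_xml):
    """Return the @level of <archdesc>, or None if it is absent."""
    root = ET.fromstring(ead_xml)
    archdesc = root.find("archdesc")
    return archdesc.get("level") if archdesc is not None else None


level = top_level_value(EAD_SNIPPET)
print(level)
```

Any logic that tests for a top-level value equal to "collection" would not match this file, which is the situation described in this ticket.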