MarcusBarnes / mik

The Move to Islandora Kit is an extensible PHP command-line tool for converting source content and metadata into packages suitable for importing into Islandora (or other digital repository and preservation systems).
GNU General Public License v3.0

Make FilterModsTopic conform to DLF Aquifer #253

Closed bondjimbond closed 7 years ago

bondjimbond commented 8 years ago

According to the documentation, FilterModsTopic places multiple Topics into a single Subject container. The DLF Aquifer guidelines call for a separate Subject element/container for each chain of related subjects (see https://www.loc.gov/standards/mods/userguide/subject.html); i.e., each Topic element should be inside its own Subject element.

The example output from the documentation does not conform to the guidelines:

<subject>
  <topic>People</topic>
  <topic>Bodies of water</topic>
  <topic>Beaches</topic>
</subject>

Should instead be:

<subject>
  <topic>People</topic>
</subject>
<subject>
  <topic>Bodies of water</topic>
</subject>
<subject>
  <topic>Beaches</topic>
</subject>

I can confirm that Islandora Solr metadata display processes the latter version nicely, while the former version generates messy output. Additionally, the latter conforms to standards, and is better for sharing, filters, etc.
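To make the requested transformation concrete, here is a minimal Python sketch of it. This is not MIK code (MIK is written in PHP) and not how FilterModsTopic works; the function name `split_subject_topics` is hypothetical, and it assumes un-namespaced elements like the snippets above. It only splits a `<subject>` whose children are all `<topic>` elements and leaves anything else untouched.

```python
import xml.etree.ElementTree as ET


def split_subject_topics(mods_fragment: str) -> str:
    """Give each <topic> of a multi-topic <subject> its own <subject> wrapper.

    Illustration only: assumes un-namespaced MODS elements.
    """
    # Wrap in a temporary root so a fragment with several top-level
    # elements still parses as a single XML document.
    root = ET.fromstring(f"<root>{mods_fragment}</root>")
    new_root = ET.Element("root")

    for child in root:
        topics = child.findall("topic") if child.tag == "subject" else []
        # Only split subjects that contain nothing but two or more topics;
        # everything else is copied through unchanged.
        if len(topics) <= 1 or len(topics) != len(child):
            new_root.append(child)
            continue
        for topic in topics:
            subject = ET.SubElement(new_root, "subject")
            ET.SubElement(subject, "topic").text = topic.text

    parts = []
    for element in new_root:
        ET.indent(element, space="  ")  # requires Python 3.9+
        parts.append(ET.tostring(element, encoding="unicode").strip())
    return "\n".join(parts)


if __name__ == "__main__":
    combined = (
        "<subject><topic>People</topic><topic>Bodies of water</topic>"
        "<topic>Beaches</topic></subject>"
    )
    print(split_subject_topics(combined))
```

Run against the first snippet above, this prints the three separate `<subject>` elements shown in the second snippet.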

MarcusBarnes commented 8 years ago

@bondjimbond Thank you. To confirm, the following snippet is displayed nicely in Islandora Solr metadata display?

<subject>
  <topic>People</topic>
</subject>
<subject>
  <topic>Bodies of water</topic>
</subject>
<subject>
  <topic>Beaches</topic>
</subject>

The task then is to provide a way to generate the snippet above when generating packages from source metadata where using the FilterModsTopic metadatamanipulator makes sense.

bondjimbond commented 8 years ago

Yes, the snippet quoted displays best in all places (search results, Solr metadata display, etc). In Arca:

(1) Here's a display result with all topics under one <subject> element: http://arcabc.ca/islandora/object/unbc%3A7310

(2) Here's a display result with each topic under its own <subject>: http://arcabc.ca/islandora/object/unbc%3A46

With dc.subject as our search result display field, here's what (1) shows:

Energy crops -- Economic aspects -- British Columbia -- Quesnel Region.--Energy crops -- Social aspects -- British Columbia -- Quesnel Region.--Agroforestry -- British Columbia -- Quesnel Region.--------------, Energy crops -- Economic aspects -- British Columbia -- Quesnel Region.--Energy crops -- Social aspects -- British Columbia -- Quesnel Region.--Agroforestry -- British Columbia -- Quesnel Region.------------------, SB288.3.C2 K67 2013,

Odd problems, considering how different the actual MODS looks.

And here's how (2) displays:

ANOVA, classroom demonstration, active learning, statistics, Students, Psychology, Variance analysis

The default should definitely be to break out each topic into its own <subject> element.

mjordan commented 8 years ago

@bondjimbond can you add the following to your .ini file's [METADATA_PARSER] section and try again:

repeatable_wrapper_elements[] = subject

That should result in output where each <topic> is wrapped in its own <subject> element. MIK's default behavior is to collapse child elements of the same wrapper element so that they all end up within one wrapper. Adding the wrapper element's name to the repeatable_wrapper_elements[] list overrides this behavior. I believe @MarcusBarnes wrote the code that does this so he may want to confirm.
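The collapse-versus-repeat behavior described above can be modeled with a short Python sketch. This is a hypothetical illustration, not MIK's actual PHP implementation: the function `wrap_children` and its arguments are invented here, and only the `repeatable_wrapper_elements[]` setting name comes from the configuration line above.

```python
from xml.sax.saxutils import escape


def wrap_children(wrapper, children, repeatable_wrappers=()):
    """Render (tag, text) pairs inside `wrapper` elements.

    By default all children are collapsed into a single wrapper. If the
    wrapper's name appears in `repeatable_wrappers` (standing in for the
    repeatable_wrapper_elements[] list), each child gets its own wrapper.
    """
    def render(group):
        inner = "\n".join(
            f"  <{tag}>{escape(text)}</{tag}>" for tag, text in group
        )
        return f"<{wrapper}>\n{inner}\n</{wrapper}>"

    if wrapper in repeatable_wrappers:
        groups = [[child] for child in children]  # one wrapper per child
    else:
        groups = [list(children)]                 # one shared wrapper
    return "\n".join(render(group) for group in groups)


if __name__ == "__main__":
    topics = [("topic", "People"), ("topic", "Bodies of water"), ("topic", "Beaches")]
    print(wrap_children("subject", topics))
    print()
    print(wrap_children("subject", topics, repeatable_wrappers=["subject"]))
```

The first call prints one `<subject>` holding all three topics; the second, with `subject` listed as repeatable, prints three `<subject>` elements with one topic each.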

mjordan commented 8 years ago

I'd also point out that it is possible to make <topic> elements that share a <subject> wrapper display on their own using a custom Solr metadata display. The MODS for http://digital.lib.sfu.ca/bcp-5831/blueberry-farm-near-mission-bc-1906 looks like

<subject>
  <topic>Views</topic>
  <topic>Croplands</topic>
  <topic>Dwellings</topic>
  <topic>Tents</topic>
</subject>

I can probably get more information if you're interested. @librarychik did the configuration on this. Not sure if there's a way to do this in search results however.

bondjimbond commented 8 years ago

Re your last post, Mark - fixing the display doesn't fix the core problem of not adhering to standards, so I'd prefer to make sure the metadata going in is right instead.

Re the .ini file bit: sounds good, I'll do that. We'll see how it works out.

mjordan commented 8 years ago

@bondjimbond just doing some catch up on the MIK issue queue and am wondering what your take on this one is at this point.

bondjimbond commented 8 years ago

Haven't got to it yet; busy with some other stuff at the moment. Configuring MIK will become a priority late this week I think.

bondjimbond commented 7 years ago

Since SplitRepeatedValues supersedes FilterModsTopic, and we've worked out the documentation problem, I think this one can be closed.