ksclarke / solr-iso639-filter

A Solr filter that converts ISO-639-1 and ISO-639-2 codes into text so you can have a human-friendly facet.
http://projects.freelibrary.info/solr-iso639-filter
Apache License 2.0

"null" output when ISO-639 input code is blank? #5

Closed ksclarke closed 11 years ago

ksclarke commented 11 years ago

See Nick's screenshot: http://i.imgur.com/VkdtDkI.png

Instead of null, should it output an empty string? A default value? Something else? Can a filter simply emit no output at all? That's probably the best path.

ksclarke commented 11 years ago

@ruebot I'm not having much luck getting this value. Could you check one of the records it appears on and let me know what the code is (or isn't) in the record?

ruebot commented 11 years ago

It's actually that weird not-a-"bug" with XML forms. Those nulls come from tag fields set up for language that have a few language entries plus an accidental blank entry, because I might not have enabled the cleanup XSLT. Make sense?

ksclarke commented 11 years ago

I'm not sure I understand. It's coming from the XSLT? Or it has multiple language code entries?

ruebot commented 11 years ago

[Screenshot from 2013-08-27 09:40:20]

See how there are two language entries? There are technically three, because if you leave that field blank, it creates a null entry. Like so:

  <language>
    <languageTerm authority="iso639-2b" type="code"/>
    <languageTerm authority="iso639-2b" type="code">eng</languageTerm>
    <languageTerm authority="iso639-2b" type="code">ghc</languageTerm>
  </language>
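One way to keep the blank entry out of the index is to drop empty `languageTerm` values before they reach Solr. The following is only a minimal sketch of that idea in Python, not the cleanup XSLT mentioned above and not part of the filter; the function name `language_codes` and the `SAMPLE` wrapper document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# MODS v3 namespace, as declared on the <mods> root element.
MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# The <language> block from the snippet above, wrapped in a minimal root
# element (the wrapper is an assumption so the sample parses on its own).
SAMPLE = """<mods xmlns="http://www.loc.gov/mods/v3">
  <language>
    <languageTerm authority="iso639-2b" type="code"/>
    <languageTerm authority="iso639-2b" type="code">eng</languageTerm>
    <languageTerm authority="iso639-2b" type="code">ghc</languageTerm>
  </language>
</mods>"""


def language_codes(mods_xml):
    """Return the non-blank languageTerm codes in document order."""
    root = ET.fromstring(mods_xml)
    path = ".//mods:language/mods:languageTerm[@type='code']"
    codes = []
    for term in root.findall(path, MODS_NS):
        # A self-closing element has text None; treat whitespace as blank too.
        text = (term.text or "").strip()
        if text:
            codes.append(text)
    return codes


print(language_codes(SAMPLE))
```

With the sample above, only the two populated entries survive; the self-closing element is skipped.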

ksclarke commented 11 years ago

From Nick, from the document with the null displaying:

<arr name="mods_language_languageTerm_code_ms">
  <str>eng</str>
  <str>ghc</str>
</arr>
<arr name="mods_language_languageTerm_code_mt">
  <str>eng</str>
  <str>ghc</str>
</arr>
<arr name="iso639">
  <str>eng</str>
  <str>ghc</str>
</arr>

And:

<copyField source="mods_language_languageTerm_code_ms" dest="iso639"/>
<field name="iso639" type="iso639Code" indexed="true" multiValued="true" />

ruebot commented 11 years ago

Here is a MODS record with the 'null' value in it:

<?xml version="1.0" encoding="UTF-8"?>
<mods xmlns="http://www.loc.gov/mods/v3" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
  <titleInfo>
    <title>Concert: Boys of the Lough</title>
  </titleInfo>
  <name type="personal">
    <namePart>Boys of the Lough</namePart>
    <role>
      <roleTerm authority="marcrelator" type="text">Host</roleTerm>
    </role>
  </name>
  <name type="personal">
    <namePart>Aly Bain</namePart>
    <role>
      <roleTerm authority="marcrelator" type="text">Performer</roleTerm>
    </role>
  </name>
  <name type="personal">
    <namePart>Cathal McConnell</namePart>
    <role>
      <roleTerm authority="marcrelator" type="text">Performer</roleTerm>
    </role>
  </name>
  <name type="personal">
    <namePart>Dave Richardson</namePart>
    <role>
      <roleTerm authority="marcrelator" type="text">Performer</roleTerm>
    </role>
  </name>
  <name type="personal">
    <namePart>Robin Morton</namePart>
    <role>
      <roleTerm authority="marcrelator" type="text">Performer</roleTerm>
    </role>
  </name>
  <typeOfResource>Sound Recording</typeOfResource>
  <tableOfContents>The Boys of the Lough / (Boys of the Lough). --  Slanty Gart, aka MacDonald's Reel / (Boys of the Lough). -- The Laird of Drumblair / (Boys of the Lough). -- Instrumental / (Boys of the Lough). -- For He Is Such a Bonny Lad / (Boys of the Lough). -- Salmon Tails Up The Water / (Boys of the Lough). -- The Devil And The Bailiff / (Boys of the Lough). -- The Winnie Hills of Lietrim / (Boys of the Lough). -- Ryan's Slip Jig / (Boys of the Lough). -- Jack Broke The Prison Door (Boys of the Lough). -- Instrumental / (Boys of the Lough). -- Twisting of The Rope, aka Casadh an t'Sugain (Boys of the Lough). -- Instrumental / (Boys of the Lough).</tableOfContents>
  <genre>folk music</genre>
  <originInfo>
    <dateIssued>1975-06-20</dateIssued>
    <publisher>Mariposa Folk Foundation</publisher>
    <place>
      <placeTerm authority="marccountry">Canada</placeTerm>
    </place>
    <place>
      <placeTerm type="text">Toronto Islands</placeTerm>
    </place>
    <place>
      <placeTerm authority="marccountry">Canada</placeTerm>
    </place>
  </originInfo>
  <language>
    <languageTerm authority="iso639-2b" type="code"/>
    <languageTerm authority="iso639-2b" type="code">eng</languageTerm>
    <languageTerm authority="iso639-2b" type="code">ghc</languageTerm>
  </language>
  <abstract>Consists of concert performance by Boys of the Lough of traditional Irish, Scottish (specifically Shetland and Northumbrian music.) Songs include: "The Boys of the Lough", "Slanty Gart" aka MacDonald's Reel, "The Laird of Drumblair" (strathspey), Northumbrian pipe tunes (?), "For He Is Such A Bonny Lad" and "Salmon Tails Up The Water", "The Devil And The Bailiff" (slip jig), "The Winnie Hills Of Leitrim" (slip jig), "Ryan's Slip Jig" "Jack Broke The Prison Door" (reel) a couple of whistle tunes, "Twisting of the Rope" aka "Casadh an t'Sugáin" and an untitled mazurka.</abstract>
  <identifier>ASC06562</identifier>
  <physicalDescription>
    <form>1/4" reel audio tape</form>
    <extent>0:47:06</extent>
  </physicalDescription>
  <relatedItem>
    <titleInfo>
      <title>Mariposa Folk Foundation, F0511</title>
    </titleInfo>
    <location>
      <location>
        <url note="Finding Aid">http://archivesfa.library.yorku.ca/fonds/ON00370-f0000511.htm</url>
      </location>
      <location>
        <url note="Finding Aid">http://archivesfa.library.yorku.ca/fonds/ON00370-f0000511.htm</url>
      </location>
    </location>
  </relatedItem>
  <subject>
    <topic>Mariposa Folk Festival</topic>
    <topic>jig</topic>
    <topic>reel</topic>
    <topic>mazurka</topic>
    <topic>fiddle</topic>
    <topic>flute</topic>
    <topic>mandolin</topic>
    <topic>guitar</topic>
    <topic>concertina</topic>
    <topic>folk songs --Scottish</topic>
    <topic>folk songs --Irish</topic>
    <geographic>Area 3</geographic>
    <temporal>12:30-13:30</temporal>
    <hierarchicalGeographic>
      <continent>North America</continent>
      <country>Canada</country>
      <province>Ontario</province>
      <city>Toronto</city>
      <citySection>Toronto Islands</citySection>
    </hierarchicalGeographic>
    <cartographics>
      <coordinates>43.620833, -79.378611</coordinates>
    </cartographics>
  </subject>
  <accessCondition type="useAndReproduction">Digital copy created for preservation and access purposes.  Available by request for research purposes only.  All other uses must be cleared by patron through copyright holders.  For further details contact ascproj@yorku.ca</accessCondition>
  <accessCondition type="restrictionOnAccess"/>
</mods>

ksclarke commented 11 years ago

This is the MODS XSLT: https://github.com/yorkulibraries/basic-solr-config/blob/modular/islandora_transforms/slurp_all_MODS_to_solr.xslt

ksclarke commented 11 years ago

Okay, problem solved. The empty heading was a red herring; those are handled fine. The problem was that there are sometimes two different three-letter codes for the same language, and I was only loading one of them from the source file. When a secondary one was passed in to be converted, a null was returned. I've now made sure that the secondary codes are loaded, and that passing in a two- or three-letter code that isn't found doesn't return a null. If it can't find a match, it now returns the code that was passed in.
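The behavior described here (index every code variant, and fall back to the input when nothing matches) can be sketched roughly as follows. This is an illustrative Python sketch, not the filter's actual Java code. It assumes the "two different codes" are the ISO 639-2 bibliographic and terminologic forms; `SAMPLE_ROWS`, `build_lookup`, and `code_to_name` are hypothetical names, and the three rows are a hand-picked sample rather than the real source file.

```python
# Hypothetical sample rows:
# (primary three-letter code, secondary three-letter code, two-letter code, name)
SAMPLE_ROWS = [
    ("eng", None, "en", "English"),
    ("fre", "fra", "fr", "French"),
    ("ger", "deu", "de", "German"),
]


def build_lookup(rows):
    """Map every code variant (primary, secondary, two-letter) to its name."""
    lookup = {}
    for primary, secondary, two_letter, name in rows:
        for code in (primary, secondary, two_letter):
            if code:  # some languages have no secondary or two-letter code
                lookup[code] = name
    return lookup


def code_to_name(code, lookup):
    """Return the language name, or the input unchanged when there is no match."""
    return lookup.get(code, code)


LOOKUP = build_lookup(SAMPLE_ROWS)
print(code_to_name("fre", LOOKUP))  # primary code
print(code_to_name("fra", LOOKUP))  # secondary code resolves to the same name
print(code_to_name("ghc", LOOKUP))  # not in the sample table, so echoed back
```

Returning the input instead of null means an unrecognized code still shows up in the facet as itself, which is easier to spot and fix than a literal "null".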

ruebot commented 11 years ago

@ksclarke++

I'll pull this tomorrow before I leave, and re-index again over the weekend.

Thanks again, man, this is excellent!

ksclarke commented 11 years ago

@ruebot Just a note that I tweaked the way versioning works again, so you'll want to make sure you remove your old jar before adding the new one (the new one has a different file name, so it won't overwrite the old one).

ruebot commented 11 years ago

Looks like I still have nulls.

Screenshot

ksclarke commented 11 years ago

Are you sure you don't have an old jar on the classpath still? Sorry about all the version changing. Catch up with me in IRC and we'll track it down.

ruebot commented 11 years ago

Facepalm. I didn't delete the old one from target after I removed the old ones and copied over the "new" one. Let me actually do this right this time. Sorry about that :smile:

ksclarke commented 11 years ago

No problem. :-)

If the version hadn't changed it would have just overwritten the old jar.

ruebot commented 11 years ago

We're good to go. Excellent work again!

ksclarke commented 11 years ago

Great, thanks!