ubermichael / isetools

Tools for parsing data for the Internet Shakespeare Editions
GNU General Public License v2.0

Feature: validator to check FOREIGN/@lang codes #12

Closed: telic closed this issue 2 years ago

telic commented 9 years ago

FOREIGN/@lang should be a valid two-letter ISO 639-1 or three-letter ISO 639-3 language code. Unrecognized codes should log a warning.
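
A minimal sketch of the requested check, for illustration only. The class name `LangCodeValidator`, its API, and the case-insensitive comparison are assumptions, not part of isetools. The set of known codes is supplied by the caller.

```java
import java.util.Locale;
import java.util.Set;
import java.util.logging.Logger;

// Hypothetical helper; the name and API are illustrative, not taken from isetools.
public class LangCodeValidator {
    private static final Logger LOG = Logger.getLogger(LangCodeValidator.class.getName());

    // Known ISO 639-1 and 639-3 codes, lower case, supplied by the caller.
    private final Set<String> knownCodes;

    public LangCodeValidator(Set<String> knownCodes) {
        this.knownCodes = knownCodes;
    }

    /** Returns true if the code is recognized; otherwise logs a warning and returns false. */
    public boolean check(String lang) {
        if (lang != null) {
            // Assumption: codes are compared case-insensitively.
            String code = lang.trim().toLowerCase(Locale.ROOT);
            boolean rightLength = code.length() == 2 || code.length() == 3;
            if (rightLength && knownCodes.contains(code)) {
                return true;
            }
        }
        LOG.warning("Unrecognized FOREIGN/@lang code: " + lang);
        return false;
    }
}
```

The method returns a boolean as well as logging, so a caller can count or collect failures without parsing log output.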

ubermichael commented 9 years ago

Is there an open-source friendly version of the language codes anywhere?

telic commented 9 years ago

The 639-3 codes can be downloaded in TSV from SIL, and this includes part-1 equivalents. Part-3 includes everything covered by part-1, so this one file should do it.

I am not a lawyer, but their terms of use sound pretty wide open to me.
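
A rough sketch of how such a file could be turned into a lookup set. It assumes the table has a header row with columns named `Id` (the part-3 code) and `Part1` (the two-letter equivalent, often empty). Check those names against the actual download. `Iso639Table` and `loadCodes` are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical loader; class and method names are illustrative.
public class Iso639Table {
    /**
     * Reads a tab-separated table with a header row and collects the non-empty
     * values of the "Id" and "Part1" columns. The column names are an assumption
     * about the SIL file. The caller is responsible for closing the reader.
     */
    public static Set<String> loadCodes(Reader source) {
        Set<String> codes = new HashSet<>();
        Iterator<String> lines = new BufferedReader(source).lines().iterator();
        if (!lines.hasNext()) {
            return codes; // empty input, nothing to load
        }
        // Locate the columns by name rather than by fixed position.
        List<String> header = Arrays.asList(lines.next().split("\t", -1));
        int idCol = header.indexOf("Id");
        int part1Col = header.indexOf("Part1");
        while (lines.hasNext()) {
            String[] cells = lines.next().split("\t", -1);
            addIfPresent(codes, cells, idCol);
            addIfPresent(codes, cells, part1Col);
        }
        return codes;
    }

    // Adds the trimmed cell at the given column if the column exists and is non-empty.
    private static void addIfPresent(Set<String> codes, String[] cells, int col) {
        if (col >= 0 && col < cells.length) {
            String value = cells[col].trim();
            if (!value.isEmpty()) {
                codes.add(value);
            }
        }
    }
}
```

Looking columns up by header name keeps the loader working if columns are reordered, at the cost of silently loading nothing when a header is renamed.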

ubermichael commented 6 years ago

Hi @telic

Does this fix the validation? I cannot find any ISE SGML documents to check against. Have they been moved?

telic commented 6 years ago

Yes, that appears to do the trick.

All of the text files are in subversion at https://revision.hcmc.uvic.ca/svn/ise-developers/trunk/eXist/db/apps/iseapp/content/documents/iml/.

Did you intentionally comment out a plugin in pom.xml in the change above?