EDIorg / EMLassemblyline

R package for creating EML metadata
https://ediorg.github.io/EMLassemblyline/
MIT License

EAL not looking at EML 2.2 list of units, evidently #72

Closed by vanderbi 4 years ago

vanderbi commented 4 years ago

A workshop participant could not get EAL to run successfully with the unit milligramPerLiter in her attributes table, although this is the unit that shows up in https://sbclter.msi.ucsb.edu/external/InformationManagement/EDI/units/EML_units_preferred.html. When she changed the unit to milligramsPerLiter, EAL ran fine.

srearl commented 4 years ago

That seems to be a problem with that particular sbclter resource, as milligramsPerLiter is the correct form in the EML schema: https://github.com/NCEAS/eml/blob/master/xsd/eml-unitTypeDefinitions.xsd#L354

mobb commented 4 years ago

The schema also allows this:

     <xs:enumeration value="milligramPerLiter"/>

the reason that both are allowed is for backward compatibility. the list I posted at sbclter.msi is the list of units that are not deprecated. deprecated units are still allowed (so must be in the enumeration), but we can use the deprecatedInFavorOf attribute to focus on the cleaner set, which is what the sbclter.msi list does.
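
To illustrate how a deprecatedInFavorOf attribute can point a still-valid unit at its preferred replacement, here is a minimal Python sketch. The XML fragment is a simplified, namespace-free stand-in rather than the real unit dictionary, the specific attribute values are assumptions made for the example, and `preferred_unit` is a hypothetical helper, not part of EAL, EML, or emld.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only: a simplified stand-in for a unit dictionary.
# The entries and attribute values below are assumptions for this example.
SAMPLE = """
<unitList>
  <unit id="milligramPerLiter" unitType="massDensity"/>
  <unit id="milligramsPerLiter" unitType="massDensity"
        deprecatedInFavorOf="milligramPerLiter"/>
  <unit id="meter" unitType="length"/>
</unitList>
"""


def preferred_unit(unit_id, dictionary_xml=SAMPLE):
    """Return the preferred name for unit_id, or None if it is not listed."""
    root = ET.fromstring(dictionary_xml)
    for unit in root.iter("unit"):
        if unit.get("id") == unit_id:
            # A deprecated unit is still listed, but names its replacement.
            return unit.get("deprecatedInFavorOf", unit_id)
    return None


print(preferred_unit("milligramsPerLiter"))
print(preferred_unit("meter"))
```

In this sketch both spellings are accepted as valid, and only the one without a deprecatedInFavorOf attribute would be shown in a "preferred" list.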

srearl commented 4 years ago

Ah, yeah, and just a few lines above line 354 no less. Sorry for adding confusion.

mobb commented 4 years ago

no problem! keeps us on our toes! I still need to confirm that milligramPerLiter is indeed NOT in 2.1

clnsmth commented 4 years ago

EAL is validating the units against:

  1. eml-unitDictionary.xml (EML 2.1.0)
  2. eml-unitDictionary.xml (EML 2.2.0)

milligramPerLiter is absent from both lists.

Since EAL is dependent on the EML and emld R packages, I suggest EAL users stick with the list of units output from EMLassemblyline::view_unit_dictionary() or define them as custom.

If milligramPerLiter is valid, then we may want to notify EML and emld package maintainers.
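
As a rough illustration of the check described above (not EAL's actual implementation), the following Python sketch reports which units from an attributes table are absent from each of several unit lists. The function name `missing_units` and the tiny unit sets are hypothetical; the real dictionaries are much longer.

```python
def missing_units(units, dictionaries):
    """Map each dictionary label to the sorted units it does not contain."""
    return {
        label: sorted(set(units) - set(known))
        for label, known in dictionaries.items()
    }


# Hypothetical stand-ins for the two unit lists named in this comment.
dictionaries = {
    "EML 2.1.0": {"milligramsPerLiter", "meter"},
    "EML 2.2.0": {"milligramsPerLiter", "meter"},
}

# Units as they might appear in a user's attributes table.
report = missing_units(["milligramPerLiter", "meter"], dictionaries)
print(report)
```

A unit that is reported missing from every list would need to be renamed to a listed unit or defined as custom.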

mobb commented 4 years ago

I will check into that. milligramPerLiter is valid. It looks to me like R Open Science has an out-of-date list for 2.2 (way out of date, in fact).

See: https://github.com/NCEAS/eml/blob/master/eml-unitDictionary.xml

Second: recommend that EAL do two things (assuming it creates EML 2.2 by default):

  1. drop validation against EML 2.1.x list
  2. from EML 2.2 list, only suggest units that are NOT designated deprecatedInFavorOf. the list will be much smaller, and much less confusing.
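
The two recommendations above could be sketched roughly as follows in Python: accept any unit that appears in a single (2.2-style) list, but only suggest the ones that carry no deprecatedInFavorOf attribute. The XML fragment is a simplified, namespace-free assumption, and `is_valid_unit` and `suggested_units` are hypothetical names, not functions from EAL, EML, or emld.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for an EML 2.2 unit list; contents are assumptions.
SAMPLE_2_2 = """
<unitList>
  <unit id="milligramPerLiter"/>
  <unit id="milligramsPerLiter" deprecatedInFavorOf="milligramPerLiter"/>
  <unit id="meter"/>
</unitList>
"""


def is_valid_unit(unit_id, dictionary_xml):
    """Deprecated units remain valid, so every listed id passes."""
    root = ET.fromstring(dictionary_xml)
    return any(u.get("id") == unit_id for u in root.iter("unit"))


def suggested_units(dictionary_xml):
    """Return only the units that are not marked deprecatedInFavorOf."""
    root = ET.fromstring(dictionary_xml)
    return sorted(
        u.get("id")
        for u in root.iter("unit")
        if u.get("deprecatedInFavorOf") is None
    )


print(suggested_units(SAMPLE_2_2))
```

Keeping validation and suggestion separate means older metadata using deprecated names would still pass, while new users would only see the shorter list.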

mobb commented 4 years ago

Yes, the R EML package has an out-of-date unitDictionary. There is already an issue logged for it.

clnsmth commented 4 years ago

This issue will be resolved in the emld dependency (see https://github.com/ropensci/emld/issues/56).