adiwg / mdTranslator

Metadata translation tool built using Ruby
https://www.adiwg.org/mdTranslator/
The Unlicense

Taxonomy incorrectly inserted in FGDC to ISO -2 translation #213

Open dwalt opened 5 years ago

dwalt commented 5 years ago

A direct FGDC CSDGM XML translation to ISO 19115-2 shows taxonomy artifacts that are not in the FGDC record. No taxonomy information exists in the FGDC record. The -2 file shows a md:MD_MetadataExtensionInformation section with boilerplate taxonomy elements.

wsr_nhdpv2.1.zip

stansmith907 commented 5 years ago

The FGDC biological profile extends sections of the metadata in addition to taxonomy. I leave the extension information in the output primarily because we use the biological profile version of 19115-2, although the extension does primarily highlight taxonomy. Is it causing trouble? If so I can place it conditionally.

dwalt commented 5 years ago

It seems to throw off those who read the XML, me included. It also leaves some doubt about whether this information will be publicly exposed by a downstream catalog or other service, which would imply the metadata subscribes to the Biological profile. I think it avoids these questions if the extension doesn't exist at all unless biological content exists. I think this could be applied to other extensions we have or will define. With any extension there is likely a required element that can serve as the decision point for whether extended content exists and, therefore, whether the requisite extension should be declared.
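
A minimal sketch of that decision point, in Ruby since that is what mdTranslator is written in. Everything named here is hypothetical: `EXTENSION_TRIGGERS`, `extensions_to_declare`, and the `:resourceInfo`/`:taxonomy` hash layout are assumptions for illustration, not mdTranslator's actual internals.

```ruby
# Hypothetical sketch: decide per extension whether its declaration belongs in the output.
# None of these names come from mdTranslator; the metadata hash layout is assumed.

# Each extension maps to a predicate that detects whether its extended content is present.
EXTENSION_TRIGGERS = {
  biological: ->(metadata) { !Array(metadata.dig(:resourceInfo, :taxonomy)).empty? }
}.freeze

# Return the names of the extensions whose trigger content exists in the record.
def extensions_to_declare(metadata, triggers = EXTENSION_TRIGGERS)
  triggers.select { |_name, present| present.call(metadata) }.keys
end

plain_record = { resourceInfo: { citation: { title: 'hydrography' } } }
bio_record   = { resourceInfo: { taxonomy: [{ taxonomicSystem: 'ITIS' }] } }

p extensions_to_declare(plain_record) # no taxonomy, so nothing is declared
p extensions_to_declare(bio_record)   # taxonomy present, so :biological is declared
```

Adding another extension would then mean adding one more entry to the table, keyed on whichever element that extension requires.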

stansmith907 commented 5 years ago

Just to be clear, the biological profile made changes to Spatial_Domain, Analytical_Tools, Lineage, Digital_Transfer_Information, DateTime, and Time_Period in addition to adding the Taxonomy section. The extension is the only means I know of to alert metadata intake programs that the biological profile is being used. Not knowing this could affect some XML readers, but I'm not sure how likely that would be. If we are only confusing human readers, how much of a concern is that?

jlblcc commented 5 years ago

I'd suggest leaving the extension info in place and explaining it with a comment. I doubt very many users will actually take the time to read XML.
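
A rough sketch of what "explaining it with a comment" could look like, using plain string handling rather than any particular XML library. The method name `extension_with_note` and the note wording are assumptions for illustration; only the element name comes from this thread.

```ruby
# Hypothetical sketch: prefix an extension block with an explanatory XML comment.
# extension_with_note is an illustrative name, not part of mdTranslator.
def extension_with_note(extension_xml, note)
  # XML comments may not contain "--", so reject such text up front.
  raise ArgumentError, 'comment text must not contain "--"' if note.include?('--')

  "<!-- #{note} -->\n#{extension_xml}"
end

note = 'Declared because this output targets the biological profile of ISO 19115-2; ' \
       'it does not imply the record contains taxonomy.'
puts extension_with_note('<md:MD_MetadataExtensionInformation/>', note)
```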

jlblcc commented 5 years ago

Given all of the sections affected, I would prefer to leave the extension in place and not rely on conditional insertion. I think that adds an unnecessary complication.

dwalt commented 5 years ago

So what about funding? Should we not put an extension in for that? Are we going to put extension declarations into every XML record for everything we might extend, just in case it is used? What is special about the biological extension?

Exactly, we don't know how records will be used downstream once published. I guess my point is why put superfluous information into a record that doesn't belong there, has nothing to do with the content, and that implies content or content conformance that doesn't exist in the record?

jlblcc commented 5 years ago

Funding is not translated to ISO. The extensions only apply to ISO 19115 output. I should think it's fairly standard practice to include the extension if you're outputting to the biological profile as SOP, regardless of whether that particular metadata file contains information in the extended classes.

For reference: ftp://ftp.ncddc.noaa.gov/pub/Metadata/Online_ISO_Training/Intro_to_ISO/schemas/ISObio/

dwalt commented 5 years ago

But it is not SOP for geology data, hydrology data, etc. So are you trying to say it should be in the standard but isn't, so it's "extended" and the ISO output is supported by the extension? If somebody wrote another extension, would we choose whether or not to accept that extension in our translations? All this, I guess, is regardless of content, and is about identifying which "standard" the record is supported under?
