adiwg / mdTranslator

Metadata translation tool built using Ruby
https://www.adiwg.org/mdTranslator/
The Unlicense

Errors Translating FGDC to HTML #201

Closed dwalt closed 6 years ago

dwalt commented 6 years ago

In testing a translation from FGDC to HTML, I noticed some errors in how citations are translated. In this example, the larger work citation from FGDC resulted in two identical larger work citations in Associated Resources. I believe the first one was correctly translated from the FGDC larger work citation. The second one came from a redundant larger work citation in the cross reference section of the FGDC record, discussed below.

In the FGDC file there are two cross reference citations, and the second one has a larger work citation. This scenario is rarely used, but it is valid in FGDC. The translator appears to have translated that larger work citation as a larger work citation in Associated Resources, which mirrors the first entry. It then created the second cross reference citation as a cross reference associated resource, but in the title it appended the cross reference title to the end of that citation's larger work citation title.

The first cross reference citation in the FGDC file was not translated.

Attached is the source file with a .txt file type suffix: GulkanaGlacier_RawGPR.txt

stansmith907 commented 6 years ago

Checking the input file GulkanaGlacier_RawGPR.xml, I found two largerWorkCitation tags: one in metadata > idinfo > citation > citeinfo, and another in metadata > idinfo > crossref[2] > citeinfo, as you stated. They appear to be identical in the XML. This resulted in two identical associated resources. All FGDC larger work citations are mapped to associated resources. I don't see any difference between the two instances when I translate to HTML using mdTranslator version 2.13.2.

What am I missing?

[image: largerworkcitation]

dwalt commented 6 years ago

The larger work citations in Associated Resources are fine. Look at the third Resource record, labeled as crossReference.

In the XML there is cross ref #1: U.S. Geological Survey Glaciers and Climate Project. Then cross ref #2: Raw Ground Penetrating Radar Data on North American Glaciers with a larger work citation #2a: Ground Penetrating Radar Data on North American Glaciers.

If you look at the report, it appended the cross ref #2 title to the cross ref #1 title. Likewise, the authors are merged, as are the online resources. It merged two cross ref records into one. In addition, the relationship of cross ref #2 to its larger work citation #2a has been lost.

stansmith907 commented 6 years ago

Thanks. The problem was that mdTranslator was processing cross references as a scalar object when in fact they form an array. Because of the way XPath works, all occurrences were then placed in the same object. This is fixed now. However, in mdJSON I have no way to preserve the relationship for a larger work citation that has its own cross reference; both just become independent associated resources.
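
For readers following along, the scalar-versus-array symptom can be reproduced with a minimal sketch. This is not the mdTranslator reader code: it uses Ruby's REXML rather than whatever parser mdTranslator uses, the XML is a trimmed-down hypothetical fragment built from the titles quoted in this thread, and the names `FGDC_XML`, `sample_idinfo`, `merged_crossref_title`, and `crossref_titles` are invented for illustration.

```ruby
require 'rexml/document'

# Hypothetical, trimmed-down FGDC fragment: two crossref elements,
# the second carrying its own larger work citation (lworkcit).
FGDC_XML = <<~XML
  <metadata>
    <idinfo>
      <crossref>
        <citeinfo>
          <title>U.S. Geological Survey Glaciers and Climate Project</title>
        </citeinfo>
      </crossref>
      <crossref>
        <citeinfo>
          <title>Raw Ground Penetrating Radar Data on North American Glaciers</title>
          <lworkcit>
            <citeinfo>
              <title>Ground Penetrating Radar Data on North American Glaciers</title>
            </citeinfo>
          </lworkcit>
        </citeinfo>
      </crossref>
    </idinfo>
  </metadata>
XML

# Parse the sample and return its idinfo element.
def sample_idinfo
  doc = REXML::Document.new(FGDC_XML)
  REXML::XPath.first(doc, '/metadata/idinfo')
end

# Scalar-style read (imitates the bug): the path matches the title of
# every crossref, and joining the matches yields one merged title.
def merged_crossref_title(idinfo)
  REXML::XPath.match(idinfo, 'crossref/citeinfo/title').map(&:text).join(' ')
end

# Array-style read: visit each crossref and read only its own title.
def crossref_titles(idinfo)
  REXML::XPath.match(idinfo, 'crossref').map do |xref|
    REXML::XPath.first(xref, 'citeinfo/title').text
  end
end
```

Calling `merged_crossref_title(sample_idinfo)` returns both titles run together in one string, which resembles the merged record in the report, while `crossref_titles(sample_idinfo)` returns one title per cross reference.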

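[image: crossref2] https://user-images.githubusercontent.com/4998910/45176562-d9bfe500-b1bc-11e8-853f-b432d33813c1.png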

Let me know if this works for you. If so I'll close the issue.

dwalt commented 6 years ago

Correct, crossref is an array in FGDC. This looks right now. To preserve the larger work citation relationship, associated resource would need to be recursive, which I'm not sure ISO supports. While FGDC allows this referencing, I personally don't think it is good practice, as it goes beyond direct relevance to the resource described. We can call it good for now, acknowledge it as an FGDC mapping issue, and see if any actual requirement for it pops up.
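
To make the trade-off concrete, here is a rough sketch of what flattening does to a nested citation. It uses plain Ruby structures rather than real mdJSON or mdTranslator objects; `CrossRef`, `flatten_associations`, and the hash keys are hypothetical, and the type labels are assumed for illustration.

```ruby
# Hypothetical nested shape: a cross reference that may carry its own
# larger work, which is itself a citation of the same shape.
CrossRef = Struct.new(:title, :larger_work)

# Flatten each cross reference and any nested larger work into
# independent entries. The parent/child link is not recorded anywhere
# in the output, which is the relationship that gets lost.
def flatten_associations(cross_refs)
  cross_refs.flat_map do |xref|
    entries = [{ type: 'crossReference', title: xref.title }]
    if xref.larger_work
      entries << { type: 'largerWorkCitation', title: xref.larger_work.title }
    end
    entries
  end
end
```

Given one plain cross reference and one with a nested larger work, the function returns three sibling entries. Keeping the link would require each entry to hold its own list of associated resources, which is the recursion mentioned above.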


stansmith907 commented 6 years ago

Thanks, I'll close the issue and publish the update.