adiwg / mdTranslator

Metadata translation tool built using Ruby
https://www.adiwg.org/mdTranslator/
The Unlicense

Issue with YearMonth format date translation from CSDGM to mdJSON #197

Closed dwalt closed 6 years ago

dwalt commented 6 years ago

There appears to be an issue with date translation of YearMonth format in CSDGM to mdJSON as reported by USGS NGP. The following is the report from Peg:

The publication date in my FGDC file is 200509. In the mdJSON it is 2005-09-01, with all the time values set to 0. When I load it to mdEditor, the date is 2005-08-31. So there is a problem in two places - the translator added the day, and then mdEditor changed the month and day.

Farther down in my FGDC file, I have the time period of content information, which is just a single publication date of 200509. In the mdJSON this shows up as an endDateTime of 2005-09-01 with all time values set to 0. Once read into mdEditor, the date is 2005-08-31, time of 20:00:00. I think when there is just a single date in FGDC, it should be translated to timeInstant in ISO, rather than a time period. I don't see a way to do this in mdEditor.

stansmith907 commented 6 years ago

Yeah, I'm not really surprised to see this problem surface. FGDC and, to a lesser extent, ISO treat dateTime fields as text. When represented in our software they are converted internally to dateTime data types. By necessity these always have year, month, day, hour, minute, second, and fraction, so parts which were not specified in the FGDC or ISO input must be set by software. The general convention is to default to first-of-year, first-of-month, etc. I added a dateResolution attribute to the internal object to help trim output to the same resolution as was entered. I'll check to see if this is working properly for the conditions you specified.
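As a rough illustration of the resolution idea described above, here is a minimal Ruby sketch. It is not mdTranslator's actual code: the helper names `parse_fgdc_date` and `format_iso_date`, the resolution tags, and the hash layout are all assumptions made for the example. It shows a YearMonth value being padded to a full DateTime internally and then trimmed back to the entered resolution on output.

```ruby
require 'date'

# Hypothetical lookup tables (not taken from mdTranslator):
# FGDC string length => [strptime pattern, resolution tag]
FGDC_DATE_PATTERNS = {
  4 => ['%Y',     'Y'],
  6 => ['%Y%m',   'YM'],
  8 => ['%Y%m%d', 'YMD']
}.freeze

# resolution tag => ISO 8601 output pattern
ISO_OUTPUT_PATTERNS = {
  'Y'   => '%Y',
  'YM'  => '%Y-%m',
  'YMD' => '%Y-%m-%d'
}.freeze

# Parse an FGDC date string, remembering how much of it was actually given.
# Missing parts default to first-of-year / first-of-month inside the DateTime.
def parse_fgdc_date(text)
  pattern, resolution = FGDC_DATE_PATTERNS.fetch(text.length) {
    raise ArgumentError, "unsupported FGDC date: #{text.inspect}"
  }
  { dateTime: DateTime.strptime(text, pattern), dateResolution: resolution }
end

# Write the date back out at the resolution that was entered.
def format_iso_date(date_hash)
  pattern = ISO_OUTPUT_PATTERNS.fetch(date_hash[:dateResolution])
  date_hash[:dateTime].strftime(pattern)
end

parsed = parse_fgdc_date('200509')
puts parsed[:dateTime].iso8601  # full internal value, day and time defaulted
puts format_iso_date(parsed)    # trimmed back to year-month
```

With this approach the defaulted day and time never reach the output as long as every writer consults the resolution tag.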

The bigger issue is timezone, i.e., is the dateTime value expressed in UTC or a local timezone? And what timezone? And was that daylight saving time or not? FGDC does not collect that information except for a single field that sets whether time information was expressed in local or universal time. But the local timezone is still a mystery, and most often this field is not even set. ISO only collects this information if the user specified date-times with the full UTC offset (e.g. 2018-06-11T13:45:00-09:00).

Some software, such as servers and the calendar date/time controls used by mdEditor, assumes the dateTime is entered in local time and then stores the value in UTC. Date/times entered in mdEditor in Anchorage look good on your screen, but the saved values may be different. The problem arises when the next software step does not realize the time is UTC and handles it as local, or vice versa. The correct local timezone was probably lost as well.

Problem 1: If 2005-09-01 is assumed to be UTC it would translate to 2005-08-31T15:00:00 AKD. So the original intent was lost.

Problem 2: Since timezone is not tracked by mdJson, the computation of which local timezone to use is made by the workstation or server, which may not be sitting in the same timezone as you are.

Problem 3: Dates/times entered in mdEditor in Anchorage may be different when viewed in mdEditor on the East Coast.
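The date shift in these problems can be reproduced in a few lines of Ruby. This is only a sketch: the fixed offsets `-09:00` and `-04:00` are assumptions chosen for illustration (the first matches the offset used in the example earlier in this comment, the second is consistent with the 20:00:00 in the original report), and fixed offsets ignore daylight saving rules.

```ruby
require 'time'

# A date-only value (2005-09-01) that some layer treated as midnight UTC.
stored = Time.utc(2005, 9, 1)

# Render the same instant at two assumed fixed offsets.
alaska  = stored.getlocal('-09:00')
eastern = stored.getlocal('-04:00')

puts stored.iso8601   # 2005-09-01T00:00:00Z
puts alaska.iso8601   # 2005-08-31T15:00:00-09:00
puts eastern.iso8601  # 2005-08-31T20:00:00-04:00
```

All three values are the same instant, but the calendar date a user sees changes as soon as the value is rendered west of UTC.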

As you can see this is going to take some broader discussion to resolve. Sounds like a topic for one of our regular Tuesday meetings.

jlblcc commented 6 years ago

Yeah, not sure how we'll handle this in adiwg/mdEditor. Might have to build a custom date/time editor or allow the option to switch it off and use text-based input with validation. As stated, software doesn't like dates formatted like 2018 or 2018-07.

dwalt commented 6 years ago

Perhaps dates shouldn't be converted to date/time format, but instead treated as a formatted string that is parsed into its relevant components and reformatted when written.
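A minimal Ruby sketch of that idea, keeping only the components that were entered and never building a date/time object. The `PartialDate` name and its methods are hypothetical, invented for this example rather than taken from mdTranslator, and the sketch does no range checking (month 13 would be accepted).

```ruby
# Hypothetical value object holding only the parts that were supplied.
PartialDate = Struct.new(:year, :month, :day) do
  # Parse FGDC-style YYYY, YYYYMM, or YYYYMMDD into integer components.
  def self.from_fgdc(text)
    match = /\A(\d{4})(\d{2})?(\d{2})?\z/.match(text)
    raise ArgumentError, "unsupported FGDC date: #{text.inspect}" unless match

    # Components missing from the input stay nil instead of being defaulted.
    new(*match.captures.map { |part| part&.to_i })
  end

  # Write ISO 8601 text using only the components that were present.
  def to_iso
    parts = [format('%04d', year)]
    parts << format('%02d', month) if month
    parts << format('%02d', day) if day
    parts.join('-')
  end

  # Write the value back in FGDC notation.
  def to_fgdc
    to_iso.delete('-')
  end
end

date = PartialDate.from_fgdc('200509')
puts date.to_iso   # 2005-09
puts date.to_fgdc  # 200509
```

Because no day, time, or offset is ever invented, there is nothing for a later timezone conversion to shift.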

On Mon, Jun 11, 2018 at 3:34 PM, Josh Bradley notifications@github.com wrote:

Yeah, not sure how we'll handle this in the mdEditor. Might have to build a custom date/time editor or allow the option to switch it off and use text-based input with validation. As stated, software doesn't like dates formatted like 2018 or 2018-07.

stansmith907 commented 6 years ago

I have at least patched part of this problem. I removed the adding of a T00:00:00+00:00 time portion to dateTime objects when time was not specified in the FGDC input. This should repair the notation in the FGDC to mdJson conversion. However, timezone issues may still be present.

Note: All time output from FGDC to mdJson is defaulted to UTC to match how it is read in mdEditor. More investigation is warranted.

stansmith907 commented 6 years ago

I was also thinking the only way out was to treat dates as strings. However, that means dropping calendar/clock controls in favor of separate pickers for year, month, day, hour, minute, second, and offset. It would also mean a significant modification to mdJson 2.x. But ...

dwalt commented 6 years ago

Could do a conversion at initial read and upon write with mdEditor.

On Mon, Jun 11, 2018 at 4:04 PM, stansmith907 notifications@github.com wrote:

I was also thinking the only way out was to treat dates as strings. However, that means dropping calendar/clock controls in favor of separate pickers for year, month, day, hour, minute, second, and offset. It would also mean a significant modification to mdJson 2.x. But ...
