pantheracorp / PantheraIDS_Features

A repository for any feature requests related to PantheraIDS
0 stars 0 forks source link

DV-523 ⁃ Feature #476 ⁃Replace existing DateTimeOriginal in IDS and Redshift #476

Closed sync-by-unito[bot] closed 10 months ago

sync-by-unito[bot] commented 1 year ago

﹍Background:﹍

It has been brought to my attention (on the forum but also in emails from a few people now) that there is no attribute for server-based databases where the date and time is recorded as timeshift-adjusted, and in the local time zone.

The only place that the date and time is recorded with timeshifts and in the local time zone is in the FileName_New …. which isn’t very useful for analyses. The “DateTimeOriginal” is the only date-time attribute that is adjusted for timeshifts- HOWEVER once the data are put into the server this attribute is converted to GMT time (I’m assuming for consistency across all datasets across the world).

Therefore, there is no attribute that people can easily use for analyses… which is mind-blowing. I propose we create a new attribute called “FinalDateTime” within the Trigger Table. For existing data, we can try to autofill this using the FileName_New. For new data, this attribute should be created in the Process Images or Historical Importer modules when new databases are created.-

﹍STEP ONE: Investigation:﹍

Why is this happening?

It was determined that the tidyr package is converting the datetime in the DateTimeOriginal column to GMT/UTC time.

﹍STEP TWO: Fixing:﹍

The workaround we’ll use is to create a new DateTimeFinal attribute (and not touch the DateTimeOriginal attribute). The DateTimeFinal attribute will be in character/string format, so that we avoid the tidyr package converting the date/time to GMT when we change from wide to long format (tidyr::gather).

  1. Replacing areas where we use "DateTimeOriginal" for "DateTimeFinal" (Isaiah and myself have already started a list of these places) - Valentine
  2. Adding the new attribute into the codebase - Valentine
  3. 【 highlighted #006644 】Adjust process images and historical importer to include the DateTimeFinal attribute. - Tafadzwa (Done)【 end highlighted 】
  4. Creating a way for IDS to handle local data that does not yet have "DateTimeFinal" in the database - Valentine
  5. Adding the new attribute into the Redshift tables and filling it with the Date and Time from the FileName_New attribute (code already exists, just needs to be adjusted to consider that we are now making a new attribute) - Adjust the existing sql query to create the new DateTimeFinal attribute - Thabied
  6. 【 highlighted #006644 】Considering any unit tests that might break because of these changes - Tafadzwa【 end highlighted 】

┆Issue is synchronized with this Jira Task by Unito ┆Attachments: Add_New_Attribute.sql | DTO functions Check.txt | Plots Capture.png

sync-by-unito[bot] commented 1 year ago

➤ Shannon Dubay commented:

Valentine Tawira , let me know what you think. I know its annoying to add a new attribute, but I think this one is actually needed.

FYI Isaiah Lekay

sync-by-unito[bot] commented 1 year ago

➤ Shannon Dubay commented:

Valentine Tawira I have significantly changed this task. Now it is solely focused on the investigation of why the datetime was being changed, and the IDS fix we’ll apply to work around the package. I have made a new task for the data-fixing: https://pantheracorporation.atlassian.net/browse/DV-526 ( https://pantheracorporation.atlassian.net/browse/DV-526|smart-link ) FYI Isaiah Lekay

sync-by-unito[bot] commented 1 year ago

➤ Shannon Dubay commented:

Valentine Tawira and Thabied Majal and Isaiah Lekay , should I be worried that we have the following in a few places in IDS basecode:

tz = "GMT"

I couldn’t follow the code well enough to determine if it is a concern 😕

sync-by-unito[bot] commented 1 year ago

➤ Valentine Tawira commented:

Notes

Applied fix to maintain the original format of the DateTimeOriginal on database upload.

Shannon Dubay Traced the few places we are using tz = "GMT" on DateTimeOriginal through IDS and still results in an off datetime.

NB: If we are to change the server DateTimeOriginal, we will need to adjust IDS accordingly as the current DateTimeOriginal formatting in IDS will break with the server changes.

sync-by-unito[bot] commented 1 year ago

➤ Shannon Dubay commented:

Oooh, ok noted Valentine Tawira . Please can we discuss this in our sprint review on Friday? It would be good to get a walk through of the code changes surrounding this, and we’ll record it for those who aren’t on the call.

sync-by-unito[bot] commented 11 months ago

➤ Valentine Tawira commented:

Adjusted code IDS code in all modules to work with the server DateTimeOriginal modifications.

Tested with S202_20220506_20221004 and S3133_20130808_20130928 adjusted on the dev schema.

!Plots Capture.png|width=1486,height=781!

sync-by-unito[bot] commented 11 months ago

➤ Valentine Tawira commented:

  1. Replacing areas where we use "DateTimeOriginal" for "DateTimeFinal" (Isaiah and myself have already started a list of these places ( https://pantheracorporation.atlassian.net/browse/DV-526 ))
    1. Adding the new attribute into the codebase
    2. Adjust process images and historical importer to include the DateTimeFinal attribute.
    3. Adding the new attribute into the Redshift tables and filling it with the Date and Time from the FileName_New attribute (code already exists, just needs to be adjusted to consider that we are now making a new attribute) - Adjust the existing sql query to create the new DateTimeFinal attribute
    4. Creating a way for IDS to handle local data that does not yet have "DateTimeFinal" in the database
    5. Considering any unit tests that might break because of these changes