pnnl / socialsim_package


Datetime not always converted correctly after load_data #1

Closed brandon-fsu closed 5 years ago

brandon-fsu commented 5 years ago

Steps to Reproduce:

  1. Use extract_ground_truth script to assemble a ground truth dataset with multiple platforms
  2. Load the ground truth data using socialsim.load_data
  3. Reddit timestamps are incorrect and come back as 1970-01-01 00:00:01.501573047+00:00

I believe this issue arises because the convert_datetime function is applied to the entire dataframe rather than on a per-platform basis. Since the platforms store timestamps in different formats, a single conversion doesn't fit all of them.
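The reported value is consistent with an epoch timestamp in seconds (1501573047) being read as nanoseconds, which is what pandas assumes for bare integers. Below is a minimal sketch of a per-platform conversion, not the package's actual convert_datetime. The column names `nodeTime` and `platform`, the helper name `convert_per_platform`, and the assumption that numeric timestamps are epoch seconds are all illustrative guesses.

```python
import pandas as pd

# Reproduce the symptom: epoch seconds interpreted as nanoseconds.
wrong = pd.to_datetime(1501573047, utc=True)
right = pd.to_datetime(1501573047, unit="s", utc=True)


def convert_per_platform(df, time_col="nodeTime", platform_col="platform"):
    """Parse timestamps one platform at a time (hypothetical helper).

    Assumes numeric values are epoch seconds and that everything else is
    a date string pandas can parse. Requires a unique dataframe index.
    """
    out = df.copy()
    pieces = []
    for _, group in df.groupby(platform_col):
        values = group[time_col]
        numeric = pd.to_numeric(values, errors="coerce")
        if numeric.notna().all():
            # Every value in this platform is numeric: treat as epoch seconds.
            parsed = pd.to_datetime(numeric, unit="s", utc=True)
        else:
            # Otherwise let pandas parse the strings.
            parsed = pd.to_datetime(values, utc=True)
        pieces.append(parsed)
    if pieces:
        # Restore the original row order after the per-group parsing.
        out[time_col] = pd.concat(pieces).reindex(df.index)
    return out


df = pd.DataFrame(
    {
        "platform": ["reddit", "twitter"],
        "nodeTime": [1501573047, "2017-08-01T07:37:27Z"],
    }
)
converted = convert_per_platform(df)
```

Both rows in `converted` should then refer to the same instant in 2017 rather than to 1970.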

emily-grace commented 5 years ago

There was an update last week to the data extraction script in the old repo. It now converts the datetimes to a common format for all platforms before writing out the JSON file, which should resolve the issue.

I think we will assume that the input data to the measurements package is all in the same datetime format, because processing datetimes line-by-line to accommodate different formats is much slower. We will look at adding a validation check for varied formats in the input.
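One possible shape for such a validation check is sketched below. It is not part of the package: the function name `check_uniform_datetime_format` and the `nodeTime` column are assumptions, and it only inspects value types (numeric versus string) without parsing any dates, so it would not catch two different string formats.

```python
import numbers

import pandas as pd


def check_uniform_datetime_format(df, time_col="nodeTime"):
    """Raise ValueError if the time column mixes numeric and string values.

    Hypothetical helper: classifies each value by type only, which is
    cheaper than attempting to parse every timestamp.
    """
    kinds = df[time_col].map(
        lambda v: "numeric" if isinstance(v, numbers.Number) else "string"
    )
    found = set(kinds.unique())
    if len(found) > 1:
        raise ValueError(
            f"Column {time_col!r} mixes timestamp representations: {sorted(found)}"
        )
    return found


uniform = pd.DataFrame({"nodeTime": ["2017-08-01T07:37:27Z", "2017-08-02T00:00:00Z"]})
mixed = pd.DataFrame({"nodeTime": [1501573047, "2017-08-01T07:37:27Z"]})

uniform_kinds = check_uniform_datetime_format(uniform)

try:
    check_uniform_datetime_format(mixed)
    mixed_rejected = False
except ValueError:
    mixed_rejected = True
```

Running a check like this once at load time would keep the fast whole-dataframe conversion while still flagging input that violates the single-format assumption.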