ghdna / athena-express

Athena-Express can simplify executing SQL queries in Amazon Athena AND fetching cleaned-up JSON results in the same synchronous or asynchronous request - well suited for web applications.
https://www.npmjs.com/package/athena-express
MIT License
179 stars 70 forks

Extend data types support for UTC dates, i.e. `Timestamp` Athena type #49

Open ssedano opened 3 years ago

ssedano commented 3 years ago

Hi team,

I'd like to extend data type support to UTC dates, i.e. the Athena `Timestamp` type. My proposal is a new option, e.g. `useUtcDates`, that casts values in `Timestamp` columns to JavaScript `Date` objects. The option would default to `false`. The caveat is that the conversion uses machine-local time, so the option is intended for systems configured in UTC. It could be extended to the Athena `Date` type, but that would turn dates into date-times, e.g. '2020-11-18' would become '2020-11-18T00:00:00Z'. I intentionally avoid casting the Athena `timestamp with time zone` type to reduce complexity.

For applications that deal with dates, converting `Timestamp` columns to `Date` objects is a common task, since many libraries expect `Date` objects. Moreover, athena-express hides the column type in the result set. This forces developers who want to work with dates to access each column by name and convert its values to `Date` objects explicitly.
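To illustrate the manual conversion described above, here is a minimal sketch. The helper names `parseAthenaTimestampAsUtc` and `convertTimestampColumns` are hypothetical and not part of athena-express, and the row shape (plain objects keyed by column name, with `Timestamp` values as strings like `2008-09-15 03:04:05.327`) is an assumption based on this thread.

```javascript
// Hypothetical helper (not part of athena-express): read an Athena TIMESTAMP
// string such as "2008-09-15 03:04:05.327" as a UTC instant.
function parseAthenaTimestampAsUtc(value) {
  const match =
    /^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})(?:\.(\d{1,3}))?$/.exec(value);
  if (!match) return value; // leave unrecognized values untouched
  const [, y, mo, d, h, mi, s, ms = "0"] = match;
  // Date.UTC ignores the machine's time zone; months are zero-based.
  return new Date(Date.UTC(+y, +mo - 1, +d, +h, +mi, +s, +ms.padEnd(3, "0")));
}

// Hypothetical helper: convert the named columns of every row, returning copies.
function convertTimestampColumns(rows, columnNames) {
  return rows.map((row) => {
    const copy = { ...row };
    for (const name of columnNames) {
      if (typeof copy[name] === "string") {
        copy[name] = parseAthenaTimestampAsUtc(copy[name]);
      }
    }
    return copy;
  });
}
```

The caller has to know in advance which columns hold timestamps, which is the bookkeeping the proposed option is meant to remove.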

I'd be willing to submit a patch if this feature request is approved.

Proposed update to documentation:

ghdna commented 3 years ago

I somehow missed responding to this. If you wanna go ahead and add that, I'll be happy to take a look. Just consider any edge cases that might break it.

ghdna commented 3 years ago

Actually, the more I think about this, the more I worry it could confuse users, since there are several date formats to convert into. For example, Athena's TIMESTAMP datatype returns results as `2008-09-15 03:04:05.327`. This string can be converted into 5 formats:

  1. new Date() => Mon Sep 15 2008 03:04:05 GMT-0400 (Eastern Daylight Time)
  2. .toUTCString() => "Mon, 15 Sep 2008 07:04:05 GMT"
  3. .toDateString() => "Mon Sep 15 2008"
  4. .toISOString() => "2008-09-15T07:04:05.327Z"
  5. .toTimeString() => "03:04:05 GMT-0400 (Eastern Daylight Time)"

How would AthenaExpress know which format the user cares about? And the bigger question is: should this be something AthenaExpress does, or should this functionality be handed over to the user to decide?
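The renderings in the list above all come from one `Date` value; they differ only in how it is formatted. A small sketch, assuming the string was read as US Eastern Daylight Time (UTC-4) as in the list, separates the renderings that are the same on every machine from those that depend on the machine's time zone:

```javascript
// The instant from the list above: 03:04:05.327 EDT is 07:04:05.327 UTC.
// Months are zero-based, so 8 means September.
const instant = new Date(Date.UTC(2008, 8, 15, 7, 4, 5, 327));

// Independent of the machine's time zone.
const iso = instant.toISOString(); // "2008-09-15T07:04:05.327Z"
const utc = instant.toUTCString(); // "Mon, 15 Sep 2008 07:04:05 GMT"

// Dependent on the machine's time zone; output varies between systems.
const local = instant.toString();
const dateOnly = instant.toDateString();
const timeOnly = instant.toTimeString();
```

Returning a `Date` object rather than a formatted string would leave the choice among these formats to the caller.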

ssedano commented 3 years ago

Thank you for your reply.

> should this be something AthenaExpress does or should this functionality be handed over to the user to decide.

That is the right question to ask. Unfortunately, the query method doesn't expose the data type, which is why I propose letting the user decide when configuring AthenaExpress. This saves the user from tracking column types separately and casting values, while keeping AthenaExpress simple.
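As a rough sketch of the idea, and not the actual athena-express implementation, the conversion could be driven by column metadata plus an opt-in flag. Here `castTimestamps` is a hypothetical function, `columnInfo` is assumed to be a list of `Name`/`Type` pairs like those Athena reports in its result set metadata, and `useUtcDates` is the option proposed in this issue.

```javascript
// Hypothetical sketch: cast only columns whose reported type is "timestamp",
// and only when the proposed `useUtcDates` option is switched on.
function castTimestamps(rows, columnInfo, { useUtcDates = false } = {}) {
  if (!useUtcDates) return rows; // default: leave the strings untouched

  const timestampColumns = columnInfo
    .filter((col) => col.Type.toLowerCase() === "timestamp")
    .map((col) => col.Name);

  return rows.map((row) => {
    const copy = { ...row };
    for (const name of timestampColumns) {
      if (typeof copy[name] === "string") {
        // "2008-09-15 03:04:05.327" -> "2008-09-15T03:04:05.327Z", read as UTC
        copy[name] = new Date(copy[name].replace(" ", "T") + "Z");
      }
    }
    return copy;
  });
}
```

Because the type comparison is exact, a column reported as `timestamp with time zone` is left as a string, in line with the intent to skip that type.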

ghdna commented 3 years ago

OK, I understand. Let me check on the effort required for this.

ssedano commented 3 years ago

I would be happy to contribute the patch

Thank you.

ghdna commented 3 years ago

Sure, please do. I'd be happy to look at it.

ssedano commented 3 years ago

Apologies for the delay on this. I expect to have a pull request ready in the next two weeks.