brianc / node-pg-types

Type parsing for node-postgres
270 stars 55 forks source link

Do not return DATE fields as Javascript Date #50

Closed felixfbecker closed 4 years ago

felixfbecker commented 8 years ago

A Javascript Date is always parsed in local timezone. A DATE column in Postgres or any other DB doesn't have any timezone associated with it, it is just a date. It begins and ends differently depending on where you are. Parsing this as a JS Date that has timezone associated with it is a huge problem. Let's say your server is located in GMT+2 (Germany). You insert a date into your database, like 2016-08-26. When you query that in Germany, you will get the Date object Fri Aug 26 2016 00:00:00 GMT+0200. But run toUTCString() on it, and you get Thu, 25 Aug 2016 22:00:00 GMT! The date is one day off. This imposes a huge problem for distributed applications.

TL;DR JavaScript objects are not suited to represent a DATE field, because they have a timezone and time associated, while a DATE has not.

postgres-date should not be used to parse it and instead, the string should be returned (just like with a DECIMAL). Users can register their own type parser with their own DateOnly object if they want.

This will be a breaking change and we would like to depend on this in Sequelize 4.

See https://github.com/sequelize/sequelize/issues/4858

brianc commented 8 years ago

I agree it can be clumsy to use JavaScript dates. This particular bike shed we've painted a few times in the past & kinda settled on the current behavior because it's easy enough to register your own type parser to do whatever you want with dates. I would be open to changing this behavior in the next major release for DATE (not timestamp or timestamptz) to have it return the unparsed string...but I'm not sure supplying your own type parser for dates is really that hard? Anyways this would be a pretty big breaking change so let's table this one until pg@7.0?

felixfbecker commented 8 years ago

Would be nice if we could depend on this in Sequelize 4. We have not yet implemented the connection-level type parsers (registering global type parsers should be avoided).

Of course the change should only be for DATE, not timestamp[tz].

felixfbecker commented 8 years ago

Isn't this pg-types? 😄

brianc commented 8 years ago

I am super smart. :finnadie:

brianc commented 8 years ago

Sorry was doing a bunch of node-postgres issues this morning and got confused. I'll delete the milestone & clean up the comments a bit - I'm only 1 of the maintainers here & my opinion is only 1 of several so we'll see what others weigh in on.

brianc commented 8 years ago

okay tried to fix it up! Sorry for the confusion! Toooo many browser tabs! :grin:

bendrucker commented 8 years ago

I also don't have strong feelings about this. I don't think returning a date as a Date is wrong, regardless of the time zone issues. The story in Javascript has always been that you have to be careful about time zones. I don't think it's reasonable to run toUTCString() on something you know is a pure date. If I'm using a Date to store info like that I'm using helper functions later to extract the date info in local time.

felixfbecker commented 8 years ago

The example might be a bit misleading. Consider the server is GMT+2. You query your database and then JSON.stringify() your data and send it to the client. The client is done by a company who is using your API, you have no control over it, and the user is sitting in England. Now the client gets the date string Fri Aug 26 2016 00:00:00 GMT+0200 inside its AngularJS app and wants to display this in a localized date format, so he uses a date filter to format it to DD.MM.YYYY. Now what the user sees is 25.08.2016. What happened? A very subtle but major bug got introduced. The client parsed the date string in its own timezone. If it was a simple ISO YYYY-MM-DD string, this would not have been a problem. But because it incorrectly included a time and timezone, the date is one day off in his timezone.

You could of course argue that the API should simply not have used JSON.stringify() to stringify the date, but first convert every single date to a YYYY-MM-DD string. But that is huge boilerplate code for big models and Imo the library shouldn't even use this kind of object to represent the DATE field.

bendrucker commented 8 years ago

You could of course argue that the API should simply not have used JSON.stringify() to stringify the date, but first convert every single date to a YYYY-MM-DD string.

That's my expectation here, despite the acknowledged need for extra boilerplate around serialization. Dates give you language level features like date1 > date2. Returning a string means people who wish to treat the date like a date in their code have extra boilerplate. To me it's not the database driver's responsibility to help avoid programmer error elsewhere in the application.

felixfbecker commented 8 years ago

Thanks for the feedback. So far we only heard from people that a DATE (date-only) should be represented like a date-only and not as a timestamp with timezone, see the issue I linked.

I actually never used this implicit object comparison, but always compared .getTime() values. If you want to compare them, you can easily use Date.parse() on both of them, and then it will not impose any problem with timezones, because you parse both strings in the same statement.

bendrucker commented 8 years ago

That's helpful but there's also a pretty significant confirmation bias in issues like that. You're far more likely to hear from the people who want strings and got tripped up by JS dates than the people who might be relying on a capital-D Date.

The common factor in other types that are not parsed is that they cannot be safely represented by a JS value type, e.g. int types with enough bits to be potentially greater than Number.MAX_SAFE_INTEGER.

This change would make dates inconsistent with other types, including others with potentially treacherous locale behavior (timestamp).

felixfbecker commented 8 years ago

The problem is for a column that only contains date information and nothing about the time or a timezone (therefor representing a timezone-independent 24h range), an object is used for representation that includes date, time and timezone information (therefor representing a millisecond exact point in time in a specific meridian).

Yes you can argue that you will hear more about people who complain, but I don't think that argument is 100% valid. In open source projects there are always also people who express their opinions against a proposed change, the sequelize issue is open for quite some time and so far you are the first one I met who considers this change undesirable.

I think the use case of JSON.stringifying your database object is way greater than any native comparison capability. NodeJS is very often used to build RESTful JSON APIs, and it should be possible to recursively turn objects with data that somewhere comes from the database into a logical JSON representation without having to iterate and correct every single DATE(ONLY) property. In the common API you will have dozens of endpoints that need to send back a DATE(ONLY) in JSON, but I would argue having to compare two DATE(ONLY)s is very rare, and the additional code required to do it is way less than the additional code required to correct DATE(ONLY)s before JSON.stringifying them.

Yes, custom type parsers can be used but to be honest I would feel weird making that decision inside the sequelize library and taking a high-abstraction object from the low-level database driver and converting it into a no-abstraction string inside the high-level ORM. Behavior should be consistent so the user is not confused if he comes from using plain pg. We should decide on one common way to represent DATE(ONLY) columns in Javascript, or at least make the low-level db driver delegate that decision to higher-level modules or the user by not using an abstraction.

bendrucker commented 8 years ago

Ok, fair enough. I do want to throw out a third option here. Date.parse is not 100% consistent across all environments. We could instead return a {year, month, date} object with number values and a toString method similar to intervals.

felixfbecker commented 8 years ago

Yes, I also thought about that. Basically introducing a new DateOnly object interface that represents a data that is only a date, without any time / timezone. Ideally though this should not be a postgres-date object, but some implementation that could be reused by users and all kind of libraries and other database drivers.

It only has to be decided wether this is something a low-level database driver should do, or if that should be left to the user / higher-level libraries. For example, it would be possible to always return DECIMAL columns as a BigNumber, but there are arguments against it - we would depend on a specific library and it's better to let the user handle this with a custom type parser.

I actually once published the package date-only on NPM a long time ago (which is the best package name you could have for this kind of package) that aimed to do this. But its design is flawed because it inherits from Date. A good implementation would instead provide an interface similar to Date, but only the methods relevant for the date part. I planned to release a v2 soon, so using that would be an option. But note that the native Date does not have public properties like year, month etc. A good implementation would also be "formattable" by moment.js - which actually would work through year/month/day.

But honestly at least inside sequelize we always have to keep in mind that we have to support 3 other drivers too (mysql, tedious, sqlite3) and then normalize output for users. So in an ideal world the drivers just return low-level strings for stuff that doesn't map to native JS types (string, number, boolean, Date, Buffer, Array, Object) so we dont have to normalize (because those types are the only ones that all drivers can agree on easily). Just like DECIMALs are not returned as a BigNumber, even though everyone who actually uses DECIMALs probably is doing some monetary stuff and is using that library, because it is the only and best BigNumber implementation for JavaScript.

So basically two options here, agree on a standard way to represent DateOnly values either with an interface or a class instance (that can still implement the object literal interface), or use a string and let users / libaries set custom type parsers.

We also have to keep in mind that anything that is outputted also needs to be accepted as input.

tomaspinho commented 8 years ago

We are using pg-promises and ended up overriding pg-types's deserialization for Date types.

bendrucker commented 8 years ago

We can make this change on a future major bump. Anyone care to do a PR for the right types/tests?

felixfbecker commented 8 years ago

@tomaspinho doing that in userland right now would break an ORM that relies on the dates being Date.

@bendrucker So what representation shall we use?

bendrucker commented 8 years ago

Let's go with YYYY-MM-DD since that's specified in ISO 8601

tomaspinho commented 8 years ago

Please see PR #53

georgiana-b commented 7 years ago

I encountered the issue described by @felixfbecker and I share his opinion that a date-only field should not have any time information attached to it. The current behavior is very troublesome. @brianc #53 looks fine. Could you please take a look?

shanehughes3 commented 7 years ago

What's the status of this? I see on #53 that it was decided to make sure this would be handled as a breaking change for node-postgres, but I don't see any mention on that repo. Are there further changes needed?

abenhamdine commented 7 years ago

I think it's planned for pg@7.0, https://github.com/brianc/node-postgres/milestone/3, but not totally sure about it.

zam6ak commented 6 years ago

FWIW, we always have to set the type parser to return string. To me, returning ISO 8601 string seems most flexible.

danrasmuson commented 6 years ago

For future visitors. If you want to turn off node-postgres's date parsing....

node-postgres will convert instances of JavaScript date objects into the expected input value for your PostgreSQL server. Likewise, when reading a date, timestamp, or timestamptz column value back into JavaScript, node-postgres will parse the value into an instance of a JavaScript Date object.

You can add this code to override its behavior...

import { types } from 'pg';
const TIMESTAMP_OID = 1114;
types.setTypeParser(TIMESTAMP_OID, (timestamp) => timestamp);
ezzatron commented 5 years ago

To add to the above comment, the OID is different for timestamp with time zone, so you might want to disable parsing for both:

const TYPE_TIMESTAMP = 1114
const TYPE_TIMESTAMPTZ = 1184

const noParse = v => v

types.setTypeParser(TYPE_TIMESTAMP, noParse)
types.setTypeParser(TYPE_TIMESTAMPTZ, noParse)
maksimluzik commented 5 years ago

OID for the DATE is 1082

if someone wants to return only date:

const { types } = require('pg');
const TYPE_DATESTAMP = 1082;
types.setTypeParser(TYPE_DATESTAMP, date => date);
randolpho commented 4 years ago

@brianc

Given that this is a relatively frequent issue, might it be wise to add the workarounds provided here into the main documentation?

mariusa commented 4 years ago

ping. Got hit by this too. Thanks for the workarounds

sfarshi-varicent commented 4 years ago

Any update on this one. I have tried the workaround to disable the parser for Date OID 1082. However; I have the similar issue. I have a test case where I insert a Date say '2020-08-21', depending on which time of the day I run the test, I either get '2020-08-21' or '2020-08-22' (affected by timezone). I just need to get what I have in database '2020-08-21' with no manipulation.

sfarshi-varicent commented 4 years ago

Please ignore my above comment. The issue was due to an oversight on my part. I am using a different workaround of using to_char(...) function to convert Date into String as an alternative to above. However; it would be great if Date are returned as strings and require no workaround solution

shaunhurley commented 2 years ago

Had a question relating to disabling the parsing, specifically for the DATE type (1082). What is the default parser / what would need to be applied to reenable the default (Javascript Date) behavior? Is it just using the parseTimestampUTC method defined in binaryParsers.js? Or is it as simple as (date) => new Date(date)? I couldn't find an explicit reference for 1082 except in the exports in builtins.js...