jackc / pgtype

MIT License
308 stars 110 forks source link

Adds support for scientific notation. #92

Closed teepark closed 3 years ago

teepark commented 3 years ago

Connecting with pgx to a postgresql 10.7, a numeric(27, 12) column containing 600 was sending the string 600000000000000e-12. This supports such a string in parseNumericString and adds a test.

jackc commented 3 years ago

Do you have a SQL example of PG sending a number back in scientific notation? I can't figure out how to do it.

...

And looking at the PG source code I don't see where it would be happening.

numeric_out builds the output text value for numeric:

https://github.com/postgres/postgres/blob/5bca69a76b3046a85c60c48271c1831fd5affa51/src/backend/utils/adt/numeric.c#L733-L766

It calls get_str_from_var which seems to always use decimal notation:

https://github.com/postgres/postgres/blob/5bca69a76b3046a85c60c48271c1831fd5affa51/src/backend/utils/adt/numeric.c#L6988-L7124

teepark commented 3 years ago

https://gist.github.com/teepark/e3da365d2963052385d149805146cd14

The code in that gist package works fine with this patch.

jackc commented 3 years ago

So the underlying problem is a bit convoluted.

  1. pgx supports using database/sql interfaces Scan and Value to decode and decode values.
  2. It also uses the binary format by default.
  3. There is no way for pgx to know ahead of time that it will be scanned into a database/sql style type.
  4. The numeric value is fetched in binary format but needs to be converted to text for database/sql Scan compatibility.
  5. pgx uses pgtype.Numeric.EncodeText to do this conversion.
  6. PostgreSQL supports scientific notation as an input text format.
  7. pgtype.Numeric.EncodeText uses scientific notation to send text formatted numeric values to PostgreSQL.
  8. This leads to your Scan function getting a scientific notation string for a numeric even though PostgreSQL will never send that format.

Perhaps pgtype.Numeric should not EncodeText in scientific notation, but if a change is to be made that would be preferred over parsing something that PostgreSQL never sends.

However, for your specific case I don't think either is necessary. Your Scan function could directly use the SetString method of your decimal library.

And on another note, if you are using pgx directly instead of through database/sql you should probably implement (Encode|Decode)(Text|Binary) instead of Scan and Value. That could be significantly more efficient.

See https://github.com/jackc/pgx/wiki/Numeric-and-decimal-support and https://github.com/jackc/pgtype/tree/master/ext/shopspring-numeric for an example of how the Shopspring decimal package is integrated.

teepark commented 3 years ago

I see, this makes total sense @jackc thank you. I really appreciate the feedback.

It still makes me nervous that something in a postgres driver library could change and I might get something different provided to Scan as the interface{} argument, so I'm keeping the Numeric use (the major value add there is how comprehensive it is) but adding a fast path specific to strings.