duckdb / duckdb-web

DuckDB website and documentation
https://duckdb.org
MIT License
144 stars 278 forks source link

Node JS API docs: would be nice to have a page that describes the mapping of the DuckDB data types to the JavaScript type system #2998

Open rpbouman opened 1 month ago

rpbouman commented 1 month ago

The Node JS api docs describe how to do a query, how to access a row in a result, and how to access a column value. However it does not describe what javascript type is used to represent the column value.

I suggest to have a separate page in the Node JS API section and call it Types or Type Mapping.

As inital documentation I suggest something like:

The following table explains how values in Query results are represented as JavaScript values:

DuckDB type JS primitive Object
ARRAY Array
BIGINT bigint
BIT string
BLOB Uint8Array
BOOLEAN boolean
DATE Date
DECIMAL number
DOUBLE number
ENUM string
HUGEINT bigint
INTEGER number
INTERVAL Object
LIST Array
MAP string
REAL number
SMALLINT number
STRUCT Object
TIME string
TIMESTAMPTZ Date
TIMESTAMP Date
TINYINT number
UBIGINT bigint
UHUGEINT bigint
UINTEGER number
USMALLINT number
UTINYINT number
UUID string
VARCHAR string

Values of the DuckDB INTERVAL type are represented by JavaScript Objects with these number properties:

Values of the DuckDB TIME type are represented by JavaScript strings in this format: %H:%M:%S.%f (see https://duckdb.org/docs/sql/functions/dateformat#format-specifiers)

Values of the DuckDB TIMESTAMPTZ and TIMESTAMP type are represented by JavaScript Date instances. Because JavaScript dates only have millisecond precision and the DuckDB values are truncated (not rounded!) to fit the JavaScript Date type.

Values of the DuckDB MAP type are represented as JavaScript strings, consisting of a pair of curly braces containing a comma-separated list of map entries. Each map entry consists of the key, followed by the equals sign (=), followed by the value.

Values of the DuckDB STRUCT type are represented as JavaScript Objects, and each of its entries is represented as a JavaScript object property.

NULL value

DuckDB NULL-values are represented as JavaScript null.

Special Numeric values

(https://duckdb.org/docs/sql/data_types/numeric#floating-point-types)

DuckDB Value JavaScript Value
Infinity Infinity
-Infinity -Infinity
NaN NaN

Special Date values

(https://duckdb.org/docs/sql/data_types/date#special-values)

DuckDB Value JavaScript Value
Infinity Invalid Date
-Infinity Invalid Date
epoch Thu Jan 01 1970 01:00:00 GMT+0100
rpbouman commented 1 month ago

For reference, piece of code that I used to find this out:

const duckdb = require('duckdb');

const db = new duckdb.Database(':memory:');
const sql = `
  select
    1::BIGINT                                     as "_bigint"          --  INT8, LONG  signed eight-byte integer
  , '01'::BIT                                     as "_bit"             --  BITSTRING   string of 1s and 0s
  , 'ab'::BLOB                                    as "_blob"            --  BYTEA, BINARY, VARBINARY    variable-length binary data
  , 1::BOOLEAN                                    as "_boolean"         --  BOOL, LOGICAL   logical boolean (true/false)
  , current_date                                  as "_date"            --  DATE        calendar date (year, month day)
  , 'epoch'::DATE                                  as "_epoch"
  , 'infinity'::DATE                              as "_infinity_date"
  , '-infinity'::DATE                              as "_-infinity_date"
  , 3.1415::REAL                                    as "_real"            --  FLOAT4, FLOAT single precision floating-point number (4 bytes)
  , 3.1415::DECIMAL(5,4)                            as "_decimal(5,4)"     
  , 3.1415::DOUBLE                                as "_double1"           --  FLOAT8,   double precision floating-point number (8 bytes)
  , (0.1 * 0.2)::DECIMAL(19,18)                   as "_decimal(19,18)"
  , (0.1 * 0.2)::DECIMAL(19,18)                   as "_decimal(19,18)"
  , 5.1::DOUBLE * 100                             as "_double"
  , 5.1::DECIMAL(2,1) * 100                       as "_decimal(2,1)"
  , 1::HUGEINT                                      as "_hugeint"          --  signed sixteen-byte integer
  , 2::INTEGER                                    as "_integer"          --  INT4, INT, SIGNED  signed four-byte integer
  , 'Infinity'::DOUBLE                                    as "_infinity"
  , '-Infinity'::DOUBLE                                     as "_-infinity"
  , 'NaN'::DOUBLE                                           as "_NaN"
  , null::double                                  as "_null"
  , INTERVAL 1 second                             as "_interval1s"         --  date / time delta
  , INTERVAL 1 year                               as "_interval1y"         --  date / time delta
  , 42::SMALLINT                                    as "_smallint"         --  INT2, SHORT  signed two-byte integer
  , TIME   '1992-09-20 11:30:00.123756'             as "_time1"             --  time of day (no time zone)
  , TIME   '1992-09-20 01:30:00.123456'             as "_time2"             --  time of day (no time zone)
  , TIMESTAMPTZ '1992-09-20 11:30:00.123756789'   as "_timestamptz"      -- TIMESTAMPTZ combination of time and date that uses the current time zone
  , TIMESTAMP '1992-09-20 11:30:00.123999999'       as "_timestamp"        --  DATETIME combination of time and date
  , 1::TINYINT                                    as "_tinyint"          -- INT1    signed one-byte integer
  , 1::UBIGINT                                      as "_ubugint"          -- unsigned eight-byte integer
  , 1::UHUGEINT                                     as "_uhugeint"         -- unsigned sixteen-byte integer
  , 1::UINTEGER                                     as "_uinteger"         -- unsigned four-byte integer
  , 1::USMALLINT                                      as "_usmallint"        -- unsigned two-byte integer
  , 1::UTINYINT                                     as "_utinyint"         -- unsigned one-byte integer
  , gen_random_uuid()::UUID                         as "_uuid"             -- UUID data type
  , 'bla'::VARCHAR                                as "_varchar"
  , [1, 2, 3]                                     as "_list"
  , MAP {'key1': 1, 'key2': 2}            as "_map"
  , {'x': 1, 'y': 2, 'z': 3}                      as "_struct"
  , 'a'::ENUM('a', 'b')                           as "_enum" 
  , [1::BIGINT]                                   as "_[bigint]"          --    INT8, LONG  signed eight-byte integer
  , ['01'::BIT]                                   as "_[bit]"             --    BITSTRING   string of 1s and 0s
  , ['ab'::BLOB]                                    as "_[blob]"            --  BYTEA, BINARY, VARBINARY    variable-length binary data
  , [1::BOOLEAN]                                    as "_[boolean]"         --  BOOL, LOGICAL   logical boolean (true/false)
  , [current_date]                                as "_[date]"            --  DATE      calendar date (year, month day)
  , [3.1415::REAL]                                as "_[real]"            --  FLOAT4, FLOAT single precision floating-point number (4 bytes)
  , [3.1415::DECIMAL(5,4)]                        as "_[decimal(5,4)]"     
  , [3.1415::DOUBLE]                                as "_[double1]"           --  FLOAT8,   double precision floating-point number (8 bytes)
  , [(0.1 * 0.2)::DECIMAL(19,18)]                 as "_[decimal(19,18)]"
  , [5.1::DOUBLE * 100]                           as "_[double]"
  , [5.1::DECIMAL(2,1) * 100]                     as "_[decimal(2,1)]"
  , [1::HUGEINT]                                      as "_[hugeint]"          --  signed sixteen-byte integer
  , [2::INTEGER]                                    as "_[integer]"          --  INT4, INT, SIGNED  signed four-byte integer
  , [INTERVAL 1 year]                             as "_[interval1y]"         --  date / time delta
  , [42::SMALLINT]                                as "_[smallint]"         --  INT2, SHORT  signed two-byte integer
  , [TIME '1992-09-20 01:30:00.123756']             as "_[time1]"             --  time of day (no time zone)
  , [TIME '1992-09-20 11:30:00.123456']             as "_[time2]"             --  time of day (no time zone)
  , [TIMESTAMPTZ '1992-09-20 11:30:00.123456789'] as "_[timestamptz]"      --   TIMESTAMPTZ combination of time and date that uses the current time zone
  , [TIMESTAMP '1992-09-20 11:30:00.123456789']   as "_[timestamp]"        --  DATETIME combination of time and date
  , [1::TINYINT]                                    as "_[tinyint]"          -- INT1    signed one-byte integer
  , [1::UBIGINT]                                      as "_[ubugint]"          -- unsigned eight-byte integer
  , [1::UHUGEINT]                                     as "_[uhugeint]"         -- unsigned sixteen-byte integer
  , [1::UINTEGER]                                     as "_[uinteger]"         -- unsigned four-byte integer
  , [1::USMALLINT]                                  as "_[usmallint]"        -- unsigned two-byte integer
  , [1::UTINYINT]                                     as "_[utinyint]"         -- unsigned one-byte integer
  , [gen_random_uuid()::UUID]                         as "_[uuid]"             -- UUID data type
  , ['bla'::VARCHAR]                              as "_[varchar]"
`;
db.all(sql, function(err, res) {
  if (err) {
    console.warn(err);
    return;
  }
  const row = res[0];
  const keys = Object.keys(row);

  for (let i = 0; i < keys.length; i++){
    const key = keys[i];
    const value = row[key];
    let type = typeof value;
    if (key.startsWith('_interval')) {
      type += ' ' + JSON.stringify(Object.keys(value).reduce(function(acc, curr){
        acc[curr] = typeof value[curr];
        return acc ;
      }, {}));
    }
    const ctor = value ? value.constructor : undefined;
    console.log('column: ', key, '; type: ', type, '; value: ', value, '; constructor: ', String(ctor));
    if (value instanceof Date) {
      console.log(value.getMilliseconds());
    }
  }
  //console.log(obj);
});