when we onboard a new datasource, we get column metadata which we obtain from duckdb using a DESCRIBE statement
the column metadata controls the options in the AttributeUi. Each column is always available as itself, but its data type determines which aggregators and derivations are available.
derivations typically have their own, fixed column type.
some aggregators also have a fixed column type. Example: STDEV
some aggregators will (should) always have a column type identical to the column type of the attribute. Example: MIN/MAX/
some aggregators are always in a particular family of types, example: COUNT will always be some positive integer type
some aggregators have a column type similar to that of the attribute, but with a different precision. Example: SUM
When we execute a query to populate the axes/tuples, we obtain a resultset as an "Arrow table". This is where it becomes interesting:
the mapping from duckdb column type to arrow data type may sometimes be surprising. For example, a duckdb HUGEINT is mapped to an Arrow Decimal, presumably because the largest integer type in Arrow is not precise enough to represent duckdb HUGEINTs. See https://github.com/duckdb/duckdb-wasm/issues/1579
the mapping from Arrow types to javascript types may also be surprising. In the aforementioned example, the ArrowDecimal is represented in javascript as a Uint32[4]. We have also seen an Arrow Timestamp (coming from a duckdb TIMESTAMP, which is reasonable) whose javascript representation is a Number primitive (https://github.com/duckdb/duckdb-wasm/issues/1563#issuecomment-1878745744). The reasoning in this case is that the Number represents the number of milliseconds from epoch and can thus be used to instantiate a javascript Date object, which would approximate the duckdb TIMESTAMP value. The reason for not returning a javascript Date instead is that the duckdb TIMESTAMP has microseconds precision (in fact it is defined as the number of microseconds after epoch), whereas the javascript Date is only in milliseconds after epoch. By returning a Number that includes 3 decimals, no precision is lost while the conversion to javascript Date is still conveniently easy.
Arrow has a concept called "storage type" and a given logical arrow type may be represented by several different storage types, depending on parameters;
Once we got the value from the resultset, we often need to format it so it can be presented in the UI. For formatting of numbers and dates we use the Intl.NumberFormat or Intl.DateTimeFormat instances and those accept javascript values
The real trouble starts when we need to use the values returned by a resultset as values for parameters in a prepared statement. There is absolutely no guarantee that this will work. Even a relatively simple case, where we receive a Javascript BigInt instance from the resultset fails when we try to bind it (see: https://github.com/duckdb/duckdb-wasm/issues/1563)
We have the following type systems to deal with:
the relationships between these systems:
When we execute a query to populate the axes/tuples, we obtain a resultset as an "Arrow table". This is where it becomes interesting:
Once we got the value from the resultset, we often need to format it so it can be presented in the UI. For formatting of numbers and dates we use the Intl.NumberFormat or Intl.DateTimeFormat instances and those accept javascript values
The real trouble starts when we need to use the values returned by a resultset as values for parameters in a prepared statement. There is absolutely no guarantee that this will work. Even a relatively simple case, where we receive a Javascript BigInt instance from the resultset fails when we try to bind it (see: https://github.com/duckdb/duckdb-wasm/issues/1563)