Closed derekmahar closed 4 years ago
The behavior is as expected by csvq. In some cases, in the result set of csvq, one column may contain different types.
In the processing of csvq, all values in csv files are initially read as strings. The value types are converted only if the values are calculated. To prevent unexpected type conversion for users, implicit type conversion is not performed except in that case.
In this query, TotalQuantity
of the first row and other columns are specified with only the column name in the file quantities.csv
, so these values are displayed as string values.
The values of second row and after in the column Total Quantity
are converted to integer values by the expression q.Quantity + tq.TotalQuantity
, so these values are right justified.
To unify the value types in the result set, you can use cast functions, such as INTEGER(), in the select clause.
WITH ordered_quantities AS (
SELECT *
FROM quantities
ORDER BY Timestamp
),
RECURSIVE total_quantities(Timestamp, Quantity, TotalQuantity) AS (
SELECT
Timestamp,
Quantity,
Quantity
FROM ordered_quantities
WHERE Timestamp = 1
UNION ALL
SELECT
q.Timestamp,
q.Quantity,
q.Quantity + tq.TotalQuantity
FROM quantities q
JOIN total_quantities tq
ON tq.Timestamp = q.Timestamp - 1
)
SELECT
Timestamp,
Quantity,
INTEGER(TotalQuantity) AS TotalQuantity
FROM total_quantities;
Thank you for your answer.
How can one specify the type of a column in a CSV?
I modified my query slightly differently than you did in your solution such that I impose the column types in the first table in the CTE and then sort the rows in a second table:
WITH typed_quantities AS (
SELECT
INTEGER(Timestamp) AS Timestamp,
INTEGER(Quantity) AS Quantity
FROM quantities
),
ordered_typed_quantities AS (
SELECT *
FROM typed_quantities
ORDER BY Timestamp
),
RECURSIVE total_quantities(Timestamp, Quantity, TotalQuantity) AS (
SELECT
Timestamp,
Quantity,
Quantity
FROM ordered_typed_quantities
WHERE Timestamp = 1
UNION ALL
SELECT
q.Timestamp,
q.Quantity,
q.Quantity + tq.TotalQuantity
FROM ordered_typed_quantities q
JOIN total_quantities tq
ON tq.Timestamp = q.Timestamp - 1
)
SELECT *
FROM total_quantities;
The types of values are converted when the values are calculated as necessary.
For example, in arithmetic operations, the values will be converted to integer or float. When using a function, the values will be converted to suitable types for that function. If that value cannot be converted to the suitable type, the value will be converted to Null.
e.g.
SELECT DATETIME(q.Timestamp) AS Datetime, -- Datetime
INTEGER(q.Quantity) AS Quantity, -- Integer
q.Quantity || ' pieces' AS Pieces, -- String
q.Quantity / (SELECT SUM(q2.Quantity) FROM `quantities.csv` q2) AS Percentage -- Float
FROM `quantities.csv` q;
Please refer to the following link for details about type conversion. https://mithrandie.github.io/csvq/reference.html https://mithrandie.github.io/csvq/reference/value.html#automatic_type_casting
For the following recursive CTE query and CSV input file,
csvq
incorrectly or inconsistently right justifies columnTotalQuantity
in all but the first row of the result set, but correctly (or consistently) left justifies the numbers in the first two columns.Should
csvq
left or right justify numbers in the columns of a result set?Query
select-running-total-quantity.sql
:Input file
quantities.csv
:Result set:
Notice that in column
TotalQuantity
only the first row is left justified as are all of the numbers in columnsTimestamp
andQuantity
.More importantly, however, the good news the result of the query is correct!