Closed jamesaoverton closed 3 months ago
I've run into this issue before and it is annoying. I think that adding a format column to the datatype table makes a lot of sense. I assume it will be optional?
Unfortunately the syntax of format()
is very different in sqlite and postgresql. While sqlite supports more or less the same options as C's printf function, PostgreSQL does not, instead only supporting a very limited syntax which is not very useful for the purposes of this issue.
However, there is a rust equivalent of C's printf function that is packaged in a separate crate: https://docs.rs/sprintf/latest/sprintf/. The advantage of using rust is that we avoid having to worry about syntax differences between different database engines.
Ok, good to know. I was hoping to push this down into the SQL layer, including the *_text
views, so that users could just make SQL queries instead of running our Rust code. But it's more important for our format configuration to be consistent across SQLite and Postgres.
If we're going to use Rust, then I'd like to consider alternatives to printf syntax, which I find inscrutable.
After discussing this on our call, I'm convinced that printf
syntax is the best option. Despite being ugly, it's widely supported, which means people can use the formatting configuration from SQLite or Python or whatever.
Related to #42, we need to be able to roundtrip tables with numbers. For example, in the penguins test the "culmen_length" column has datatype "positive_decimal", which expects a decimal point and has SQL type "NUMERIC". The penguins.tsv has values like "50.0". By default, SQLite will display this as "50", so VALVE
save_table()
will save it as "50", which will then fail to validate if loaded again.SQL provides a
format()
function, based onprintf
. I findprintf
format strings ugly and unintuitive, but they are widely used, widely supported, and quite flexible. There might be some variation, however, which could cause problems.We should consider allowing a "format" column for the "datatype" table that would specify a SQL format string, then use it in
save_table()
.First I'd just like to hear your thoughts @lmcmicu.