xo / usql

Universal command-line interface for SQL databases
MIT License
8.94k stars 352 forks source link

Discussion: Metadata; internal to external data type conversion #360

Open Allam76 opened 2 years ago

Allam76 commented 2 years ago

Following out discussion in #359 and #358 here comes a possible implementation. Note this is not intended to be a ready fix but rather an idea on what I have in mind.

I've added 2 new fields to the metadata.Column. Internal and external data type. External is the one that is well, external and should be uniform between databases. The internal comes from the fact that most databases allow to write the same data type in many different format. VCHAR, VARCHAR, VARIABLE CHARACTER, etc all are the same type. This is the underlying type to reduce the number entries in the mapping.

It is implemented for only two databases (postgres and sqlite) that covers a bit of the spectrum of available databases.

I'm also agreeing that the metadata maybe should be extracted into a more general purpose package.

nineinchnick commented 2 years ago

What is the purpose of having unified types? Would it be used only for display in the describe commands (\d) or for formatting values when presenting query results? This is kind of a slippery slope, because you'd have to be careful when mapping types to make sure they match, that is any conversion is not lossy. This is true even when only displaying them to users.

If there are equivalent types in a database, like PostgreSQL, maybe consider normalizing them inside a single metadata reader?

Allam76 commented 2 years ago

I have two use cases: 1) Federated SQL engine. This is an engine without own storage. So it uses other (several) databases underneath. Like one database to rule them all. Presto/Trino is an example in the enterprise sphere. This engine expects a metadata format and writing one mapping per individual database is tedious.

2) Migration. I have an ancient database with 14 K tables to migrate with metadata in yet another database so total databases to handle is three. Then having unified metadata is a great plus.

I'll also have a look at your xo for the migration. This could potentially be helpful.

nineinchnick commented 2 years ago

I'm a Trino contributor and I work for Starburst Data. I wrote a few Trino connectors so I have hands on experience with data type mapping.

You have not answered my questions why unifying types helps in those use cases. Right now it looks like you only need to read types. My concern is that you can't simply define type aliases, without also defining how to convert data to avoid loosing precision.

Allam76 commented 2 years ago

Aha ok, I understand your concern now. I was simple assuming from my exploration that the rest of the data needed to avoid loosing precision is available as is. That is, only external data type was needed to be added.

If not, then indeed I have a bigger problem.

nineinchnick commented 2 years ago

Maybe we could report the Go's scan type, but I have a very subjective feeling most drivers don't implement this at all. Since there are very few Go types, I'm not sure if that'll be useful.