xerial / sqlite-jdbc

SQLite JDBC Driver
Apache License 2.0
2.83k stars 616 forks source link

JDBC reports wrong type INTEGER but value is LONG #604

Closed sproket closed 2 years ago

sproket commented 3 years ago

using 3.34.0 sqlite-jdbc

Create the following table

    sql = "CREATE TABLE DateTestSQLTypes ( " +
            " ID INT, " +
            " Description VARCHAR(100), " +
            " DateOnly DATE, " +
            " TimeOnly TIME," +
            " UtilDateAndTime DATETIME, " +
            " DateAndTime DATETIME) ";

With JDBC insert a row. Then read back the row.

using ResultsetMetaData look at the types (for TimeOnly) column.

JDBC java.sql.Types reports 4 = INTEGER

value read back with getObject is actually a LONG.

If you read back with getInt() you get an integer wrapped back as a negative number (wrong value).

If the value is LONG the java.sql.Types should be -5 BIGINT

DamjanJovanovic commented 2 years ago

Yes this is a highly annoying bug. A workaround to get the real value out of the database is to CAST(columnName AS VARCHAR(50)) in your SQL query, and then read that value as a string, and convert the string to long in Java.

The sqlite ODBC driver has the same problem, but Python's sqlite3 package seems to work and doesn't wrap large positive integers to negative values.

DamjanJovanovic commented 2 years ago

In the ODBC driver you can apparently set the "BigInt=true" connection parameter, which makes all integer come in as BIGINT. What does this mean though? That sqlite doesn't tell its users what the real integer sizes are, and you have to force a particular size?

gotson commented 2 years ago

the root of the issue is that SQLite is not strongly typed, see https://www.sqlite.org/datatype3.html

A storage class is more general than a datatype. The INTEGER storage class, for example, includes 7 different integer datatypes of different lengths. This makes a difference on disk. But as soon as INTEGER values are read off of disk and into memory for processing, they are converted to the most general datatype (8-byte signed integer). And so for the most part, "storage class" is indistinguishable from "datatype" and the two terms can be used interchangeably.

When reading that, i feel like we should always return Long, never Int for the INTEGER columns.

As for date/time, they are stored as INTEGER in SQLite:

SQLite does not have a storage class set aside for storing dates and/or times. Instead, the built-in Date And Time Functions of SQLite are capable of storing dates and times as TEXT, REAL, or INTEGER values:

TEXT as ISO8601 strings ("YYYY-MM-DD HH:MM:SS.SSS"). REAL as Julian day numbers, the number of days since noon in Greenwich on November 24, 4714 B.C. according to the proleptic Gregorian calendar. INTEGER as Unix Time, the number of seconds since 1970-01-01 00:00:00 UTC. Applications can choose to store dates and times in any of these formats and freely convert between formats using the built-in date and time functions.

We should apply the same logic, and always return Long.

gotson commented 2 years ago

I just tried, and all of DATE, TIME, and DATETIME are actually returned as type 12 (VARCHAR) at the moment.

gotson commented 2 years ago

After some more digging in 3.1. Determination Of Column Affinity, the affinity for date and time columns should be NUMERIC. This is well applied in the code already.

Then in 3. Type Affinity:

A column with NUMERIC affinity may contain values using all five storage classes. When text data is inserted into a NUMERIC column, the storage class of the text is converted to INTEGER or REAL (in order of preference) if the text is a well-formed integer or real literal, respectively. If the TEXT value is a well-formed integer literal that is too large to fit in a 64-bit signed integer, it is converted to REAL. For conversions between TEXT and REAL storage classes, only the first 15 significant decimal digits of the number are preserved. If the TEXT value is not a well-formed integer or real literal, then the value is stored as TEXT. For the purposes of this paragraph, hexadecimal integer literals are not considered well-formed and are stored as TEXT. (This is done for historical compatibility with versions of SQLite prior to version 3.8.6 2014-08-15 where hexadecimal integer literals were first introduced into SQLite.) If a floating point value that can be represented exactly as an integer is inserted into a column with NUMERIC affinity, the value is converted into an integer. No attempt is made to convert NULL or BLOB values.

The database metadata can only provide information on the affinity type, which is set at the column level. The actual storage class will depend on each value, and cannot be known by looking at the table info.

I think the current behaviour of returning VARCHAR for date/time columns is the most accurate according to the capacities of SQLite.

gotson commented 2 years ago

I just worked on #532, i think for this problem you should be looking at ResultSetMetaData.getColumnType() as there is some type detection there.

gotson commented 2 years ago

My bad, I didn't read the issue correctly.

Could you provide a sample of the data you insert?

sproket commented 2 years ago

You can grab a copy of Persism here https://github.com/sproket/Persism The BaseTest has a testAllDates method. If you run TestSQLite you can see what's happening in the Converter class. Thanks for looking at this!

gotson commented 2 years ago

Thanks, but ideally I'm looking at some sample posted here. I don't have time to skim through external projects and thousand of lines files to find a reproduction sample.

DamjanJovanovic commented 2 years ago

Thanks, but ideally I'm looking at some sample posted here. I don't have time to skim through external projects and thousand of lines files to find a reproduction sample.

        try (
                Connection connection = DriverManager.getConnection("jdbc:sqlite://path/to/any/db.sqlite");
                Statement statement = connection.createStatement();
                ) {

            ResultSet rows = statement.executeQuery("SELECT FALSE, 1, 3900000000, CAST(3900000000 AS BIGINT), CAST(3900000000 AS VARCHAR(50))");
            ResultSetMetaData metadata = rows.getMetaData();
            System.out.println("Column 1 type is " + metadata.getColumnType(1) + " and signed " + metadata.isSigned(1));
            System.out.println("Column 2 type is " + metadata.getColumnType(2) + " and signed " + metadata.isSigned(2));
            System.out.println("Column 3 type is " + metadata.getColumnType(3) + " and signed " + metadata.isSigned(3));
            System.out.println("Column 4 type is " + metadata.getColumnType(4) + " and signed " + metadata.isSigned(4));
            System.out.println("Column 5 type is " + metadata.getColumnType(5) + " and signed " + metadata.isSigned(5));
        }

Output:

Column 1 type is 4 and signed true
Column 2 type is 4 and signed true
Column 3 type is 4 and signed true
Column 4 type is 4 and signed true
Column 5 type is 12 and signed false

Column 3 should either be unsigned integer, or signed bigint. Column 4 should be bigint.

gotson commented 2 years ago

@DamjanJovanovic thanks, that's useful.

I found out that the code would check the actual value to determine whether it's a Long or Integer, but only within getObject. I added the same behaviour in getColumnClassName recently.

However it was not done in getColumnType.

While looking at that, i am not very sure about the isSigned behaviour. I checked in SQLite, and it doesn't seem to explicitly support unsigned integers though. Any thought on that?

gotson commented 2 years ago

Could you try the latest snapshot and check if it's better?

gotson commented 2 years ago

@sproket @DamjanJovanovic any luck trying the latest snapshot?

DamjanJovanovic commented 2 years ago

I tried the latest Git.

With the above test: Column 1 type is 4 and signed true Column 2 type is 4 and signed true Column 3 type is -5 and signed true Column 4 type is -5 and signed true Column 5 type is 12 and signed false

which is better.

However when I select from a table containing a number which is outside the bounds of a 32 bit integer, column type is -5, but "signed" is always false, even for negative numbers!

sproket commented 2 years ago

@sproket @DamjanJovanovic any luck trying the latest snapshot?

I'll be able to try this weekend.

gotson commented 2 years ago

However when I select from a table containing a number which is outside the bounds of a 32 bit integer, column type is -5, but "signed" is always false, even for negative numbers!

Can you provide a sample test?

sproket commented 2 years ago

I can confirm that my original issue (jdbc reporting INT sqltype 4) but read as long was occurring for me with 3.36.0.2 but it works now with 3.39.3.0. Thanks! I'll still have to keep the older version for my code coverage but nice work!

gotson commented 2 years ago

I can confirm that my original issue (jdbc reporting INT sqltype 4) but read as long was occurring for me with 3.36.0.2 but it works now with 3.39.3.0. Thanks! I'll still have to keep the older version for my code coverage but nice work!

Thanks a lot for confirming!

However when I select from a table containing a number which is outside the bounds of a 32 bit integer, column type is -5, but "signed" is always false, even for negative numbers!

Can you provide a sample test?

@DamjanJovanovic I will close the current issue as it's been resolved. Can you open a new issue regarding isSigned?