Giorgi / DuckDB.NET

Bindings and ADO.NET Provider for DuckDB
https://duckdb.net
MIT License
411 stars 69 forks source link

DuckDBTimeOnly.Msec can overflow DateTime #89

Closed acinep closed 1 year ago

acinep commented 1 year ago

When DuckDBDataReader.GetDateTime is called to process a DuckDBResult, it parses the value as a DuckDBTimestampStruct:

    public override DateTime GetDateTime(int ordinal)
    {
        var timestampStruct = NativeMethods.Types.DuckDBValueTimestamp(queryResult, ordinal, currentRow);
        var timestamp = NativeMethods.DateTime.DuckDBFromTimestamp(timestampStruct);
        return timestamp.ToDateTime();
     }

However, the Msec field can hold a greater precision than the DateTime clr type - and will throw ArgumentOutOfRangeException in DuckDBTimestamp.ToDateTime(). DateTime type limits the milliseconds to 0-999 whereas the DuckDB struct can run into the microseconds.

Truncating the microseconds works, and is easy enough - but wanted to raise it as an issue for discussion as to the appropriate solution if MSec > 999.

Giorgi commented 1 year ago

It's the same issue as https://github.com/Giorgi/DuckDB.NET/issues/74 right?

acinep commented 1 year ago

Yes - no idea how I missed that…sorry!

Happy to do a PR if you have any thoughts/preferences on the options above

Giorgi commented 1 year ago

@acinep I think throwing an exception is best.

acinep commented 1 year ago

Hi Giorgi,

I think there is a more fundamental issue here. The DuckDB timestamp structs hold microseconds (1/1000th of a millisecond). The DuckDB.NET bindings treat this int value as milliseconds, so if there are more than 999 microseconds in the timestamp it throws.

This means it cannot handle any duckdb_time with > 999 microseconds (which includes all my parquet files!). A simpler example - if you round trip a System.DateTime with >0 milliseconds using Parquet.Net then it will fail.

You can see the underlying issue in TimeTests:QueryScalarTest. The command text formats milliseconds as microseconds:

cmd.CommandText = $"SELECT TIME '{hour}:{minute}:{second}.{millisecond:000000}';";

...and therefore gets away with it. If the milliseconds are formatted correctly {millisecond:000} then the test fails with ArgumentOutOfRangeException.

Giorgi commented 1 year ago

Do you mean that there is a bug in converting from duckdb timestamp to DateTime?

acinep commented 1 year ago

Yes. e.g.

public DateTime ToDateTime()
{
    var date = DuckDBDateOnly.MinValue;
    return new DateTime(date.Year, date.Month, date.Day, Hour, Min, Sec, Msec);
}

MSec is in microseconds, not milliseconds. I will send a PR.

Giorgi commented 1 year ago

I saw that too. I think there are similar issues in other places too. Can you join DuckDB Discord? We can discuss these bugs in the dotnet channel there.