Giorgi / DuckDB.NET

Bindings and ADO.NET Provider for DuckDB
https://duckdb.net
MIT License
338 stars 61 forks source link

Named parameters throw AccessViolationException when debugging #178

Closed ntsim closed 1 month ago

ntsim commented 3 months ago

When running a query like:

command.CommandText = "SELECT * FROM 'some.parquet' WHERE some_field IN ($p0, $p1)";
command.Parameters.Add(new DuckDBParameter("p1", things[0]));
command.Parameters.Add(new DuckDBParameter("p2", things[1]));

await using var reader = command.ExecuteReader();

When debugging in Rider (haven't tested in Visual Studio), an AccessViolationException seems to be consistently thrown upon executing the reader:

Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at DuckDB.NET.Data.PreparedStatement.BindParameter(DuckDB.NET.DuckDBPreparedStatement, Int64, DuckDB.NET.Data.DuckDBParameter)
   at DuckDB.NET.Data.PreparedStatement.BindParameters(DuckDB.NET.DuckDBPreparedStatement, DuckDB.NET.Data.DuckDBParameterCollection)
   at DuckDB.NET.Data.PreparedStatement.Execute(DuckDB.NET.Data.DuckDBParameterCollection)
   at DuckDB.NET.Data.PreparedStatement.PrepareMultiple(DuckDB.NET.DuckDBNativeConnection, System.String, DuckDB.NET.Data.DuckDBParameterCollection)
   at DuckDB.NET.Data.DuckDbCommand.ExecuteDbDataReader(System.Data.CommandBehavior)
   at DuckDB.NET.Data.DuckDbCommand.ExecuteReader()

No breakpoint is necessary for this to happen.

Not sure if this is an environment specific issue or not, but it's a bit of a showstopper to using named parameters if debugging doesn't work.

Environment

Giorgi commented 3 months ago

Do you mean that the exception happens only during debugging but not when running without a debugger? I made a small change regarding named parameters in the 0.10.1.1 version, can you test with that one?

JMcDanielFbn commented 3 months ago

We are encountering something similar running on osx and linux both with 0.10.1.1 (it's fine with 0.10.1)

Which might have to do with the change you just mentioned, just executing the simplest of SQL:

SELECT 42 as A

    Method not found: 'DuckDB.NET.Native.DuckDBState PreparedStatements.DuckDBBindParameterIndex(DuckDB.NET.Native.DuckDBPreparedStatement, Int32 ByRef, DuckDB.NET.Native.SafeUnmanagedMemoryHandle)'.

-------------------
       at DuckDB.NET.Data.PreparedStatement.BindParameters(DuckDBPreparedStatement preparedStatement, DuckDBParameterCollection parameterCollection)
       at DuckDB.NET.Data.PreparedStatement.Execute(DuckDBParameterCollection parameterCollection)
       at DuckDB.NET.Data.PreparedStatement.PrepareMultiple(DuckDBNativeConnection connection, String query, DuckDBParameterCollection parameters)+MoveNext()
       at DuckDB.NET.Data.DuckDBDataReader.InitNextReader()
       at DuckDB.NET.Data.DuckDBDataReader..ctor(DuckDbCommand command, IEnumerable`1 queryResults, CommandBehavior behavior)
       at DuckDB.NET.Data.DuckDbCommand.ExecuteDbDataReader(CommandBehavior behavior)
       at System.Data.Common.DbCommand.ExecuteDbDataReaderAsync(CommandBehavior behavior, CancellationToken cancellationToken)
Giorgi commented 3 months ago

@JMcDanielFbn Looks like I messed up dependency numbers when publishing the NuGet packages. I re-uploaded 0.10.1.2 which should fix it.

ntsim commented 3 months ago

@Giorgi yes I meant that the exception was only being thrown when using the debugger. However, I've just tested with 0.10.1.2 and seems like everything is fine now!

Thank you for sorting this out :smile:

ntsim commented 3 months ago

Apologies @Giorgi, but I messed up my test. Looks like this is actually still happening :disappointed:

Could you re-open this?

Giorgi commented 3 months ago

I don't use either Ubuntu or Rider so there isn't much I can do to investigate this issue. I also find it strange that it happens only when debugging.

The only thing I can suggest is that you try to reproduce it in a test C application using the DuckDB C API and see if it happens there or not.

Giorgi commented 3 months ago

You could also try reproducing it on Windows.

Giorgi commented 3 months ago

Tried running it in WSL but couldn't reproduce.

JMcDanielFbn commented 3 months ago

Thanks for 0.10.1.2 @Giorgi - worked a treat for my issue.

If it helps I use OSX & Rider - and do not get that issue when debugging.

Any chance you have done a binary build of DuckDB @ntsim ? I found that when I did it took precedence over the packed-here binaries, which cost me a couple more hours than I would be proud to admit. In the end I did a brew uninstall duckdb ... and it was all fine.

Giorgi commented 2 months ago

@ntsim Did you have a chance to try the suggestions from the comments?

ntsim commented 2 months ago

@JMcDanielFbn @Giorgi Apologies for the delay getting back to you on this!

I had the duckdb CLI binary on my PATH, but I didn't expect this to affect anything. I tried moving it out of my PATH, but just got the same thing. To clarify - I'm using DuckDB.NET.Data.Full

I've done some further debugging and depending on how I arrange breakpoints and debug, I can get the following exception to throw on PreparedStatement.BindParameter:

System.NullReferenceException: Object reference not set to an instance of an object.
         at DuckDB.NET.Data.PreparedStatement.BindParameter(DuckDBPreparedStatement preparedStatement, Int64 index, DuckDBParameter parameter)
         at DuckDB.NET.Data.PreparedStatement.BindParameters(DuckDBPreparedStatement preparedStatement, DuckDBParameterCollection parameterCollection)
         at DuckDB.NET.Data.PreparedStatement.Execute(DuckDBParameterCollection parameterCollection)
         at DuckDB.NET.Data.PreparedStatement.PrepareMultiple(DuckDBNativeConnection connection, String query, DuckDBParameterCollection parameters)+MoveNext()
         at DuckDB.NET.Data.DuckDBDataReader.InitNextReader()
         at DuckDB.NET.Data.DuckDBDataReader..ctor(DuckDbCommand command, IEnumerable`1 queryResults, CommandBehavior behavior)
         at DuckDB.NET.Data.DuckDbCommand.ExecuteDbDataReader(CommandBehavior behavior)
         at System.Data.Common.DbCommand.ExecuteDbDataReaderAsync(CommandBehavior behavior, CancellationToken cancellationToken)

In other cases, it's also possible to set the breakpoints so that it just crashes out around PreparedStatement.BindParameters L106/107:

foreach (DuckDBParameter param in parameterCollection)
{
    var state = NativeMethods.PreparedStatements.DuckDBBindParameterIndex(preparedStatement, out var index, param.Paramete
    if (state.IsSuccess())
    {
        BindParameter(preparedStatement, index, param);
    }
}

Once state is resolved, the session will nearly immediately crash without any log output or an exception.

I'm not super sure what's going on, but feels like the issue is within this general region?

FWIW, I have managed to get around this issue by using question mark parameters (?) and extending / hacking the various DuckDB.NET classes to override the default parameter binding behaviour:

public class MyDuckDbConnection(string connectionString): DuckDBConnection(connectionString)
{
    public override MyDuckDbCommand CreateCommand()
    {
        // Bit rubbish to do this but we don't have access to the
        // underlying `Transaction` so we need to get a reference
        // to it through by creating a base command object first.
        var wrappedCommand = base.CreateCommand();

        return new MyDuckDbCommand
        {
            Connection = this,
            Transaction = wrappedCommand.Transaction
        };
    }
}

public class MyDuckDbCommand : DuckDbCommand
{
    protected override DbParameter CreateDbParameter() => new MyDuckDbParameter();
}

public class MyDuckDbParameter : DuckDBParameter
{
    public override string ParameterName
    {
        // This is the hack that allows things to work by 
        // preventing `ParameterName` being set...
        get => string.Empty;
        set => base.ParameterName = string.Empty;
    }
}

Not sure if this the best way of going about things, but it was the only way to get things working. Ideally, the named parameters would work correctly as they've obviously got advantages over question mark parameters.

Giorgi commented 2 months ago

Do you get these exceptions only if debugging? Named parameters do work on all platforms when the tests run on GitHub Actions.

If you can share a reproducible code I'll try to see what's going on but I also suspect it has something to do with your environment.

ntsim commented 2 months ago

Yes, these exceptions only occur whilst debugging with named parameters. They don't occur when debugging with the question mark parameter workaround outlined above.

I've also been using this with Dapper, but don't really think this is the cause as I think I was having these issues with the data reader API too.

I'll try and put something together for you when I get the time.

Giorgi commented 2 months ago

You don't really need that workaround, "question mark" parameters are supported out of the box: https://duckdb.net/docs/basic-usage.html#parameterized-statements

Can you try if you get the exception on other platforms too?

ntsim commented 2 months ago

Ah yes, I meant I need this workaround to use Dapper as it only supports named parameters, generating them with names like p0, p1, etc. This workaround essentially prevents named parameters being used and forces the usage of question mark parameters instead.

Sure, I'll try and check it out in Windows as well!

Giorgi commented 2 months ago

@ntsim Dapper supports so-called "pseudo-positional parameters". See if it helps in your case: https://github.com/DapperLib/Dapper/pull/1952#issuecomment-1690539542

Giorgi commented 1 month ago

Closing due to inactivity.

andriibratanin commented 6 days ago

Hey, @Giorgi

I want to share my findings on this type of errors.

The debugging session may throw one of two following exceptions:

System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.

or

System.NullReferenceException: Object reference not set to an instance of an object.
   at DuckDB.NET.Data.Internal.PreparedStatement.BindParameter(DuckDBPreparedStatement preparedStatement, Int64 index, DuckDBParameter parameter)
   at DuckDB.NET.Data.Internal.PreparedStatement.BindParameters(DuckDBPreparedStatement preparedStatement, DuckDBParameterCollection parameterCollection)
   at DuckDB.NET.Data.Internal.PreparedStatement.Execute(DuckDBParameterCollection parameterCollection, Boolean useStreamingMode)
   at DuckDB.NET.Data.Internal.PreparedStatement.PrepareMultiple(DuckDBNativeConnection connection, String query, DuckDBParameterCollection parameters, Boolean useStreamingMode)+MoveNext()
   at DuckDB.NET.Data.DuckDBDataReader.InitNextReader()
   at DuckDB.NET.Data.DuckDBDataReader..ctor(DuckDBCommand command, IEnumerable`1 queryResults, CommandBehavior behavior)
   at DuckDB.NET.Data.DuckDBCommand.ExecuteDbDataReader(CommandBehavior behavior)
   at DuckDB.NET.Data.DuckDBCommand.ExecuteReader()
   at DuckDB.NET.Data.DuckDBCommand.ExecuteScalar()
   at DuckDbTests.UnitTest1.Test1() in C:\MyProjects\DuckDbExperiments\DuckDbTests\UnitTest1.cs:line 18
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Void** arguments, Signature sig, Boolean isConstructor)
   at System.Reflection.MethodBaseInvoker.InvokeWithNoArgs(Object obj, BindingFlags invokeAttr)

Facts:

Possible cause: I suppose it is related to how debugger handles marshalling.

Example place where it fails is DuckDB.NET.Data.Internal.PreparedStatements class, BindParameters method: https://github.com/Giorgi/DuckDB.NET/blob/9c0dbf81801ae584a32c010f5a372846f3a72618/DuckDB.NET.Data/Internal/PreparedStatement.cs#L108
There is a call to:

[DllImport(DuckDbLibrary, CallingConvention = CallingConvention.Cdecl, EntryPoint = "duckdb_bind_parameter_index")]
public static extern DuckDBState DuckDBBindParameterIndex(DuckDBPreparedStatement preparedStatement, out int index, SafeUnmanagedMemoryHandle name);

after which the param cycle variable is messed up and debugger reports its value as "not an object".

Workaround:
Go to Rider's Settings Settings -> Build, Execution, Deployment -> Debugger, find the JIT (excluding Mono) section and uncheck the "Disable JIT optimization on module load" option - after this debugging of DuckDB.NET related code turns possible :)

As you see, the issue is not in your code, but in debugger optimizations.
Thank you for your effort!

Giorgi commented 5 days ago

@andriibratanin Thanks for the detailed investigation and for providing a workaround. Did you report the issue to JetBrains?

andriibratanin commented 4 days ago

@Giorgi, I've just created it in their YouTrack system, but it can take ages to fix... And, BTW, I was able to reproduce the issue in Visual Studio too.

What I'd recommend you to do is place the related information on this issue on some page so it can no longer be a "show stopper"/"deal breaker" for those people trying DuckDb.NET and building PoCs. DuckDb is a rising star and you've done a lot of work ;)