hangfire-postgres / Hangfire.PostgreSql

PostgreSql Storage Provider for Hangfire
Other
363 stars 132 forks source link

System.InvalidCastException when querying for postgis Point if Hangfire is being setup before the application context #322

Open Cherepoc opened 1 year ago

Cherepoc commented 1 year ago

Hi! I have a console application "worker" that performs background tasks using Hangfire with postgresql storage. I have postgis extension installed and some of my entities have properties with the type Point. I configure both the database context and hangfire through DI services. Some time ago I found this error that was frequently happening in worker: System.InvalidCastException: Can't cast database type . to Point It turned out that if a hangfire service like IRecurringJobManager is requested before the DbContext then anytime you want to load an entity with a Point property, it fails, but if the DbContext is requested before the IRecurringJobManager, then everything works fine. Here's an example project. You'll get the same error if you run it (you should have a postgres database of course and a correct connection string). In Startup method of the Program class if you move getting IRecurringJobManager to the bottom, then you won't get any error, even if request a new ApplicationDbContext again after it. NpgSqlBugDemo.zip

azygis commented 1 year ago

This looks like the same issue we had a while ago, at least similar.

Hangfire.PostgreSql does not change any type mappings so I'm exactly sure what to make of it.

Please see the steps in this comment. Hopefully @dmitry-vychikov is able to help as I personally don't have time in the near future for debugging this. First step is changing the order and it seems to work, so it's interesting to see if the type mappings are registered correctly which you can see while debugging.

dmitry-vychikov commented 1 year ago

Hi @Cherepoc

I'm on vacation now and not able to look at the zip file yet.

It turned out that if a hangfire service like IRecurringJobManager is requested before the DbContext then anytime you want to load an entity with a Point property, it fails, but if the DbContext is requested before the IRecurringJobManager, then everything works fine.

It feels like Point registration is done lazily in DbContext DI handler. Therefore, Hangfire is unable to map Points until you request the DbContext.

This doesn't seem like a hangfire issue on its own, but it creates confusion. NpgsqlConnection.GlobalTypeMapper static object is the main culprit.

Workaround @Cherepoc

If you need a workaround, then you can:

Improvements in hangfire Npgsql introduced new NpgsqlDataSource (https://www.npgsql.org/doc/types/enums_and_composites.html?tabs=datasource) approach to move away from static objects. @azygis I think we can do two things in hangfire:

1) Migrate to NpgsqlDataSource to isolate hangfires db connection from the rest of the application, and provide levers for configuration of postgres extension types.

2) Update the docs to explain how hangfire works with postgres extensions.

Anyway, I will get back from vacation in 1-2 weeks, double check the repro to confirm my thoughts, and reply here.

azygis commented 1 year ago

Thanks for taking the time replying during the vacation!

NpgsqlDataSource is tricky. There is no DbDataSource in pre-net7.0, similar to DbBatch which I wanted to use in a few places while targeting net5.0. Abstraction is in netstandard, but the implementation is (and forever will be) missing in earlier versions of dotnet. This means we cannot fully migrate to NpgsqlDataSource until November 12, 2024, which is .NET 6 EOL, as .NET 7 is the first version with DbDataSource.

We probably can add a wrapper factory which either uses the data source or connection string directly, but it feels cumbersome and still just a band-aid.

Another option is to move to a different versioning scheme and have multiple "active" package versions.

Cherepoc commented 1 year ago

@dmitry-vychikov Thanks! I've also thought about manually requesting dbinstance too, but it turned out it does not work either. In my worker app this leads to a Dapper error being thrown by the Hangfire: Error parsing column 3 (FetchedAt=2023-09-05T11:51:04Z - Object) at Dapper.SqlMapper.ThrowDataException(Exception ex, Int32 index, IDataReader reader, Object value) in /_/Dapper/SqlMapper.cs:line 3706

Unfortunately it does not specify what exactly it tries to convert, but after some experimentation I managed to repeat the error in the example app under some specific conditions:

Here's the exception: Execution loop Worker:5bee904a caught an exception and will be retried in 00:00:16 System.Data.DataException: Error parsing column 3 (FetchedAt=2023-09-05T11:58:41Z - Object) ---> System.InvalidCastException: Unable to cast object of type 'NodaTime.Instant' to type 'System.Nullable1[System.DateTime]'. at Deserialize74200177-2080-4fb6-a5a9-3ae546fad977(IDataReader) --- End of inner exception stack trace --- at Dapper.SqlMapper.ThrowDataException(Exception ex, Int32 index, IDataReader reader, Object value) in /_/Dapper/SqlMapper.cs:line 3706 at Deserialize74200177-2080-4fb6-a5a9-3ae546fad977(IDataReader) at Dapper.SqlMapper.QueryImpl[T](IDbConnection cnn, CommandDefinition command, Type effectiveType)+MoveNext() in /_/Dapper/SqlMapper.cs:line 1113 at System.Collections.Generic.List1..ctor(IEnumerable1 collection) at System.Linq.Enumerable.ToList[TSource](IEnumerable1 source) at Dapper.SqlMapper.Query[T](IDbConnection cnn, String sql, Object param, IDbTransaction transaction, Boolean buffered, Nullable1 commandTimeout, Nullable1 commandType) in /_/Dapper/SqlMapper.cs:line 734 at Hangfire.PostgreSql.PostgreSqlJobQueue.<>cDisplayClass13_0.b0() at Hangfire.PostgreSql.Utils.Utils.TryExecute[T](Func1 func, T& result, Func2 swallowException, Nullable1 tryCount) at Hangfire.PostgreSql.PostgreSqlJobQueue.Dequeue_Transaction(String[] queues, CancellationToken cancellationToken) at Hangfire.PostgreSql.PostgreSqlJobQueue.Dequeue(String[] queues, CancellationToken cancellationToken) at Hangfire.PostgreSql.PostgreSqlConnection.FetchNextJob(String[] queues, CancellationToken cancellationToken) at Hangfire.Server.Worker.Execute(BackgroundProcessContext context) at Hangfire.Server.BackgroundProcessDispatcherBuilder.ExecuteProcess(Guid executionId, Object state) at Hangfire.Processing.BackgroundExecution.Run(Action2 callback, Object state)

Here's an updated example app NpgSqlBugDemo-2.zip

dmitry-vychikov commented 1 year ago

@Cherepoc

I've also thought about manually requesting dbinstance too, but it turned out it does not work either. In my worker app this leads to a Dapper error being thrown by the Hangfire:

I do not believe this is related to the original problem. This error is caused by usage of NodaTime which seems to be not natively supported by Dapper. You can check this thread https://github.com/DapperLib/Dapper/issues/198. Maybe some workaround proposed there can help. Or disable NodaTime completely.

Explanation

NpgsqlDataSource mentioned above should fix the problem. I think, we don't really want to run NodaTime with Hangfire. But as @azygis mentioned, it is not straightforward because of strange Microsoft's versioning schemes :(

dmitry-vychikov commented 1 year ago

@Cherepoc Here's a fix that allows to keep NodaTime.

1) Define new ConnectionFactory for Hangfire which uses aforementioned NpgsqlDataSource

class MyHangfireConnectionFactory : IConnectionFactory
{
    private readonly NpgsqlDataSource _dataSource;

    public MyHangfireConnectionFactory(NpgsqlDataSource dataSource)
    {
        _dataSource = dataSource;
    }

    public NpgsqlConnection GetOrCreateConnection()
    {
        return _dataSource.CreateConnection();
    }
}

2) Then in AddOmHangfire create datasource and the factory, and pass them to UsePostgreSqlStorage

 public static IServiceCollection AddOmHangfire(this IServiceCollection services, string connectionString)
    {
        //NOTE: create factory outside of AddHangfire lambda. 
        // The datasource MUST be created before `NpgsqlConnection.GlobalTypeMapper` static object is polluted by `o.UseNodaTime` and similar calls 
       // because datasource inherits global mappings at creation time.
        var factory = new MyHangfireConnectionFactory(NpgsqlDataSource.Create(connectionString));
        services.AddHangfire(
            config =>
            {
                var options = new PostgreSqlStorageOptions(...);
                config.UsePostgreSqlStorage(factory, options);
               ...
            }
        );

I briefly tested this with your example, and it does not give me any errors after applying the fix.

Thanks to @azygis for mentioning the wrapper factory and giving me a large hint :)

To @azygis

We probably can add a wrapper factory which either uses the data source or connection string directly, but it feels cumbersome and still just a band-aid.

Well, Hangfire is dependent on 3rd party libs, so we have to account for their design flaws. NpgsqlConnection.GlobalTypeMapper is just one one them.

It is a band aid, but only until we can reference NpgsqlDataSource directly in Hangfire code. I think it should eventually replace the constructor with plain connectionString parameter.

dmitry-vychikov commented 1 year ago
image

By the way, all the supported options for configuring the connection seem redundant and very difficult to maintain because of if-else branching. I would suggest to keep only IConnectionFactory abstraction. For backwards compatibility, all the options like _existingConnection or _connectionSetup could be encapsulated into separate factories.

Another option is to move to a different versioning scheme and have multiple "active" package versions.

Not sure what exactly you meant by different versioning scheme, but here's the third option:

There will be two challenges though:

azygis commented 1 year ago

First of all, you're a legend @dmitry-vychikov!

Not sure what exactly you meant by different versioning scheme

I meant having multiple versions, v1 and v2 and maintaining them both, backporting changes from v2 into v1 for those unfortunate not being able to use data source, at least the changes that are possible.

Edit to say that I actually forgot about conditional compilation. But unfortunately that still requires hard Npgsql 7.0 dependency, which as you mentioned as well is a breaking change. Unless we go reflection all the way with lowest 6, but we might still end up with wrong assumptions between 6 and 7, throwing random stuff at runtime.

Cherepoc commented 1 year ago

@dmitry-vychikov Thanks! Your solution works.

azygis commented 1 year ago

Took the advice of using the IConnectionFactory for the bootstrapping and obsoleted the other ways in #326. I might be able to release v2 similar to how we worked with Npgsql 5 while we supported both that and a newer version. With the current state of NuGet we can't have a single package which depends on (possibly) multiple versions, unfortunately, so the new package name is inevitable. I hope there's a way to at least stay on the same branch - we'll see.

dmitry-vychikov commented 1 year ago

With the current state of NuGet we can't have a single package which depends on (possibly) multiple versions, unfortunately

I have a feeling that we actually can do it. If we add multiple target frameworks to the package, and then use condition in csproj file to include either Npgsql 6 or 7 depending on target framework used. I'm not 100% sure, but I can try it later today and submit a PR.

Another maybe simpler option is to create a brand new package Hangfire.Postgres.Extensions.Npgsql7 that will contain:

azygis commented 1 year ago

We can't force people to upgrade to Npgsql7 if they're on .NET 7 and still use Npgsql6. Npgsql works with virtually any .net version and we shouldn't put such a hard lock either.

Now the second suggestion I like - a separate library which drags a later dependency. Really less confusing than the previous approach. Will probably go that route.

Edit to add: we could probably still do that with current version. Obsolete members will still be there for the next version(s), but we can already support the data source.

co-dax commented 8 months ago

@dmitry-vychikov thanks for your workaround from above, it works. By the way, I found another workaround that is working as well and may also be starting point of a proper fix/patch. https://github.com/hangfire-postgres/Hangfire.PostgreSql/issues/162#issuecomment-671044622.

@azygis