Open bweber opened 4 months ago
Is there something wrong with your Hangfire instance? Does this exception affect how the jobs are processed?
Does it happen if you use just a connection string without a connection factory?
Please attach all of your Hangfire configuration code. Right now it is not clear how you pass the connection factory to Hangfire.
An example using a connection factory can be found here: https://github.com/hangfire-postgres/Hangfire.PostgreSql/issues/322#issuecomment-1710694316
We are using a managed identity with Google Cloud, so we register our NpgsqlDataSource like this:
    services.AddSingleton(BuildGoogleDataSource(configuration));

    private static NpgsqlDataSource BuildGoogleDataSource(IConfiguration configuration)
    {
        var credentials = GoogleCredential.GetApplicationDefault();
        var scopedCredentials = credentials.CreateScoped("https://www.googleapis.com/auth/sqlservice.login");
        var dataSourceBuilder = new NpgsqlDataSourceBuilder();
        dataSourceBuilder.UsePeriodicPasswordProvider((_, cancellationToken) =>
            new ValueTask<string>(scopedCredentials.UnderlyingCredential
                .GetAccessTokenForRequestAsync(cancellationToken: cancellationToken)),
            TimeSpan.FromMinutes(1), TimeSpan.FromSeconds(0));
        dataSourceBuilder.ConnectionStringBuilder.ConnectionString = configuration.GetConnectionString("MyDatabase");
        return dataSourceBuilder.Build();
    }
Our Hangfire configuration is like this:
    services.AddSingleton<IConnectionFactory, ConnectionFactory>();
    services
        .AddHangfire((sp, options) =>
        {
            options.UsePostgreSqlStorage(o => o.UseConnectionFactory(sp.GetRequiredService<IConnectionFactory>()),
                new PostgreSqlStorageOptions { PrepareSchemaIfNecessary = false });
        })
        .AddHangfireServer(o => o.Queues = ["default"]);
The ConnectionFactory is in my original post.
Using the NpgsqlDataSource works perfectly in our health check and Entity Framework configuration. We are only seeing connection exceptions using this with Hangfire.
> We are only seeing connection exceptions using this with Hangfire.
So what is the real problem you are trying to solve? Is Hangfire not working for you? Also, what severity level is the exception logged with: error or warning?
This exception comes from the distributed lock inside the recurring job scheduler. It might mean that you have two or more servers trying to perform scheduled operations while only one of them manages to acquire the lock. That is not a bug, just normal operation.
Alternatively, there has been some work in Hangfire to enable parallel execution of recurring jobs even within one server. That could potentially lead to similar exceptions, but I don't have experience with that new feature yet.
The main problem we are running into is that these are surfacing as unhandled exceptions and triggering alerts in our application monitoring services. We are seeing the same exception in other places as well, such as ServerHeartbeatProcess.
It doesn't seem to be impacting Hangfire itself: I can see it handling background jobs, and the dashboard works. It looks like some sort of connection timeout that isn't being handled gracefully: the next use of the connection throws an exception, and then a new connection is obtained. Since this is internal to the Hangfire/Postgres library, my options for resolving it are limited.
I could put some sort of filter in our logging config in our appsettings to downgrade this to a warning, but that may mask other issues in the future.
Thoughts?
Is there a recommended way to implement the ConnectionFactory? Here is what I have done:
Both ConnectionFactory and NpgsqlDataSource are registered as singletons:
services.AddSingleton<IConnectionFactory, ConnectionFactory>();
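For reference, a factory of this kind might look roughly like the following. This is only a sketch, not the code from the original post: it assumes that Hangfire.PostgreSql's IConnectionFactory exposes a single GetOrCreateConnection() method, and that the storage opens the returned connection itself if it is closed and disposes it when done. The field name and the choice to return an unopened connection are illustrative assumptions.

```csharp
using Hangfire.PostgreSql;
using Npgsql;

// Hypothetical sketch of a data-source-backed factory (assumption, not the poster's code).
public sealed class ConnectionFactory : IConnectionFactory
{
    // Singleton data source registered in DI; it owns the pool and the password provider.
    private readonly NpgsqlDataSource _dataSource;

    public ConnectionFactory(NpgsqlDataSource dataSource)
    {
        _dataSource = dataSource;
    }

    // Returns a new, unopened connection from the data source on every call.
    // Assumption: the caller opens it if needed and disposes it afterwards,
    // so no connection instance is cached or shared between callers here.
    public NpgsqlConnection GetOrCreateConnection()
    {
        return _dataSource.CreateConnection();
    }
}
```

The point of creating a connection per call is that a singleton factory never holds on to one long-lived NpgsqlConnection; whether that matches the behavior reported in this thread is not confirmed.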
I am seeing a ton of connection exceptions in the logs:
This seems to happen after a couple of minutes.