testcontainers / testcontainers-dotnet

A library to support tests with throwaway instances of Docker containers for all compatible .NET Standard versions.
https://dotnet.testcontainers.org
MIT License
3.73k stars 262 forks source link

[Enhancement]: Allow users to override database name in the sql connection string #1029

Closed maxcalvin closed 10 months ago

maxcalvin commented 10 months ago

Problem

Essentially this is due to WithDatabase being private on the MsSql and SqlEdge packages. I'm not going to rehash what has already been said about this being an odd idea, but instead focus on a solution.

We have started looking into TestContainers for our unit tests, and have put together a scenario that we are planning to use. It goes a little something like this:

And then:

Solution

My proposal is to add an action to the GetConnectionString that allows you to override the hardcoded strings. At minimum the database (initial catalog), but might as well allow folk to set the user and password if they have some custom setup that adds them into a pre-built image.

I think that it might be desirable that each connection string may be configured differently in a way that is simple to understand and does not require regexing the connection string or using a connection string builder - both of which feel detached from the container.

I believe that this is a valid use of the MsSql and SqlEdge packages/helpers/wrappers, as the resulting images are still likely to be wanted to end up compatible - ie the same names for env variables.

var realConnectionString = container.GetConnectionString(opt => opt.Database = "MyAppData");

var realConnectionString2 = container.GetConnectionString(opt =>
    {
        opt.Database = "MyAppData";
        opt.Username = "someuser";
        opt.Password = "somepassword";
    });

If this (or any alternative) is acceptable, please let me know and I will put together a PR for your consideration!

Benefit

I guess the primary benefit is that people will have a more obvious way of connecting to the correct database. I cannot imagine that anyone using a (mssql/sqledge) database would go ahead and start modifying the master db, and indeed various packages and setups will blow up if you tried to do this.

It means less folk would come into the issues and request an enhancement or raise a bug about how to set the database name.

For reference, this was our experience of trying to set the database:

Alternatives

Current workaround:

There are many options for solutions to this but I think it boils down to a handful:

  1. Make WithDatabase public on MsSqlBuilder/SqlEdgeBuilder

    • I don't think this will get through PR for reasons already stated, so let's ignore it
    • I don't think the container object should care about the databases anyway
  2. Add something like UsingDatabase on MsSqlContainer/SqlEdgeBuilder

    • This would be a bit weird as I guess it would be chained like: container.UsingDatabase("MyAppData").GetConnectionString();
    • I don't think the container object should care about the databases anyway, and this seems like it would have to set it in order to chain to the connection string. There are ways around it sure, but it's not obvious

Would you like to help contributing this enhancement?

Yes

HofmeisterAn commented 10 months ago

As previously mentioned, I prefer to avoid providing connection strings for databases that do not exist. Your use case is slightly different, as you generate the database in advance, but others may not, and they could encounter issues. If necessary, you can always create a custom connection string for your needs.

Furthermore, I have suggested expanding the capabilities of the Testcontainers MSSQL module by overriding the entrypoint, allowing the container to execute custom scripts. This will enable us to upload and execute SQL scripts to create a database, user, and more, if they do not already exist, during the container startup process, and make the members WithDatabase(string), WithUsername(string) accessible to the public. This will significantly increase the value of the module and is beneficial for a wide range of use cases that developers can take advantage of.

However, I personally do not consider it a top priority. If someone wants to contribute and implement it properly into the module, I am happy to do the review.

This is essentially the idea. Instead of executing the default entry point, the Testcontainers MSSQL module runs a shell script that executes SQL scripts from /docker-entrypoint.d/ before indicating readiness of the container. This ensures that customizations can be executed in advance. Additionally, this approach has the advantage that developers can upload their own custom scripts to /docker-entrypoint.d/ (we can store the templates as embedded resources in the assembly):

protected override MsSqlBuilder Init()
{
    base.Init()
        .WithEntrypoint("/bin/sh", "-c")

        // The entry point script starts the SQL server and runs the SQL scripts from /docker-entrypoint.d/
        .WithCommand("entrypoint.sh")

        // Uploads the entry point script before the container starts
        .WithResourceMapping(Encoding.Default.GetBytes("#!/bin/bash ..."), "/entrypoint.sh");
}

public MsSqlBuilder WithDatabase(string database)
{
    return Merge(DockerResourceConfiguration, new MsSqlConfiguration(database: database))
        .WithEnvironment("SQLCMDDBNAME", database)

        // Uploads the SQL script to create a database before the container starts
        .WithResourceMapping(Encoding.Default.GetBytes("SELECT 1;"), "/docker-entrypoint.d/__01_create_database.sql");
}

public MsSqlBuilder WithUsername(string database)
{
    return Merge(DockerResourceConfiguration, new MsSqlConfiguration(username: username))
        .WithEnvironment("SQLCMDUSER", username)

        // Uploads the SQL script to create a user before the container starts
        .WithResourceMapping(Encoding.Default.GetBytes("SELECT 1;"), "/docker-entrypoint.d/__02_create_username.sql");
}
maxcalvin commented 10 months ago

I understand, and I did read the linked enhancement and the suggestion of the entry point implementation, but it wasn't something that I was planning to get into personally. As you say it's not a priority for you and it's not something we would use either - is it something that has a lot of requests?

There are a couple of points here that I might address though.

  1. If while adding the entry point code, you are happy adding the .WithDatabase and .WithUser - well we essentially end up in the scenario where now the user has the options that they were wanting (sort of). There's nothing linking the two With commands with the entry point setup (which is fine). On the plus side, it's a win for us as we can use the WithDatabase to set the name without using the entry point setup code too, but it also means the entry point code isn't necessary for the WithDatabase call to be public.
  2. I imagine that a very very (almost inevitably) common scenario is to run the container, get the master connection string, then send the commands to master to create the new database, maybe users with various permissions, etc, followed by a need to connect to that new database, perhaps as a DDL user to create schema, a writeable user to setup data, maybe a readonly user to verify data. So why not make it simple for the user to get those new connection strings with the appropriate data?
  3. The suggested entry point changes make it seem quite singular. There's no reason that the server has to be restricted to 'master+1' databases. I can imagine a company who makes a db per test class for better parallelisation of test runs. Or perhaps has multiple databases to set up and instead of one container per db there's nothing stopping them from using one server.

I understand that you don't want to allow a user to connect to a database that doesn't exist, but you can't know whether one exists or not without calling to the db which seems a bit overkill. If the user has specified a database they likely have a good reason and if they screw up, they'll get an appropriate error when trying to connect. It's more confusing to not be able to specify the database name (hence all these bug/enhance/issues) than to allow a probably competent developer to request a suitable connection string without jumping through more hoops or finding workarounds to this behaviour.

I like what you're doing here with TestContainers but it does have some design decisions that have me scratching my head, like this one (and the builder syntax instead of configuration actions/objects). We have a hacky workaround updating the connection string and we can live with that, but it would nice to get a cleaner way to work with the db connection string directly from the library.

HofmeisterAn commented 10 months ago

To modify the connection string, it only takes two lines. Using SqlConnectionStringBuilder, the connection string can be easily read, and individual values can be changed. Moreover, this has the advantage that all properties are available, not just a few selected ones. One can hardly call this a hacky workaround.

var connectionString = new SqlConnectionStringBuilder(_msSqlContainer.GetConnectionString());
connectionString.InitialCatalog = Guid.NewGuid().ToString("D");

An overloaded method GetConnectionString() does not really do anything different in the end. At least the use of SqlConnectionStringBuilder is explicit, and developers do not have to stumble upon the overloaded method that might not establish a connection as expected (because they did not create the database in advance). It is also not Testcontainers' responsibility to reimplement SqlConnectionStringBuilder.

Furthermore, we cannot access SqlConnectionStringBuilder in MsSqlContainer. We aim to minimize transitive dependencies in Testcontainers to reduce the chance of dependency clashes.

Running a Testcontainers module always provides a configured instance that is ready to accept client connections. Supporting WithDatabase and WithUsername directly via the builder has the advantage that the MSSQL module feels and behaves like all other modules.

Implementing the suggestion may require a bit more effort (by the way, it is Microsoft's recommended approach too), but it will feel much more integrated into Testcontainer's module ecosystem.

I understand that it is more than what you are asking for, but the Testcontainers community and MSSQL developers benefit significantly.

@michaelmcneilnet 👎 once again?

maxcalvin commented 10 months ago

Righto, well I have put my arguments out there but it's your repo and it doesn't seem like you will budge so there's not more to say :)

Cheers for the conversation. I still disagree but I don't want to take up more of your time with this topic.

HofmeisterAn commented 10 months ago

If you like, you can leave the issue open. Perhaps someone will pick it up and contribute in the future with an approach that aligns with the module's idea. By the way, this is not something new in Testcontainers. For instance, if you compare the efforts we put into spinning up a functional Couchbase instance that developers can use immediately, the adjustments I am requesting for the MSSQL module are quite minor.