dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License
13.78k stars 3.19k forks source link

Migrations Bundles #19693

Closed bricelam closed 3 years ago

bricelam commented 4 years ago

We currently have dotnet ef migrations script and database update to apply migrations. Unfortunately, they each have their shortcomings.

migrations script is pretty good at producing SQL scripts (there are some known issues with the scripts), but figuring out what to do with those scripts may not be obvious. Publish from Visual Studio wraps these scripts up in a web deployment package and uses MSDeploy to run them on the server. This works well for SQL Server on Windows, but nothing else.

database update is a lot better at applying migrations. It doesn't have any of the issues the SQL script does. Unfortunately, it requires the .NET Core SDK, your app's source code, and a direct connection to the production database. Having all of these things available where you need them is't always possible.

I propose we introduce a new migrations bundle command. This would create a self-contained EXE that you could use to apply migrations. It would behave the same as database update and let you specify the connection string. This would make SSH/Docker/PowerShell deployment a lot easier. You wouldn't need to copy your source files to a server inside the firewall and you wouldn't need to install the .NET Core SDK. Here is an example of what it might look like to use SSH:

dotnet ef migrations bundle
scp bundle me@myhost.com:./
ssh me@myhost.com "./bundle --connectionString ${DEPLOYMENT_CONNECTION_STRING}"
ssh me@myhost.com "rm bundle"

Taking this further, Visual Studio could do all of this for you like they do today on MSDeploy but when you're deploying to Docker. They could even create a new SSH deployment experience.


TODO

julielerman commented 3 years ago

Yay @ryanrichard19 for uncovering it! :)

marcusturewicz commented 3 years ago

I may have missed this, but how do migrations bundles compare with idempotent sql scripts? For example, if I currently have this:

dotnet ef migrations script -i -p src -o publish-artifacts/sql/script.sql

Then deploy the script with Azure/sql-deploy action.

If I change this to:

dotnet ef migrations bundle -p src
./bundle --connection [my-connection-string]

will it still work in the same way?

julielerman commented 3 years ago

I had the same question and just tested it out last night since you can't see what's in the bundle. (Hmm I think we should have a way to verify bundle contents ... black box can be scary)

It seems to contain all of the migrations and runs just the needed ones.

For my test, I had two migrations and built two bundles: one after creating the first migration and another after creating the second. I ran the first with no db existing and it created the db and ran the single migration . I ran the second and it ran only the second migration. Then I deleted the db and ran the second and it created the db and ran both scripts.

julielerman commented 3 years ago

@bricelam Shall I create an issue requesting a way to discover what migrations are included in a particular bundle?

zeshan321 commented 3 years ago

Is there a way to revert migrations with the bundle?

jaliyaudagedara commented 3 years ago

What do you guys think of adding an argument to support running the migrations in a transaction? I think that's a huge pain point (at least for me, I believe there are a lot of other devs as well).

I just tried 2 migrations, the first one should pass, and the second one introduces a unique constraint that should fail on purpose. When I run the bundle.exe, it throws an error (as expected), but the first migration is applied. This causes the database to be in an inconsistent state.

Most of the time I use migrations scripts and in the pipeline, I am manually wrapping the script execution in a transaction. So either all gets executed or non gets executed.

What do you guys think?

roji commented 3 years ago

@jaliyaudagedara see #22616 for relevant discussion, though that issue discusses both script and non-script ways of applying migrations.

jaliyaudagedara commented 3 years ago

Thanks @roji!

bricelam commented 3 years ago

Shall I create an issue requesting a way to discover what migrations are included in a particular bundle?

Today the bundle is effectively just dotnet ef database update, but I've been thinking it would be helpful to include gestures for these commands too in the bundle:

Filed #25872

ajcvickers commented 3 years ago

@bricelam Yes!

ivanmonteiro commented 3 years ago

I might be missing something, but I tried to create the bundle inside a (linux) docker container.

The bundle file is generated but I cannot run it.

This does not seem to be a permission problem, because the user is root and the bunde has permission:

/app # whoami
root
/app # ls -lah /app/
-rwxr--r--    1 root     root        3.6M Apr 20 10:26 Swashbuckle.AspNetCore.SwaggerUI.dll
-rwxr-xr-x    1 root     root        1.1K Aug 26 23:45 appsettings.Development.json
-rwxr-xr-x    1 root     root        1.1K Aug 26 19:18 appsettings.Production.json
-rwxr-xr-x    1 root     root        1.3K Aug 26 20:53 appsettings.json
-rwxr-xr-x    1 root     root       33.7M Sep  7 23:07 bundle

I tried to run the bundle in two ways: With ./bundle returns:

./bundle
sh: ./bundle: not found

and with the command dotnet bundle returns:

# dotnet bundle
Could not execute because the application was not found or a compatible .NET SDK is not installed.
Possible reasons for this include:
  * You intended to execute a .NET program:
      The application 'bundle' does not exist.
  * You intended to execute a .NET SDK command:
      It was not possible to find any installed .NET SDKs.
      Install a .NET SDK from:
        https://aka.ms/dotnet-download

Note that I'm using the mcr.microsoft.com/dotnet/sdk:6.0-focal image as the builder image to build the application and the bundle. And mcr.microsoft.com/dotnet/aspnet:6.0-alpine to run the application and to execute the bundle file. Here is my Dockerfile:

FROM mcr.microsoft.com/dotnet/sdk:6.0-focal AS build

# Add dotnet-ef tool to create the migrations bundle
RUN dotnet tool install --global dotnet-ef --version 6.0.0-preview.7.21378.4
ENV PATH="${PATH}:/root/.dotnet/tools"

# Restore dependencies of .net core projects
COPY src/*/*.csproj ./src/
COPY Directory.Build.targets Directory.Build.targets
RUN for file in $(ls src/*.csproj); do mkdir -p ${file%.*} && mv $file ${file%.*}; done
RUN dotnet restore "src/BackendApp/BackendApp.csproj"

# Copy everything else and build app
COPY . ./
WORKDIR src/BackendApp
RUN dotnet publish "BackendApp.csproj" -c Release -o /app/out/
RUN dotnet ef migrations bundle --verbose && mv bundle /app/out/

# Final stage using dotnet runtime (alpine)
FROM mcr.microsoft.com/dotnet/aspnet:6.0-alpine AS runtime
WORKDIR /app
EXPOSE 80
ENV PORT=80
COPY --from=build /app/out .

ENV ASPNETCORE_ENVIRONMENT=Production

CMD ASPNETCORE_URLS=http://*:$PORT dotnet backendapp.dll

I wonder if I'm missing something 🤔

bricelam commented 3 years ago

@ivanmonteiro Can you try again on RC1? If it's still an issue, please submit a new issue.

anuraj commented 3 years ago

I am facing the same issue with 6.0.0-rtm.21519.8. Here is the log - error MSB3021: Unable to copy file "obj\Debug\net6.0\demo2.dll" to "bin\Debug\net6.0\demo2.dll". The process cannot access the file 'C:\demo2\bin\Debug\net6.0\demo2.dll' because it is being used by another process.

I don't have VS Code is running and C# extension is disabled in VS Code.

M2xAy commented 2 years ago

Hi Community We have also this error because it is being used by another process. on bundleing on ef core 6

Any updates ? Best regards from zurich

meronz commented 2 years ago

@ivanmonteiro

I wonder if I'm missing something thinking

RUN apk add --no-cache gcompat

Works for me.

ivanmonteiro commented 2 years ago

@ivanmonteiro

I wonder if I'm missing something thinking

RUN apk add --no-cache gcompat

Works for me.

I ended up using the alpine docker image in both build stage and the final stage of the Dockerfile and that solved the ./bundle: not found problem for me.

pinkfloydx33 commented 2 years ago

According to the documentation, it looks like you can pass the target migration name to the bundle. Can this be used to downgrade the migrations?

In other words, suppose we deploy our database and application in the same pipeline. We deploy the database successfully but the application deployment fails for whatever reason. Could we use the same bundle to undo the migrations (for ex. in an on-failure step) assuming we know the previous migration name? If yes, is there a way to find out what the most recent migration was so that we know what to roll back to (either using bundle or direct sql)?

sabbadino commented 2 years ago

why when i run "dotnet ef migrations bundle" , it tries to connect to the db using appsettings cnstring ? (and i get a connection error) .. I expected that this command would just bundle migrations . My azure devops build server has no access to any db

bricelam commented 2 years ago

@sabbadino It shouldn't connect to the database. Do you have something in your Program.CreateHostBuilder (or Main for minimal API projects) or Startup.ConfigureServices that connects to the database?

BenjaminAbt commented 2 years ago

Any hint (even if not supported today) how to use bundles with Azure DevOps and Managed Identity on Azure SQL? Username:Password in Connectionstring is meh.

Or fallback to ef migration script (or dacpac) and Azure SQL Deployment Task with ServicePrincipal, if we wanna have secure deployments?

ErikEJ commented 2 years ago

@BenjaminAbt There are various options: https://docs.microsoft.com/en-us/sql/connect/ado-net/sql/azure-active-directory-authentication?view=sql-server-ver15 - but not sure I understand your exact requirement - what does the managed identity represent?

pinkfloydx33 commented 2 years ago

You need to make sure your deployment agent VM has an MI/SP assigned to it. Then create a contained user in your azure sql db. Then for a connection string, you don't need user/password just Authentication=Active Directory Default. Nothing special other than that and it should just work. I do similar for both Postgres and Azure SQL

BenjaminAbt commented 2 years ago

Thanks for your replies.

Then for a connection string, you don't need user/password just Authentication=Active Directory Default

I am doing that. I was afraid that this would not work conceptually - but then there seems to be a bug in my setup. Thank you very much for the feedback.

ErikEJ commented 2 years ago

@BenjaminAbt for Active Directory Defsult make sure to use SqlClient 3.0 or later

sabbadino commented 2 years ago

@bricelam ,sure i have , it's my app that does to connect to the db, I really don't get the question ? Am i missing something ? Is this bundle supposed to generate the scripts to be executed at build time ? I expected the bundle would bring all migration in it and evaluate which ones to run at deploy time ..

pinkfloydx33 commented 2 years ago

Thanks for your replies.

Then for a connection string, you don't need user/password just Authentication=Active Directory Default

I am doing that. I was afraid that this would not work conceptually - but then there seems to be a bug in my setup. Thank you very much for the feedback.

If you are using Azure devops, you may also want to try running the Azure CLI task and using the service principal assigned to that, executing the bundle as the SP. Just make sure that SP has its own "contained user" in your db. That should also work. If you're using am older version of Ms.Data.SqlClient you may need to use "Active Directory Managed Identity" or one of the other flavors, though they require the appid for the user name

bricelam commented 2 years ago

@sabbadino bundles currently work the same way that dotnet ef database update does. It needs to build your application service provider in order to discover and instantiate the DbContext. It does this by calling Program.CreateHostBuilder. So every time you run the bundle, that method is executed.

sabbadino commented 2 years ago

So in opinion that is totally useless.. when i build i want to make a single bundle that will evaluate what updates to apply at deploy time

If i got it correctly, the way it works is 1) i should make a bundle for each db i want to target (cicd, test, prd) 2) build agent must reach any of the db of the different environments.

Not a real world situation

pinkfloydx33 commented 2 years ago

It builds an Exe that you pass a connection string to and it evaluates what migrations are missing and applies them. It can also rollback or only migrate to a specific migration (via commandline).

I build a bundle in CI and attach the single executable as an artifact which I then use during CD without having to build anything again. Not sure what you're talking about. They work as you expect them to

julielerman commented 2 years ago

Also, FWIW, the migration file is idempotent...it includes all the migrations and then based on that evaluation @pinkfloydx33 explained, executes what's needed

ivanmonteiro commented 2 years ago
  1. i should make a bundle for each db i want to target (cicd, test, prd)

No. Ideally you build the bundle only once. Then you can pass to the bundle.exe a connection string or it gets the connection string from the appsettings.{Environment}.json file.

  1. build agent must reach any of the db of the different environments.

Also, from my experience with the migrations bundle, when building the migrations bundle no connection to the database is needed in order to create the bundle.exe. Now, the bundle.exe needs to connect to the database in order to apply the migrations, of course. As mentioned by @julielerman you can generate a migration script (dotnet ef migrations script --idempotent) instead of the bundle if it suits your needs and just run that sql file in ci/cd or manually.

@sabbadino Here is the documentation on how to apply migrations: https://docs.microsoft.com/en-us/ef/core/managing-schemas/migrations/applying?tabs=dotnet-core-cli

sabbadino commented 2 years ago

Hi, i Will double check .. if the problem shows up actually, i will post the call stack

chinthakaherath commented 2 years ago

Is there any guide or document on how to run efbundle.exe in the azure release pipeline?

SirMrJames commented 2 years ago

Is there any guide or document on how to run efbundle.exe in the azure release pipeline?

I'm having the same issue, did you ever figure that out?

ErikEJ commented 2 years ago

Just run a script with the command line in your yaml pipeline:

- pwsh:  .\efbundle.exe --connection 'Data Source=(local)\MSSQLSERVER;Initial Catalog=Blogging;User ID=myUsername;Password=myPassword'
  failOnStderr:  true
SirMrJames commented 2 years ago

Just run a script with the command line in your yaml pipeline:

- pwsh:  .\efbundle.exe --connection 'Data Source=(local)\MSSQLSERVER;Initial Catalog=Blogging;User ID=myUsername;Password=myPassword'
  failOnStderr:  true

Perhaps I'm doing it wrong. I've built an artifact for the efbundle, and whenever I try to run it in a release pipeline using powershell nothing happens. No error, but no migrations are applied.

Interestingly, if I don't publish it as an artifact, and instead run it in the pipeline that I create the bundle it works fine. So I suppose the issue is with the artifact. But I'm not sure.

fredriksjoholm commented 2 years ago

Is there any way to run bundles in a distributed settings while preventing two bundles from applying changes to the database simultaneously? What I'm trying to do is to apply my migrations from an init container in a Kubernetes cluster. I've looked into using Kubernetes jobs in order to ensure that each bundle only runs once, but this approach seems to have its own drawbacks as well.

roji commented 2 years ago

@fredriksjoholm preventing concurrent migrations isn't something that EF does - you'll have to take care of it outside, where you're applying the migrations (e.g. in Kubernetes). Note that in general, applying migrations is seen as part of deploying a new version of your application; this is already a process that should happen once and in a controlled fashion (you don't want to have two different application deployments racing concurrently!).

For context, EF could try to make sure that migrations aren't executed concurrently by e.g. taking a lock on the database's migrations history table. The main problem with that is that migrations may involve multiple transactions (e.g. since SQL Server doesn't support various DDL statements in migrations), and so the lock would be released in the middle of the transaction. But once again, even regardless of the technical difficulties, this is something that should be handled at a higher level, in your deployment pipeline.

phillebaba commented 2 years ago

Maybe I can add some more context to the question. How would one solve this type of migration when applications are deployed with a tool like FluxCD or ArgoCD. In these situations there is not "CD Pipeline" to speak of that could run the DB migration before the actual applications are deployed. Instead new manifests are applied to the cluster as new commits are reconciled. In these situation the only real option is to run the migration inside of the cluster.

Even in a situation where a migration was done in a CD pipeline there are no guarantees that the CD pipeline is going to run once. A couple of providers will allow triggering of multiple pipelines for the same commit, and most others will default to running the same pipeline in parallel for different commits. So it seems that the risk concurrency risk is present no mater where it is run.

My guess is that there are a lot of EF users out there who are in a similar situation, deploying their .NET applications with a GitOps tool into a Kubernetes cluster. Right now if you google "kuberentes entity framework" the first result you get is this blog post. I would think that it would be preferable to maybe right an official guide for how one would solve this instead of hoping that the first result gives the correct answer. I would be happy to contribute with my Flux knowledge if there is interest in writing such a guide.

roji commented 2 years ago

Even in a situation where a migration was done in a CD pipeline there are no guarantees that the CD pipeline is going to run once. A couple of providers will allow triggering of multiple pipelines for the same commit, and most others will default to running the same pipeline in parallel for different commits. So it seems that the risk concurrency risk is present no mater where it is run.

I may be misunderstanding you, but if multiple deployments really are executing concurrently, wouldn't that mean that e.g. an actual production server might end up with mixed application files from two versions, as files are being copied from both onto it? It's hard for me to see how this could be workable: the actual act of copying the files to production - and executing migrations - seems like it must be protected by some sort of concurrency control seems necessary at the deployment stage.

But maybe I'm misunderstanding what you're describing - if so, more detail/context could help.

[...] I would think that it would be preferable to maybe right an official guide for how one would solve this instead of hoping that the first result gives the correct answer. [...]

I don't personally think the official EF documentation needs to address all possible scenarios with all possible technologies - there are many cluster and deployment technologies out there, and practically speaking we couldn't cover them all (nor would we be quailfied to do so). I don't see the problem with external documentation and blogs addressing scenarios such as this, especially since there's lots of diversity in exactly how people configure their clusters and deployment...

phillebaba commented 2 years ago

@roji I guess its difficult to explain without digging too deep into Kubernetes. I do agree with you that attempting to document every single deployment solution is not feasible. The issue is that the recommended solution would not work at all for other deployment solutions. Would a compromise be to collect a set of resources to point end users to known working solutions for different platforms?

OskarKlintrot commented 2 years ago

@phillebaba I don't understand how this has anything to do with EF really? Isn't it down to each CI/CD-solution to document? There're more tools than just EF bundle that you only want to run once pre- or post-deployment so it would make more sense if the CI/CD-provider documents how to do it rather than each individual tool. As an example, if I deploy db-changes (which there is a plethora of tools to choose from, DbUp is quite popular) I would read in the Azure DevOps or GitHub Actions (my providers) docs to understand how I would make sure that if one deployment is already running then the next in line waits for that one to finish or get cancelled before the next deployment runs. I don't see why I would look in the EF documentation for how to use GitHub Actions (as an example)?

phillebaba commented 2 years ago

@OskarKlintrot I think that the issue is that people expect problems such as migration concurrency to be solved in a project, without having to rely on other solutions for this. Additionally the documentation actually takes some sort of opinion, but only to the degree of "here is the keys to the car, but I am not telling you how to drive it".

I guess this is the documentation that states how things should be done with CI.

Bundles can be generated as part of your CI process and easily executed later as part of your deployment process.

So if you actually read my initial question instead of just repeating the same response, you would realize that it is a lot more nuanced than just saying solve it with Azure DevOps Pipelines or GitHub Actions. The question is what do you do when you are deploying in a context where these types of pipelines are not easily available. Here the answer seems to be not EFs problem, which is fine but I think it will affect adoption of EF. Then again I am not really a .NET developer so how would I know. All I know is that I get a lot of questions about how this should be solved with EF with no real good answer out there other than that some other tool should solve this.

roji commented 2 years ago

@phillebaba what kind of answer or capability do you think EF could provide to help with this? At the end of the day, EF can only produce a migration bundle or SQL script, but it's entirely up to the user's environment how those will get applied/executed, and from where. Some very concrete suggestions would help understand what you have in mind.

phillebaba commented 2 years ago

Honestly adding a DB lock before running the migration would solve these issues. Technically this could be done by developer beforehand but I do not think that it should be the developers responsibility

There are tools like Liquibase that already do this and work without any issue with init containers in Kubernetes. https://www.liquibase.com/blog/using-liquibase-in-kubernetes

roji commented 2 years ago

@phillebaba I addressed the issues with that here.

ElanHasson commented 2 years ago

Is there a way to make this work with dotnet pack?

roji commented 2 years ago

@ElanHasson can you provide more context on what you want to make work with dotnet pack?

ElanHasson commented 2 years ago

@ElanHasson can you provide more context on what you want to make work with dotnet pack?

I wrote pack, but I was thinking it should have been publish.

I was curious if dotnet publish could be used to run the bundler instead of the normal behavior.

Just trying to avoid modifying build scripts :)

ErikEJ commented 2 years ago

"Normal behaviour" IS dotnet publish 😄 https://github.com/ErikEJ/EFCorePowerTools/issues/1078#issuecomment-962194993 @ElanHasson