microsoft / DacFx

DacFx, SqlPackage, and other SQL development libraries enable declarative database development and database portability across SQL versions and environments. Share feedback here on dacpacs, bacpacs, and SQL projects.
https://aka.ms/sqlpackage-ref
MIT License
307 stars 18 forks source link

Changing Primary Key Constraint Name Causes Table Rebuild #295

Open asrichesson opened 1 year ago

asrichesson commented 1 year ago

Steps to Reproduce:

  1. Create a new SSDT (sqlproj) project in visual studio.
  2. Add a single table sql file like so:
    CREATE TABLE [dbo].MyTable(
    Id [BIGINT] IDENTITY(1,1) NOT NULL
    ) 
    GO
    ALTER TABLE dbo.MyTable ADD CONSTRAINT PK1 PRIMARY KEY CLUSTERED (Id)
    --ALTER TABLE dbo.MyTable ADD CONSTRAINT PK2 PRIMARY KEY CLUSTERED (Id)
  3. Build and deploy the dacpac to a server using sqlpackage or vs.
  4. Uncomment the PK2 constraint and comment out the PK1 constraint. You effectively changed the name of the primary key constraint.
  5. Build and deploy the dacpac to a server using sqlpackage or vs.
  6. Expected: the constraint PK1 is dropped and the constraint PK2 is created. Actual: The table is rebuilt with the message Starting rebuilding table [dbo].[MyTable]...

Table rebuilds are expensive and unwanted, especially on large tables with millions of rows. For a small change such as changing table constraints without changing any column info a table rebuild is unnecessary. It took me a while on a large table to figure out what was causing the table to be rebuilt when this was the culprit.

Attached is the project file to help reproduce. Demo.zip

(DacFx/SqlPackage/SSMS/Azure Data Studio)

Mnior commented 1 year ago

Why is this issue labeled as a bug? Improvement?

Using rename ({Project}.refactorlog file) does not create this overhead. Of course, it would be better if even at the final stage there were optimization heuristics, but the whole DacFx setting algorithm is already very simple and predictable.

Neither Git nor Visual Studio (Code) can track renames, but is it worth asking DacFx to do that?

There are dozens of similar optimizations, such as changing columns, when DacFx, for any inconsistency, makes the creation of a new table with data copying.

asrichesson commented 1 year ago

I'm not expecting sqlpackage to rename. I agree that sqlpackage should not be able to understand renames/refactors without the refactor log. What I am expecting is that sqlpackage can drop a constraint that doesn't exist in the dacpac and create a new constraint that exists only on the dacpac. The following example demonstrates this working already for unique constraints. I don't know why it's not working with primary key constraints (which is my main issue).

CREATE TABLE [dbo].MyTable(
    Id [BIGINT] IDENTITY(1,1) NOT NULL
) 
GO

GO
ALTER TABLE dbo.MyTable add CONSTRAINT unique1 Unique (Id)
--ALTER TABLE dbo.MyTable add CONSTRAINT unique2 Unique (Id)
GO

Deploy this table and then comment out the unique1 constraint and uncomment the unique2 constraint. This produces the proper result of dropping the old constraint and creating the new one.

Dropping Unique Constraint [dbo].[unique1]...
Creating Unique Constraint [dbo].[unique2]...
Update complete.
namangupta211 commented 10 months ago

Hi, this is happening by design, and there are multiple cases in which table recreation may happens. Though we cannot stop the table from rebuilding in such cases, we have created a new publish property “/p:AllowTableRecreation”, where

closing the issue now, Thanks

dzsquared commented 10 months ago

(reopening the issue because while we have added the property that enables users to protect their operations from table recreation, it is a reasonable request that we opt for lighterweight operations when they're available)

namangupta211 commented 10 months ago

[like] Naman Gupta reacted to your message:


From: Drew Skwiers-Koballa @.> Sent: Wednesday, October 25, 2023 7:30:12 PM To: microsoft/DacFx @.> Cc: Comment @.>; Assign @.>; State change @.***> Subject: Re: [microsoft/DacFx] Changing Primary Key Constraint Name Causes Table Rebuild (Issue #295)

(reopening the issue because while we have added the property that enables users to protect their operations from table recreation, it is a reasonable request that we opt for lighterweight operations when they're available)

— Reply to this email directly, view it on GitHubhttps://github.com/microsoft/DacFx/issues/295#issuecomment-1779921992 or unsubscribehttps://github.com/notifications/unsubscribe-auth/A2XYWH5OGKXIUOV6VYFTBHDYBFSEJBFKMF2HI4TJMJ2XIZLTSWBKK5TBNR2WLJDUOJ2WLJDOMFWWLO3UNBZGKYLEL5YGC4TUNFRWS4DBNZ2F6YLDORUXM2LUPGBKK5TBNR2WLJDUOJ2WLJDOMFWWLLTXMF2GG2C7MFRXI2LWNF2HTAVFOZQWY5LFUVUXG43VMWSG4YLNMWVXI2DSMVQWIX3UPFYGLAVFOZQWY5LFVIZDCOBVGMYDOMZUGWSG4YLNMWUWQYLTL5WGCYTFNSBKK5TBNR2WLKRTGQ2DONJWGUYTSOFENZQW2ZNJNBQXGX3MMFRGK3FMON2WE2TFMN2F65DZOBS2YSLTON2WKQ3PNVWWK3TUUZ2G64DJMNZZJAVEOR4XAZNKOJSXA33TNF2G64TZUV3GC3DVMWUTENZXGYYDSOJTGGBKI5DZOBS2K2LTON2WLJLWMFWHKZNKGE3TQNBYGQ4TANZRQKSHI6LQMWSWYYLCMVWKK5TBNR2WLKRSGE4DKMZQG4ZTINMCUR2HS4DFUVWGCYTFNSSXMYLMOVS2UMZUGQ3TKNRVGE4TRJ3UOJUWOZ3FOKTGG4TFMF2GK. You are receiving this email because you commented on the thread.

Triage notifications on the go with GitHub Mobile for iOShttps://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Androidhttps://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

IVNSTN commented 10 months ago

Expected: the constraint PK1 is dropped and the constraint PK2 is created.

I'd say that sp_rename would be enough if no other options in constraint definition were changed.

We face the same issue, and we have many renames in a huge old project where we are trying to apply coding standards and naming patterns. As a workaround we use RefactorLog feature which is greate but in an actively developed project leads to terrible amount of Git conflicts. If renaming could be done with less effort it would be great.

Mnior commented 9 months ago

we use RefactorLog feature which is greate but in an actively developed project leads to terrible amount of Git conflicts It’s not very clear what exactly the problem is, looking through the prism of the original question?

@dzsquared Judging by the responses from the topicstarter, this is beyond the scope of the current topic:

OK. Let's say we need to create a new issue as an improvement. But who should we ask? DacFx or VisualStudio?

DacFx Should we completely abandon the external refactorlog and have an internal very complex mechanism for comparing schemas not by name, but by structure? Because there will be conflicts. I can’t come up with something really completely incomprehensible right away, because... For me, refactorlog itself is a treasure trove of surprises. This is no longer optimization, but a complete change of concept.

The problem is that changing the structure is a very common task. It is not limited to changing the set of indexes. You can join or split tables, but there are no universal solutions for this. And to change the name of the key/index, refactorlog is the most correct way. This is a coding culture problem.

VisualStudio It is not clear how it will track changes in the structure? It essentially defines only the current structure, and the previous one is a vague concept. I added the K1 key, removed the key, and added K2 with a different name. Should this be considered as changing the old one or immediately adding a new K2? Save only if you have done build ? What if these builds didn’t go anywhere? The change search mechanism must be flexible, even know something like naming product version tags in git, so that you can compare versions and find all renamings.

And still, any of these mechanisms do not solve the main problem; the chain of changes A -> B -> C -> D -> E is not always equal to effective direct changes: A -> E, A -> D, A -> C. Moreover, structure changes A -> E themselves may have different options for changing data with identical structures.

And in determining how to optimize (manually), you need to know between what states there may be changes. Wherein:

One more side. If the developer does not attach importance to this, then he does not know that a more optimal work exists. If you attach importance to details, then you use all the available mechanisms ("refactorlog") and the problem seems to be "does not exist". It’s another matter when there is a problem and there are no implementations of its solutions. The good is the enemy of the best.

Therefore, I do not believe that we will be able to reach the hearts of developers.

PS: Sorry for this semblance of English.