microsoft / DacFx

SQL Server database schema validation, deployment, and upgrade runtime. Enables declarative database development and database portability across SQL Server versions and environments.
https://aka.ms/sqlpackage-ref
MIT License

SqlPackage shows a discrepancy on data compression when a new partition is created #232

Open dconnolly131 opened 1 year ago

dconnolly131 commented 1 year ago

Steps to Reproduce: Full steps and example solution can be found in my repo https://github.com/DotdigitalDBA/sqlPackageTickets/tree/main/PartitionCompressionIssue

Basic repro steps are below, but check the repo above for the detailed steps.

  1. A partitioned table is in source control using the DATA_COMPRESSION = PAGE syntax (meaning page compression on all partitions); see the sketch after this list.
  2. Time moves on and an outside process creates extra partitions.
  3. SSDT shows data compression on every partition in source control and flags the extra partition as a discrepancy on the table.
  4. SSDT will now constantly see the data compression on a new partition as a discrepancy.
  5. It's not feasible to change source control every time a new partition is created, as this is fluid and automated at different times for different tables.
  6. This causes SqlPackage to do whole-table rebuilds to "fix" things when nothing is actually wrong. This can block up our app as it tries to do this on 4 TB tables.
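
For reference, a minimal sketch of the pattern in the steps above (object and column names match the repro repo; the INT type, RANGE LEFT direction, and boundary values are assumptions):

-- Partition function and scheme (INT type and RANGE LEFT assumed).
CREATE PARTITION FUNCTION pf_PartitionCompressionIssue (INT)
    AS RANGE LEFT FOR VALUES (1, 10, 100);

CREATE PARTITION SCHEME ps_PartitionCompressionIssue
    AS PARTITION pf_PartitionCompressionIssue ALL TO ([PRIMARY]);

CREATE TABLE dbo.PartitionCompressionIssue
(
    UserAccountID INT NOT NULL
);

-- A single DATA_COMPRESSION = PAGE clause covers every partition,
-- current and future.
CREATE CLUSTERED INDEX CIX_PartitionCompressionIssue
    ON dbo.PartitionCompressionIssue (UserAccountID)
    WITH (DATA_COMPRESSION = PAGE)
    ON ps_PartitionCompressionIssue (UserAccountID);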

Did this occur in prior versions? If not - which version(s) did it work in?

(DacFx/SqlPackage/SSMS/Azure Data Studio)

zijchen commented 1 year ago

Are the partitions added in the database project? If you rebuild the project it should resolve the relationships between the partitions and the data compression option.

llali commented 1 year ago

Hi @dconnolly131, when step 3 happens, does that mean this script needs to change? CREATE PARTITION FUNCTION pf_PartitionCompressionIssue AS RANGE FOR VALUES (1, 10, 100)

I understand that you have concerns about changing the source every time that happens, but doesn't this cause a schema mismatch between source and target?

dconnolly131 commented 1 year ago

@llali So the Partition Function would need to change to resolve the discrepancy, yes. But we have the option to ignore Partition Functions/Schemes, which we do at our company: we have partitioned tables that create a new partition for each day, so we would have to manually update source control once a day if we didn't ignore them.
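
In SqlPackage terms, that exclusion looks roughly like this (the dacpac path, server, and database names are placeholders):

sqlpackage /Action:Publish /SourceFile:"MyDb.dacpac" /TargetServerName:"myserver" /TargetDatabaseName:"MyDb" /p:ExcludeObjectTypes="PartitionFunctions;PartitionSchemes"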

The issue here is the DATA_COMPRESSION=PAGE functionality. If I define just:

WITH (DATA_COMPRESSION = PAGE) ON [ps_PartitionCompressionIssue] ([UserAccountID])

then I'm telling SQL Server that I want page compression on all my partitions.

SqlPackage then, for some reason, splits this out into a totally different syntax using the partition scheme/function when it doesn't need to:

WITH (DATA_COMPRESSION = PAGE ON PARTITIONS (1), DATA_COMPRESSION = PAGE ON PARTITIONS (2), DATA_COMPRESSION = PAGE ON PARTITIONS (3), DATA_COMPRESSION = PAGE ON PARTITIONS (4)) ON [ps_PartitionCompressionIssue] ([UserAccountID])
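
For what it's worth, SQL Server itself tracks compression per partition, and a catalog query along these lines (table name taken from the repro) shows the same PAGE setting on every row under either syntax:

SELECT p.partition_number,
       p.data_compression_desc
FROM   sys.partitions AS p
WHERE  p.object_id = OBJECT_ID(N'dbo.PartitionCompressionIssue')
  AND  p.index_id IN (0, 1);  -- heap or clustered index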

This means I can ignore Partition Functions/Schemes all I want, but I can't do anything about this data compression, which means we can't have CI/CD at our company, because SqlPackage, trying to "resolve" the difference, wants to rebuild our 4 TB tables.

tony-donley commented 1 year ago

I have struggled with the same issue and have never found an acceptable workaround. I would like to know Microsoft's recommended solution for using page-level compression with dynamically generated partitions.

llali commented 1 year ago

@dconnolly131 thanks for the info. I'll investigate this more to find an acceptable solution.

dconnolly131 commented 1 year ago

Hi @llali, was there any further update on this? Thanks

ssreerama commented 1 year ago

Hey @dconnolly131, sorry for the delay.

Below are my findings and questions:

  1. I would like to know how the outside process adds new partition ranges.
  2. When you redeploy without ignoring the partition function (pf)/scheme (ps), are you OK with losing partitions on the tables, since the source has fewer partition ranges than the target?
  3. Since the source is missing the newly added partition range while the target has it, the difference causes validation of the partition scheme dependencies (such as constraints/indexes) and a rebuild of the table. The outside process alters the partitions on the target; if you can apply the same update to the source, that would resolve the table-rebuilding issue and give you the expected partitions.
  4. There is no issue with Data Compression. Even without compression, the table is rebuilt for data motion to update the partitions on the table. Both syntaxes are valid and equivalent: WITH (DATA_COMPRESSION = PAGE) and WITH (DATA_COMPRESSION = PAGE ON PARTITIONS (1), DATA_COMPRESSION = PAGE ON PARTITIONS (2), DATA_COMPRESSION = PAGE ON PARTITIONS (3), DATA_COMPRESSION = PAGE ON PARTITIONS (4)).
  5. By design, we perform table data motion to consolidate all potential unbinding operations when the partition scheme is being unbound.

Thanks

ssreerama commented 1 year ago

Hi @dconnolly131, do you have any update on the above ask? We really want to know what stops you from updating the source project with the target database's partition scheme and function changes. Thanks.

sqltechhead commented 11 months ago

We have a SQL agent job that runs once a day, checks all of our partitioned tables, and creates new partitions where needed.
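
The core of that job is roughly the following (function/scheme names and the boundary logic are illustrative, not our actual code):

-- Add tomorrow's partition to a hypothetical date-based function/scheme.
DECLARE @NextBoundary DATE = DATEADD(DAY, 1, CAST(GETDATE() AS DATE));

-- Tell the scheme which filegroup the next partition will use...
ALTER PARTITION SCHEME ps_DailyPartition NEXT USED [PRIMARY];

-- ...then split a new boundary off the current rightmost partition.
ALTER PARTITION FUNCTION pf_DailyPartition ()
    SPLIT RANGE (@NextBoundary);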

So it's not feasible to be adding a partition to the database project once a day for tables that are partitioned daily. That would be a full-time job for someone.

So while I agree that those two data compression syntaxes are the same, the fact that it's split out per partition in source control is the entire issue causing the discrepancy; if it were stored as applying to all partitions, it would be fine.

If I said I had 1,000 tables that I partition with a daily partition function, would you recommend I update source control with the new partitions daily for each table?

tony-donley commented 11 months ago

I've also struggled with this problem and have never found a good solution.

It would be nice to have a set of publish profile flags that are granular enough to solve this use case rather than just ignoring index properties across the board.

robnsilver commented 1 week ago

@ssreerama : This issue will impact anyone who implements rolling windows with partitioning in an automated fashion and pretty much blocks them from fully automating deployments. Would it be possible to add a flag to help us work around this and ignore these schema differences?
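
The other half of a typical rolling window, equally invisible to the project, drops the oldest partition; a hedged sketch with illustrative names:

-- Switch the oldest partition out to an archive table (which must match
-- the schema and sit on the same filegroup), then merge away its boundary.
ALTER TABLE dbo.DailyFacts
    SWITCH PARTITION 1 TO dbo.DailyFacts_Archive;

ALTER PARTITION FUNCTION pf_DailyPartition ()
    MERGE RANGE ('2024-01-01');  -- the oldest boundary value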

dconnolly131 commented 1 week ago

Hi @ssreerama, do we have an update on this issue? As @robnsilver says, this is a large issue for anybody who uses rolling windows for partitions and uses SSDT.