NikoNeugebauer / CISL

Columnstore Indexes Scripts Library
Apache License 2.0
179 stars 54 forks

Error in the dbo.cstore_SuggestedTables #14

Open ZedZipDev opened 7 years ago

ZedZipDev commented 7 years ago

I run the procedure dbo.cstore_SuggestedTables on my big database:

Msg 2627, Level 14, State 1, Procedure cstore_SuggestedTables, Line 70 [Batch Start Line 0] Violation of PRIMARY KEY constraint 'PK#TablesT9A619291537632C9'. Cannot insert duplicate key in object 'dbo.#TablesToColumnstore'. The duplicate key value is (215723871). The statement has been terminated.

ktaranov commented 7 years ago

Hi, Oleg. Could you give more information:

  1. SELECT @@VERSION from your SQL Server
  2. List of table names
ZedZipDev commented 7 years ago

Hey, it is:

  1. SELECT @@VERSION Microsoft SQL Server 2016 (SP1-CU1) (KB3208177) - 13.0.4411.0 (X64) Jan 6 2017 14:24:37 Copyright (c) Microsoft Corporation Developer Edition (64-bit) on Windows Server 2012 Standard 6.2 (Build 9200: )

  2. I'll provide the list, but how can the list of table names help? The thing is: I ran the proc on another database with the same structure but a smaller size, and it worked successfully; it suggested 4 tables.

ktaranov commented 7 years ago
  1. If you have a Case Sensitive instance, then Table1 and table1 are different names. Also, you can have the same table names in different schemas.

Please run the SELECT query from line 155, after the insert into #TablesToColumnstore, and try to find the duplicated ObjectId.

Also, you could remove the PRIMARY KEY option from the ObjectId column in the create table #TablesToColumnstore statement at line 123.
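To surface the offending rows directly, the same catalog join the procedure uses can be grouped by object_id to show which tables would produce more than one candidate row. This is a hedged sketch, not the procedure's exact query; the column list is illustrative and may differ from your CISL version:

```sql
-- Hedged sketch: list tables whose sys.partitions / sys.indexes join
-- produces more than one candidate row, i.e. the object_ids that would
-- violate the PRIMARY KEY on #TablesToColumnstore.
select t.object_id,
       quotename(object_schema_name(t.object_id)) + '.' +
           quotename(object_name(t.object_id)) as TableName,
       count(distinct ind.data_space_id)       as DistinctDataSpaces
    from sys.tables t
        inner join sys.partitions p
            on t.object_id = p.object_id
        inner join sys.indexes ind
            on ind.object_id = p.object_id and ind.index_id = p.index_id
    group by t.object_id
    having count(distinct ind.data_space_id) > 1;
```

Any table returned here has indexes in more than one data space, which is exactly the condition that later turned out to duplicate rows in the temp table.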

ZedZipDev commented 7 years ago

I have found the reason: for one of the tables the query returns 3 rows with different values in the 'Size in GB' field. But why is that?

ObjectId   TableLocation  TableName      Row Count  Min RowGroups  Size in GB  Cols Count  String Cols  Sum Length

215723871  Disk-Based     [dbo].[nodes]  160271556  154            24.694      10          0            332
215723871  Disk-Based     [dbo].[nodes]  160271556  154             9.547      10          0            332
215723871  Disk-Based     [dbo].[nodes]  160271556  154             8.060      10          0            332

NikoNeugebauer commented 7 years ago

Oleg, I will take a look at this in the next couple of days and will post the questions/answers. Thank you for your patience.

ktaranov commented 7 years ago

Oleg, could you provide a create table script for reproducing this problem? Can you reproduce this issue on another instance?

ktaranov commented 7 years ago

@Oleg26Dev Did you solve your problem? If yes, could you close this issue?

NikoNeugebauer commented 7 years ago

I have had more reports of this issue and am planning to work on it over the next few days.

NikoNeugebauer commented 7 years ago

Unfortunately I am still unable to fully reproduce the issue... No one has been able to share their schema so far.

ZedZipDev commented 7 years ago

Sorry for the delay.

I have investigated the problem. This is a shorter script that reproduces the problem in my DB: I minimized the script, and it gives me 3 records for the one table. Then I excluded sys.indexes, and the script returns only one record, which is correct.

--inner join sys.indexes ind
--  on ind.object_id = p.object_id and ind.index_id = p.index_id
select t.object_id as [ObjectId]
        --, case ind.data_space_id when 0 then 'In-Memory' else 'Disk-Based' end 
        , quotename(object_schema_name(t.object_id)) + '.' + quotename(object_name(t.object_id)) as 'TableName'
        , replace(object_name(t.object_id),' ', '') as 'ShortTableName'
        --, isnull(max(p.rows),0) as 'Row Count'
        --, ceiling(max(p.rows)/1045678.) as 'Min RowGroups' 
        --, isnull(cast( sum(memory_allocated_for_table_kb) / 1024. / 1024 as decimal(16,3) ),0) + cast( sum(a.total_pages) * 8.0 / 1024. / 1024 as decimal(16,3))  as 'size in GB' 
        from sys.tables t
            inner join sys.partitions as p 
                ON t.object_id = p.object_id
            inner join sys.allocation_units as a 
                ON p.partition_id = a.container_id
            -- !!!!!!!! --- inner join sys.indexes ind
            -- !!!!!!!! --- on ind.object_id = p.object_id and ind.index_id = p.index_id
            left join sys.dm_db_xtp_table_memory_stats xtpMem
                on xtpMem.object_id = t.object_id
        where p.data_compression in (0,1,2) -- None, Row, Page
             and (select count(*)
                    from sys.indexes ind
                    where t.object_id = ind.object_id
                        and ind.type in (5,6) ) = 0    -- Filtering out tables with existing Columnstore Indexes
             and (@tableName is null or object_name (t.object_id) =@tableName)
        group by t.object_id, 
        --ind.data_space_id, 
        t.is_tracked_by_cdc, t.is_memory_optimized, t.is_filetable, t.is_replicated, t.filestream_data_space_id

These are the rows from sys.indexes for this table; it seems they are the reason for the error in the procedure: select * from sys.indexes where object_id = 215723871

215723871  PK_nodes            1  1  CLUSTERED     1  3  0  1
215723871  IX_nodes_parent_id  3  2  NONCLUSTERED  0  7  0  0
215723871  IX_nodes_id_type    7  2  NONCLUSTERED  1  4  0  0
215723871  IX_nodes            9  2  NONCLUSTERED  0  4  0  0

NikoNeugebauer commented 7 years ago

Hi Oleg,

Thank you very much for the details. If you can, please share here or send me via email the T-SQL creation script for the table and indexes. Even with the hypothetical indexes I was not able to re-create the scenario. Are there any exotic settings on this server? I really need more info to understand the problem.

Best regards, Niko

ZedZipDev commented 7 years ago

Hi Niko, there are no exotic settings on the server. Even more: we provide the database creation script to customers, and they can use it on any of their servers, from SQL 2008 to SQL 2017.

ZedZipDev commented 7 years ago

Hi Niko, here is the script for this table.

Ah! I have detected one duplicated index; maybe that is the reason? (in email)

Regards, Oleg.


NikoNeugebauer commented 7 years ago

Hi Oleg,

I did not receive any emails; can you please try sending it again? Thank you!

Best regards, Niko Neugebauer

ZedZipDev commented 7 years ago

Hi Niko, I've sent the email with the script. Regards, Oleg.

NikoNeugebauer commented 7 years ago

Hi Oleg,

I tried the script with different additional indexes, but it works just fine. Can you reproduce the error that occurs on this database anywhere else?

Best regards, Niko

suggy1982 commented 6 years ago

I also got this error when running cstore_SuggestedTables.

The select into #TablesToColumnstore returns 2 rows for some tables with the same partition number but different row counts.

I traced this to the sys.partitions view, which contains 2 rows for the same object_id and partition_number. The tables that return 2 rows have both a clustered and a non-clustered index; could that be the reason?
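For context: sys.partitions carries one row per partition of every heap or index, so any table with both a clustered and a non-clustered index legitimately shows several rows for the same (object_id, partition_number). A hedged diagnostic sketch to see this on your own database:

```sql
-- Count sys.partitions rows per (object_id, partition_number); values
-- above 1 simply mean the table has more than one index, each index
-- contributing its own partition rows (and its own row counts).
select p.object_id,
       object_name(p.object_id) as TableName,
       p.partition_number,
       count(*) as RowsInSysPartitions
    from sys.partitions p
    group by p.object_id, p.partition_number
    having count(*) > 1
    order by RowsInSysPartitions desc;
```

So the duplication is expected catalog behavior; the bug is that the procedure's join and grouping do not collapse those per-index rows back to one row per table.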

NikoNeugebauer commented 6 years ago

Need to investigate this ... Need to find some time...

aamazur commented 5 years ago

Hit a similar (maybe the same) issue (Cannot insert duplicate key in object 'dbo.#TablesToColumnstore'). In my case the issue happened because I had indexes for some of the tables on different filegroups. It seems grouping by ind.data_space_id is wrong, as it can introduce duplicates. As a quick workaround, I just replaced "ind.data_space_id" in the GROUP BY clause with "case ind.data_space_id when 0 then 'In-Memory' else 'Disk-Based' end".
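In outline, the workaround changes only the grouping expression. This is a sketch of the relevant fragment, not the full procedure; the trailing column list is illustrative:

```sql
-- Before: one group per data space (filegroup), which can split a
-- single table into several rows and break the PRIMARY KEY on ObjectId:
--     group by t.object_id, ind.data_space_id, ...
-- After (the workaround): collapse every disk-based data space into one
-- label, so each table yields exactly one row:
    group by t.object_id,
             case ind.data_space_id when 0 then 'In-Memory'
                                    else 'Disk-Based' end,
             t.is_tracked_by_cdc, t.is_memory_optimized
```

Since the SELECT list only ever distinguishes 'In-Memory' from 'Disk-Based', grouping by that same CASE expression preserves the output while removing the per-filegroup duplicates.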

SQLWerner commented 5 years ago

Hey, I bumped into exactly the same issue... I think the proposed solution is watertight, and I'll create a pull request. @NikoNeugebauer: if you like, I can send you the schema of my database.