cynkra / dm

Working with relational data models in R
https://dm.cynkra.com
Other
499 stars 50 forks source link

Feature Request: Learn keys from columnstore index metadata #1752

Open carlganz opened 1 year ago

carlganz commented 1 year ago

In SQL Server 2014 they added columnstore indexes, which improves performance for datawarehouses, but doesn't allow foreign keys as a tradeoff (see discusssion: https://dba.stackexchange.com/questions/78694/clustered-columnstore-indexes-and-foreign-keys)

I would think the same relational information the keys contain would also be available in the index metadata using a query like:

SELECT 
     TableName = t.name,
     IndexName = ind.name,
     IndexId = ind.index_id,
     ColumnId = ic.index_column_id,
     ColumnName = col.name,
     ind.*,
     ic.*,
     col.* 
FROM 
     sys.indexes ind 
INNER JOIN 
     sys.index_columns ic ON  ind.object_id = ic.object_id and ind.index_id = ic.index_id 
INNER JOIN 
     sys.columns col ON ic.object_id = col.object_id and ic.column_id = col.column_id 
INNER JOIN 
     sys.tables t ON ind.object_id = t.object_id 
ORDER BY 
     t.name, ind.name, ind.index_id, ic.is_included_column, ic.key_ordinal;

I haven't dug too deep into the source code for dm, but does adding support for columnstore indexes as an alternative to keys seem feasible?

krlmlr commented 1 year ago

Thanks. Was this restriction lifted in SQL Server 2016? I'd rather avoid working around a limitation in a version that will reach end of life soon-ish.

https://learn.microsoft.com/en-us/sql/relational-databases/indexes/columnstore-indexes-data-warehouse?view=sql-server-ver16

I also don't understand how columnstore indexes are an alternative to keys.

carlganz commented 1 year ago

We are using SQL Server 2017, and cannot add foreign keys because of the columnstore indexes. I was also perplexed by this, but maybe there is some way to extract the same relationships with the columnstore indexes (since presumably the SQL Server engine uses the columnstore to figure out these relationships or joins would be slower)

krlmlr commented 1 year ago

Thanks. If we can extract reliable information from the database, we can use it. I suspect that in this case this information is simply not available (joins can be made without the presence of FK constraints, it's the database's job to figure out the fastest way to do it). I'd consider a reproducible example that shows how to extract that info.

Alternatively, you can always define a function for your dm object that adds whatever keys are necessary.