MauroDataMapper-Plugins / mdm-plugin-database-sqlserver

Apache License 2.0

Summary metadata - don't calculate when not enough distinct values #26

Closed by jamesrwelch 1 year ago

jamesrwelch commented 1 year ago

The summary metadata calculation includes some logic that skips generating summary metadata when there are too few distinct rows. This doesn't always appear to work: see, for example, this result from the NorthWind database with default settings:

[Screenshot 2023-08-08 at 10 24 17]

joe-crawford commented 1 year ago

In branch feature/fix-enums-sm of mdm-plugin-database, a new parameter, Summary Metadata Minimum Threshold, has been added (summaryMetadataMinimumValue, default 10).

Summary metadata values less than summaryMetadataMinimumValue are rounded up to that value. If the (possibly approximate) number of rows in the whole table is less than summaryMetadataMinimumValue, summary metadata is not generated for that table.
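
For anyone reading along, here is a rough sketch of the behaviour described above. It is not the plugin's actual code (the plugin is not written in Python); the function name apply_minimum_threshold and its arguments are made up for illustration, and only the parameter name summaryMetadataMinimumValue and its default of 10 come from this thread. It also assumes that a count of zero is rounded up like any other value below the threshold, which the thread does not confirm.

```python
from typing import Dict, Optional

# Mirrors the default of summaryMetadataMinimumValue mentioned in this thread.
SUMMARY_METADATA_MINIMUM_VALUE = 10


def apply_minimum_threshold(
    value_counts: Dict[str, int],
    table_row_count: int,
    minimum_value: int = SUMMARY_METADATA_MINIMUM_VALUE,
) -> Optional[Dict[str, int]]:
    """Hypothetical helper: return thresholded counts, or None for small tables."""
    # If the (possibly approximate) row count of the whole table is below the
    # threshold, no summary metadata is generated at all.
    if table_row_count < minimum_value:
        return None
    # Otherwise, any count below the threshold is rounded up to the threshold.
    # Assumption: this includes counts of zero.
    return {
        value: max(count, minimum_value)
        for value, count in value_counts.items()
    }
```

With the default threshold, apply_minimum_threshold({"UK": 3, "USA": 42}, 45) would give {"UK": 10, "USA": 42}, while a table of only 3 rows would give None, i.e. no summary metadata.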

[Screenshot 2023-10-09 at 15 57 28]