dcomtois / summarytools

R Package to Quickly and Neatly Summarize Data

suggestion: identify primary key of dataframe #40

Closed (paulfeitsma closed 5 years ago)

paulfeitsma commented 6 years ago

In the Data Frame Summary, it would be very useful to identify which column contains the 'primary key' (as it is called in databases). A column is a candidate primary key when its number of distinct values equals the number of rows in the data frame. Of course, not every table has a primary key, but the absence of one would also be useful to report.
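
A minimal base R sketch of that criterion might look like the following. The helper name `candidate_keys` and the `orders` data are hypothetical and not part of summarytools. Skipping columns that contain missing values is an added assumption, borrowed from how databases treat primary keys.

```r
# Hypothetical helper, not part of summarytools: returns the names of the
# columns whose number of distinct values equals the number of rows.
# Assumption: a column containing NA is not treated as a key candidate.
candidate_keys <- function(df) {
  if (nrow(df) == 0L) return(character(0))
  is_unique <- vapply(
    df,
    function(col) !anyNA(col) && length(unique(col)) == nrow(df),
    logical(1)
  )
  names(df)[is_unique]
}

# Made-up example data
orders <- data.frame(
  order_id = c(101L, 102L, 103L, 104L),
  customer = c("ann", "bob", "ann", "cy"),
  stringsAsFactors = FALSE
)

candidate_keys(orders)
# "order_id"
```

An empty result would correspond to the "no primary key" case mentioned above.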

dcomtois commented 6 years ago

Where / how would you see it displayed?

paulfeitsma commented 6 years ago

I thought about this, and I think the nicest option would be to add a comment after the number of distinct values whenever that number equals the number of rows, indicating that the column might be a primary or unique key. I proposed a solution in a pull request.

dcomtois commented 5 years ago

Thx for the suggestion and the PR. After testing it, I realized that too many numeric quantities (weights, prices, measures of all kinds) would meet the uniqueness criterion. As a result, the comment would be shown too often and would become noise rather than useful information.
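
To illustrate the false-positive problem, here is a small sketch with made-up data. The `patients` data frame and its columns are assumptions for illustration and are not taken from the PR.

```r
# Illustrative data only: both measurement columns happen to be unique per
# row, so a bare "distinct values == rows" check flags them next to the id.
patients <- data.frame(
  patient_id = 1:5,
  weight_kg  = c(61.2, 74.8, 58.9, 80.1, 67.5),
  height_cm  = c(162.4, 180.3, 158.7, 175.9, 169.2),
  sex        = c("F", "M", "F", "M", "F")
)

# Count distinct values per column and keep the columns where the count
# equals the number of rows.
n_distinct <- vapply(patients, function(col) length(unique(col)), integer(1))
flagged <- names(patients)[n_distinct == nrow(patients)]
flagged
# "patient_id" "weight_kg" "height_cm"
```

Only `patient_id` is a plausible key here, yet two continuous measures are flagged as well.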