Closed paulfeitsma closed 6 years ago
It works well, you coded it right. (Congrats!) I'm just thinking through the different options... Should this appear only when there are duplicate cases, for instance? And is it more suitable to report duplicates, as opposed to distinct rows? Should there be an additional parameter to turn this on/off, and so on... Let me know your thoughts!
I believe it is useful information both when there are only unique rows (no duplicates) and when there are duplicates, so it should be shown in both cases. I also thought about showing the number of unique rows, but I believe the number of duplicates is easier to interpret (0 = good in most cases). I don't believe we should introduce a parameter for this; maybe in the future, if someone asks for it.
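To make the "0 = good" reading concrete, here is a minimal base R sketch. `count_duplicates()` and the `toy` data frame are hypothetical names used only for illustration; this is not the code that was merged.

```r
# Hypothetical helper for illustration; not part of any package.
# duplicated() flags every row that repeats an earlier row,
# so the sum is the number of surplus (duplicate) rows.
count_duplicates <- function(df) {
  sum(duplicated(df))
}

# Made-up data: the third row repeats the second.
toy <- data.frame(
  id  = c(1, 2, 2, 3),
  grp = c("a", "b", "b", "c")
)

count_duplicates(toy)          # one duplicate row
count_duplicates(unique(toy))  # zero once duplicates are dropped
```

A single count like this reads the same way for every dataset, whereas the number of unique rows has to be compared against the total row count first.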
You make a good case! Merging into the dev-current version. Thanks again
The current version of the Data Frame Summary shows the number of rows. In many cases it is very useful to also know how many unique rows there are. For example, the iris dataset contains 150 rows, but one of them is a duplicate (e.g. nrow(unique(iris)) gives 149). It would be very helpful to add this to the top of the report.
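The iris numbers above can be checked with base R alone. This is only a sketch of the quantities involved; the variable names are placeholders, and how the report would actually display them is not specified here.

```r
# iris ships with base R (datasets package), so nothing needs installing.
n_rows   <- nrow(iris)              # total rows
n_unique <- nrow(unique(iris))      # distinct rows
n_dups   <- sum(duplicated(iris))   # rows repeating an earlier row

# The three quantities are tied together: total = distinct + duplicates.
stopifnot(n_rows == n_unique + n_dups)

cat("Rows:", n_rows, "| Distinct:", n_unique, "| Duplicates:", n_dups, "\n")
```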