tacman closed 8 months ago
In HEAD, I instead added a "Non-null values" statistic (it also appears in the --csv output). This information is useful for this use case as well as others.
You can thus compare non-null values to unique values. Note that if the column contains nulls, then NULL counts as one additional unique value.
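For illustration, the comparison could be sketched like this in Python. The function name and its parameters are hypothetical, not part of csvkit; the three inputs are assumed to have been read from the "Non-null values", "Unique values" and "Contains null values" statistics.

```python
def values_are_unique(non_null: int, unique: int, has_nulls: bool) -> bool:
    """Return True if every non-null value in the column is distinct.

    Hypothetical helper: the arguments are assumed to come from the
    "Non-null values", "Unique values" and "Contains null values" statistics.
    """
    # When the column contains nulls, NULL is counted as one extra unique
    # value, so subtract it before comparing.
    distinct_non_null = unique - 1 if has_nulls else unique
    return distinct_non_null == non_null
```

A column is then a plausible primary key only if this returns True and the column contains no nulls.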
Thanks!
I've been installing this via "sudo apt install csvkit", but I think I need a PPA in order to get the latest version. Is one available?
I've had trouble following the installation instructions on Ubuntu via pip.
I only manage the PyPI package. Packages in Linux distributions are created independently.
I'll make a new release of the PyPI package shortly.
It would be valuable to me to know whether all the values are unique. There is a number of unique values, and at the end of the csvstat command there is
That's lost with --csv, along with the frequency, so there's no easy way to know if the values are unique. For my purposes, I'm trying to find the primary key from a set of files, so knowing that the values are unique would be enormously helpful.
If the frequency count were included in the "freq" key, I could parse that and see if the top one was just 1, but adding "Values are unique" would be better. Of course, to determine primary key I'd also check "Contains null values".
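As a rough sketch of the primary-key check described here, done directly on the data rather than by parsing csvstat output, something like the following might work. It uses only Python's standard library; `candidate_keys` is a hypothetical name, and treating an empty string as null is an assumption that may not match how csvkit detects nulls.

```python
import csv
import io


def candidate_keys(csv_text: str) -> list[str]:
    """Return the columns whose values are all non-empty and all distinct."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []
    seen = {name: set() for name in fieldnames}
    disqualified = set()
    for row in reader:
        for name in fieldnames:
            value = row[name]
            # Assumption: a missing or empty cell counts as null.
            if value is None or value == "" or value in seen[name]:
                disqualified.add(name)
            else:
                seen[name].add(value)
    # Preserve the original column order.
    return [name for name in fieldnames if name not in disqualified]
```

For a real file, the text could be read with `open(path, newline="").read()` and passed in; for very large files, streaming the file object into `csv.DictReader` instead would avoid holding the whole text in memory.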