https://github.com/tekasian/intro-data-capstone-biodiversity/blob/297a56430354b6ef4894b9f0bcaa2b78767381c9/Biodiversity_Capstone_Project_Bryan_Leung/biodiversity.py#L165
With this code, we will get a `category_counts` table which looks like this:

However, this is slightly incorrect. Instead of using `count()`, we would want to use `nunique()`. This will avoid counting the same species more than once. For example, for this code:

We will get this table:
The values are just slightly off, but it can make a difference in the long run. Just wanted to point that out!
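
To make the difference concrete, here is a minimal sketch with made-up data. The column names `category` and `scientific_name` and the rows themselves are assumptions for illustration, not the actual capstone data or the code at the link above.

```python
import pandas as pd

# Hypothetical toy data: two species are listed twice on purpose.
species = pd.DataFrame({
    "category": ["Mammal", "Mammal", "Mammal", "Bird", "Bird"],
    "scientific_name": [
        "Canis lupus", "Canis lupus", "Bos bison",
        "Aix sponsa", "Aix sponsa",
    ],
})

grouped = species.groupby("category")["scientific_name"]

# count() tallies every non-null row, so repeated species are counted again.
by_count = grouped.count()

# nunique() tallies distinct values, so each species is counted once per category.
by_nunique = grouped.nunique()

print(by_count)
print(by_nunique)
```

In this toy example the two results differ only for the categories that contain a repeated `scientific_name`, which is the same kind of small gap described above.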