category_counts table is slightly incorrect

https://github.com/tekasian/intro-data-capstone-biodiversity/blob/297a56430354b6ef4894b9f0bcaa2b78767381c9/Biodiversity_Capstone_Project_Bryan_Leung/biodiversity.py#L165

With this code, we will get a category_counts table which looks like this:

However, this is slightly incorrect. We should use nunique() instead of count(), so that a species appearing in more than one row is only counted once. For example, with this code:

category_counts = species.groupby(['category', 'is_protected'])\
                         .scientific_name.nunique().reset_index()

We will get this table:

The values are only slightly off, but the difference can affect any later calculations based on this table. Just wanted to point that out!
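
To illustrate the difference, here is a minimal sketch on made-up data. The toy `species` DataFrame below is hypothetical and is not the project's real dataset; only the column names are taken from the snippet above. It assumes pandas is installed.

```python
import pandas as pd

# Hypothetical toy data: 'Canis lupus' is listed twice on purpose,
# to mimic a species that appears in more than one row.
species = pd.DataFrame({
    'category': ['Mammal', 'Mammal', 'Mammal', 'Bird'],
    'scientific_name': ['Canis lupus', 'Canis lupus',
                        'Bison bison', 'Accipiter cooperii'],
    'is_protected': [True, True, False, False],
})

grouped = species.groupby(['category', 'is_protected']).scientific_name

# count() tallies every non-null row, so the duplicate is counted twice.
with_count = grouped.count().reset_index()

# nunique() tallies distinct scientific names, so the duplicate is counted once.
with_nunique = grouped.nunique().reset_index()

print(with_count)
print(with_nunique)
```

In this toy example the protected Mammal group comes out as 2 with count() and 1 with nunique(), which is the same kind of discrepancy as in the tables above.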
