DataKind-DC / audubon-cbc

For the bird counters
MIT License

Background

"For the Bird Counters"

The Christmas Bird Count is a tradition in which bird enthusiasts gather each year during December and January for a one-day bird counting extravaganza. Volunteers cover a 15-mile-diameter circle and attempt to count every bird they see. The count has grown steadily in locations and volunteers over its 100+ year history.

The Audubon Society, which manages the Christmas Bird Count, has approached DataKind with two questions.

The primary question addressed by this repository is whether volunteer-submitted weather data is accurate and whether Audubon should continue to expect volunteers to submit this data.

The secondary questions we are trying to answer are: "What are the geographic, socioeconomic, or climatic correlates of different types of CBC participation and effort? Is it possible to model and predict participation and effort?" (Keep in mind that in some years a count may be cancelled due to bad weather, in which case there is no record for that count in that year. Field counters, parties, hours, and distances make up the largest share of effort and are the most interesting to us.)

As of the time of writing (September 2020), work has been done only on the primary question.

Table of Contents

Team Members

Deliverables

1. Should the Audubon Society continue to collect weather data during the Christmas Bird Count?

Volunteers have to date:

2. (On Hold) What relationship, if any, exists between geographic, socioeconomic, or climatic features and participation and effort in the Christmas Bird Count (CBC)?

Volunteers for this research question will produce descriptive analyses and predictive models to examine whether a relationship can be determined.

Given the data types involved in the participation research question, it would be helpful for volunteers to design table shells for the metrics that would help answer it, along with the model designs they would like to see implemented. These shells and designs should include notes on any data aggregations or transformations that the technical side would need to perform.

Data

The data files are saved in Google Drive here: https://drive.google.com/drive/folders/1Nlj9Nq-_dPFTDbrSDf94XMritWYG6E2I

The raw Christmas Bird Count data from the Audubon Society is the file cbc_effort_weather_1900-2018.txt. Subsequent data files are named after the notebook that produced them.
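A quick way to inspect one of these files before loading it in full is to read only the header and the first few rows. The sketch below is a minimal example using only the Python standard library. It assumes the file is delimited text with a header row; the delimiter is not documented here, so the code tries to detect it and falls back to tab. The function name and the fallback are illustrative choices, not part of the project.

```python
import csv


def peek_delimited_file(path, n_rows=5, fallback_delimiter="\t"):
    """Return (column_names, first n_rows as dicts) without reading the whole file.

    Assumption: the file is delimited text with a header row. The delimiter
    is guessed from the first 64 KB; if guessing fails, fallback_delimiter is used.
    """
    with open(path, newline="", encoding="utf-8", errors="replace") as fh:
        sample = fh.read(64 * 1024)
        fh.seek(0)
        try:
            # Restrict the guess to a few common delimiters.
            dialect = csv.Sniffer().sniff(sample, delimiters=",\t|;")
            reader = csv.DictReader(fh, dialect=dialect)
        except csv.Error:
            reader = csv.DictReader(fh, delimiter=fallback_delimiter)
        # zip stops once range() is exhausted, so only n_rows rows are consumed.
        rows = [row for _, row in zip(range(n_rows), reader)]
        return reader.fieldnames, rows
```

For example, `columns, rows = peek_delimited_file("cbc_effort_weather_1900-2018.txt")` lists the column names so they can be checked against the data dictionary before any analysis.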

Onboarding

Quick Start

Cleanest Volunteer Submitted Data

The file 1.0-rec-initial-data-cleaning.txt is the cleanest data to date. It limits the scope to circles inside the United States and was produced by the 1.0-rec-initial-data-cleaning.ipynb notebook. Always check whether the files in Cloud Data have been updated since your last working session.

Official Weather Data: Google BigQuery and Weather Data

To obtain the daily measures of weather data, we will be using the GHCN Daily database powered by Google, available here: https://console.cloud.google.com/marketplace/details/noaa-public/ghcn-d?filter=solution-type:dataset&id=9d500d1d-fda4-4413-a789-d8786fd6592e&pli=1

Take a look around. Of note: each year of the dataset is stored in its own table, so 2019 data is in ghcn_2019.
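Because each year lives in its own table, any query has to pick the table from the date of interest. The sketch below shows one way to build such a query in Python. Several things in it are assumptions rather than project code: the fully qualified dataset path `bigquery-public-data.ghcn_d`, the table prefix `ghcnd_` (this is my understanding of how the public dataset names its yearly tables and may differ from the shorthand used above, so confirm the exact name in the console and pass `prefix=` if needed), the column names `id`, `date`, `element`, and `value`, and the station identifier in the usage note. The helper names are hypothetical.

```python
from datetime import date

# Assumed location of the public GHCN Daily dataset in BigQuery.
GHCN_DATASET = "bigquery-public-data.ghcn_d"


def ghcn_table(year, prefix="ghcnd_"):
    """Return the fully qualified name of the yearly table (prefix is an assumption)."""
    return f"{GHCN_DATASET}.{prefix}{year}"


def daily_weather_sql(station_id, day, elements=("TMAX", "TMIN", "PRCP")):
    """Build a query for one station on one day from that year's table.

    Values are interpolated directly into the SQL text, which is fine for
    trusted, hand-written inputs; use query parameters for anything else.
    """
    element_list = ", ".join(f"'{e}'" for e in elements)
    return (
        "SELECT id, date, element, value\n"
        f"FROM `{ghcn_table(day.year)}`\n"
        f"WHERE id = '{station_id}'\n"
        f"  AND date = DATE '{day.isoformat()}'\n"
        f"  AND element IN ({element_list})"
    )


def run_query(sql):
    """Run the query; requires google-cloud-bigquery, pandas, and credentials."""
    from google.cloud import bigquery  # imported here so the helpers work without it

    client = bigquery.Client()
    return client.query(sql).to_dataframe()
```

A call such as `daily_weather_sql("USW00013743", date(2019, 12, 14))` (the station id is only an illustration; substitute one near the circle of interest) returns SQL that can be pasted into the BigQuery console or passed to `run_query`. Since the count season runs from December into January, one season can touch two yearly tables.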

Final Dataset Used for Analysis

The final dataset used for analysis is the result of taking the Cleanest Volunteer Submitted Data and passing it through all the notebooks up to 1.3-rec-connecting-fips-ecosystem-data.ipynb.

This file is then used in the 2.X analysis notebooks.

Data dictionary: https://docs.google.com/spreadsheets/d/1p0JmDn0sIwxFJ81fDVeoykOOIixqhq0_nfqnuBcwceM/edit?usp=sharing

Dataset: 1.3-rec-connecting-fips-ecosystem-data.txt https://drive.google.com/file/d/14C6CWePp-X-6b0Y2kFh92AGxd_jxr16y/view?usp=sharing

Resources

Read these brief pages for background:

https://www.audubon.org/conservation/join-christmas-bird-count

https://www.audubon.org/christmas-bird-count-compiler-resources

http://www.audubon.org/sites/default/files/documents/updated_compilers_manual_jan_2013.pdf

Some column descriptions here (not comprehensive): http://www.audubon.org/sites/default/files/documents/cbc_report_field_definitions_2013.pdf

Configuration

This is a primarily Python/R setup. Please follow the standard practices of setting up a virtualenv if using Python.

Datasets

Note: Attic is now a repository for old data

Quick Start and Contributing

Quick Start

Anyone is able to contribute! Please follow the steps below for a (hopefully) pain-free experience submitting a pull request (PR).

Before doing any work ...

Use the example notebook here to set up your notebooks if you are creating a new notebook.

This notebook also provides an example of 1) how to structure your notebooks, 2) how to name your notebooks so they flow sequentially down the workflow, and 3) how to name the output file. Output files are named after the notebook that produced them.
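The naming convention can be sketched in a few lines of Python. The pattern below is inferred from the notebook names that appear in this README (for example 1.0-rec-initial-data-cleaning.ipynb and its output 1.0-rec-initial-data-cleaning.txt): a major.minor step number, a short tag assumed to be the author's initials, and a hyphenated description. Treat the regular expression and the function names as illustrative assumptions; the example notebook remains the reference.

```python
import re
from pathlib import Path

# Assumed pattern: <major>.<minor>-<tag>-<description>.ipynb
NOTEBOOK_NAME = re.compile(
    r"^(?P<major>\d+)\.(?P<minor>\d+)-(?P<tag>[a-z]+)-(?P<description>[a-z0-9-]+)\.ipynb$"
)


def output_file_for(notebook):
    """Name the output file after the notebook: same stem, .txt extension."""
    return Path(notebook).with_suffix(".txt").name


def workflow_order(notebook):
    """Return (major, minor) as integers so notebooks sort in workflow order."""
    match = NOTEBOOK_NAME.match(Path(notebook).name)
    if match is None:
        raise ValueError(f"{notebook!r} does not follow the assumed naming pattern")
    return int(match["major"]), int(match["minor"])
```

Sorting with `sorted(names, key=workflow_order)` compares the step numbers numerically, so a hypothetical step 1.10 would sort after 1.2 rather than before it, as plain string sorting would do.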

If you are working on an issue, be sure to include the issue number in your commit messages. Example: "This is a commit message for issue #30". Using the # will automatically tie your updates to the issue.

Contributing