DataKind-DC / audubon-cbc

For the bird counters
MIT License

Prep for Analysis: Getting to know the CBC and NOAA Data #36

Open rectheworld opened 4 years ago

rectheworld commented 4 years ago

Prep for Analysis: Getting to know the CBC and NOAA Data

Goal:

Perform the tasks required for analysis, and compute tables and counts that help us understand the data better.

Inputs:

cbc_cleaned_usa_merged.csv - Contains volunteer-submitted CBC data and the weather data for the closest NOAA station. Distances are measured in meters.

Outputs:

A Notebook file containing the tasks mentioned

TASKS

Counting:

Basic Stats and Investigations

ijd5004 commented 4 years ago

I started doing some of this stuff already. I will formalize it into a notebook and post it before the datajam this Saturday.

ijd5004 commented 4 years ago

Some of the temperature data has a negative sign in front of it. Do we know why?

rectheworld commented 4 years ago

Does it make sense for those to be negative temperatures, for example -10°F?

ijd5004 commented 4 years ago

Very possible. I will look up the locations of a few places to see if they pass the sanity check (e.g. no negatives in Florida).
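A sanity check like this could be sketched as below. This is only a minimal illustration: the field names `state` and `temp_f`, the helper `flag_implausible_cold`, and the 0 °F threshold are all assumptions, and the rows are made up rather than taken from the merged CBC/NOAA file.

```python
def flag_implausible_cold(rows, warm_states=("FL",), min_plausible_f=0.0):
    """Return rows from warm states whose temperature is below the threshold.

    `rows` is a list of dicts with hypothetical keys "state" and "temp_f".
    """
    return [
        row
        for row in rows
        if row["state"] in warm_states and row["temp_f"] < min_plausible_f
    ]


# Made-up rows for illustration only; not real CBC or NOAA records.
rows = [
    {"state": "FL", "temp_f": -0.6},   # below 0 °F in Florida: worth a second look
    {"state": "FL", "temp_f": 55.0},   # plausible
    {"state": "MN", "temp_f": -10.0},  # cold, but not in a warm state
]

suspicious = flag_implausible_cold(rows)
for row in suspicious:
    print(row)
```

Any rows this returns are candidates for a manual lookup, not proof of bad data; a units mix-up would show up the same way.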

ijd5004 commented 4 years ago

Looking at some of the negative temperatures in Florida, I tried to verify them with a manual pull of data for a Gainesville station from the NOAA website. The temps do not match. Are we confident the NOAA temps are in tenths of a degree F? -0.6 °C is 30.92 °F, so that would match if the NOAA data from BigQuery were in Celsius.

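[image] https://user-images.githubusercontent.com/30261267/74076429-fed60a00-49e5-11ea-8bf7-8f39605ebff7.png

GHCND_USW00012816_1989-12-1.pdf https://github.com/DataKind-DC/audubon-cbc/files/4173573/GHCND_USW00012816_1989-12-1.pdf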

Frankie-Figz commented 4 years ago

NOAA temps are in tenths of a degree Celsius.

A negative Fahrenheit temperature in Florida sounds highly dubious.
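For reference, the unit handling could look roughly like the sketch below. It assumes the stored values are integer tenths of a degree Celsius; the function names are hypothetical, and the raw value `-6` is a stand-in chosen so that it scales to the -0.6 reading discussed in this thread.

```python
def tenths_celsius_to_celsius(raw_value):
    """Scale a raw reading stored in tenths of a degree Celsius to degrees Celsius."""
    return raw_value / 10.0


def celsius_to_fahrenheit(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0


raw = -6  # hypothetical raw value, in tenths of °C
celsius = tenths_celsius_to_celsius(raw)     # -0.6 °C
fahrenheit = celsius_to_fahrenheit(celsius)  # about 30.92 °F

print(f"{raw} (tenths of °C) = {celsius:.1f} °C = {fahrenheit:.2f} °F")
```

Read this way, a small negative value in Florida is a reading just below freezing rather than a below-zero Fahrenheit temperature.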

ijd5004 commented 4 years ago

Dubious indeed. Celsius makes much more sense.

ijd5004 commented 4 years ago

Tasks completed:

(I'm not convinced my outlier analysis is useful yet. More important is probably determining cutoff values for the distance, altitude, and other measurements between the circles and stations.)
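Once cutoffs are chosen, applying them could be as simple as the sketch below. The keys `distance_m` and `elevation_diff_m`, the helper `within_cutoffs`, and the threshold values are all placeholders for illustration, not columns or recommendations from this project; the rows are made up.

```python
def within_cutoffs(rows, max_distance_m, max_elevation_diff_m):
    """Keep circle/station pairs that fall inside both cutoffs.

    Each row is a dict with hypothetical keys "distance_m" (circle-to-station
    distance in meters) and "elevation_diff_m" (elevation difference in meters).
    """
    kept = []
    for row in rows:
        close_enough = row["distance_m"] <= max_distance_m
        similar_elevation = abs(row["elevation_diff_m"]) <= max_elevation_diff_m
        if close_enough and similar_elevation:
            kept.append(row)
    return kept


# Made-up pairs for illustration only.
pairs = [
    {"circle": "A", "distance_m": 4_000, "elevation_diff_m": 20},
    {"circle": "B", "distance_m": 60_000, "elevation_diff_m": 5},
    {"circle": "C", "distance_m": 8_000, "elevation_diff_m": -450},
]

# Placeholder thresholds; the actual values are what still needs to be decided.
usable = within_cutoffs(pairs, max_distance_m=10_000, max_elevation_diff_m=100)
print([row["circle"] for row in usable])
```

Keeping the thresholds as parameters makes it easy to rerun the counts with different cutoffs and see how many circles each choice drops.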

ijd5004 commented 4 years ago

I believe I have answered all the questions in the prompt for this issue. We will definitely need a deep dive on outliers and visualizations, though.