Josh-Lee1 / eBird-Fire-Index

BEES3041 Big Data Project

Matching with species data #7

Closed wcornwell closed 4 years ago

wcornwell commented 4 years ago

We should get out everything before and after from within the fire outline. There should be ~500,000 checklists, which should correspond to 5,000,000 - 10,000,000 rows in the resulting dataframe, so big but not tooooo big.

Josh-Lee1 commented 4 years ago

Didn't we count 136962+5788 checklists in the fire (before and after Dec 15)? Where did you get the 500000 number from?

On Tue., 7 Jul. 2020, 10:39 Will Cornwell, notifications@github.com wrote:

Assigned #7 https://github.com/Josh-Lee1/eBird-Fire-Index/issues/7 to @Josh-Lee1 https://github.com/Josh-Lee1.


wcornwell commented 4 years ago

My bad--you're right. Should even be a bit smaller then
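
For reference, the revised size estimate can be sketched as below. The checklist counts are the ones quoted above; the 10 to 20 rows per checklist is an assumption, backed out of the original estimate of 5,000,000 - 10,000,000 rows from ~500,000 checklists.

```r
# Checklist counts quoted in the thread (before and after Dec 15)
n_checklists <- 136962 + 5788

# Assumption: 10-20 species rows per checklist, implied by the original
# estimate of 5-10 million rows from ~500,000 checklists
rows_per_checklist <- c(low = 10, high = 20)

# Rough lower and upper bounds on rows in the resulting dataframe
expected_rows <- n_checklists * rows_per_checklist
expected_rows
```

So the resulting dataframe should be in the low millions of rows if the per-checklist assumption holds.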

Josh-Lee1 commented 4 years ago

Hey @wcornwell and @coreytcallaghan, OK, I have had a go and pushed up some work (see matching_to_sp.R). I did a semi_join, which looks like it got all of the data from the checklists of interest for the first file (baa). I then started to play with writing the function. I wasn't sure if we want to filter out some of the columns this time as well. It would be great if you could have a look so I can see if it runs! Thanks, Josh
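
A minimal sketch of the semi_join step described above, on toy data. This is not the code in matching_to_sp.R: the object names `checklists_in_fire` and `species_obs` and the key column `SAMPLING.EVENT.IDENTIFIER` are assumptions for illustration.

```r
library(dplyr)

# Hypothetical stand-ins: `checklists_in_fire` holds the checklist IDs inside
# the fire outline; `species_obs` plays the role of one raw species file
checklists_in_fire <- tibble(
  SAMPLING.EVENT.IDENTIFIER = c("S1", "S3")
)
species_obs <- tibble(
  SAMPLING.EVENT.IDENTIFIER = c("S1", "S1", "S2", "S3"),
  COMMON.NAME = c("Superb Lyrebird", "Eastern Yellow Robin",
                  "Laughing Kookaburra", "Superb Lyrebird")
)

# semi_join() keeps the rows of species_obs that have a match in
# checklists_in_fire, without adding columns or duplicating rows
matched <- semi_join(species_obs, checklists_in_fire,
                     by = "SAMPLING.EVENT.IDENTIFIER")
matched
```

Because semi_join only filters, the result has the same columns as the species file, so dropping unwanted columns would be a separate `select()` step.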

coreytcallaghan commented 4 years ago

It looks pretty good to me! Could always test it with the 'test data files' we put in before, but otherwise, should be able to let it ride!

We can chat Thursday to see how it went?

Josh-Lee1 commented 4 years ago

Oh, I'm sure I replied to this, oops... OK, I'm pretty sure I've gotten the loop to work. It's running now, but it worked for the test folder. I am available before 10:30 and after 2 today if you want to discuss the next steps. Thanks

coreytcallaghan commented 4 years ago

After 2 is good.


Josh-Lee1 commented 4 years ago

@wcornwell are you available to chat this arvo?

wcornwell commented 4 years ago

sure. how about 4?

coreytcallaghan commented 4 years ago

Fine by me.


Josh-Lee1 commented 4 years ago

Great. See you then.

wcornwell commented 4 years ago

how's this going?

Josh-Lee1 commented 4 years ago

Something funny has happened to our dates. I am going to try and re run code to see at which point the data changes format.

wcornwell commented 4 years ago

try:

```r
library(tidyverse)
library(lubridate)

# Read one .rds file and parse its date column into a Date
read_dat_function <- function(file_name) {
  dat <- readRDS(file_name)
  dat$OBSERVATION.DATE <- ymd(dat$OBSERVATION.DATE)
  return(dat)
}

# full.names = TRUE returns paths that include the folder, so no setwd() is needed
files <- list.files("rawspeciesdata/", full.names = TRUE)
data <- lapply(files, read_dat_function)
```

wcornwell commented 4 years ago

but i'm still not getting enough checklists...

[Image: Screen Shot 2020-07-13 at 2 21 39 pm]

Josh-Lee1 commented 4 years ago

Yeah, I went back when I first had this problem and I think something has gone amiss early on, because there are checklists from the 1980s but I'm pretty sure I requested data from 2015 from eBird? Also, I didn't filter out unwanted checklists in the second loop, but that should have given us more than expected, not less...
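
One way to check both symptoms (unexpectedly old dates and too few checklists) is to summarise the date range and count distinct checklists. The sketch below runs on toy data; the `obs` object and the `SAMPLING.EVENT.IDENTIFIER` column name are assumptions, and the 2015 threshold is taken from the comment above.

```r
library(dplyr)
library(lubridate)

# Hypothetical toy data standing in for the combined species data
obs <- tibble(
  SAMPLING.EVENT.IDENTIFIER = c("S1", "S1", "S2", "S3", "S4"),
  OBSERVATION.DATE = c("1985-03-02", "1985-03-02", "2016-07-09",
                       "2019-11-30", "2020-01-20")
)

checks <- obs %>%
  mutate(OBSERVATION.DATE = ymd(OBSERVATION.DATE)) %>%
  summarise(
    earliest = min(OBSERVATION.DATE),
    latest = max(OBSERVATION.DATE),
    # Each checklist spans several rows, so count distinct IDs, not rows
    n_checklists = n_distinct(SAMPLING.EVENT.IDENTIFIER),
    # Checklists dated before the year the data request was meant to start
    n_before_2015 = n_distinct(
      SAMPLING.EVENT.IDENTIFIER[year(OBSERVATION.DATE) < 2015]
    )
  )
checks
```

Running the same summary after each step of the pipeline would show where the dates change or where checklists get dropped.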

wcornwell commented 4 years ago

maybe 12060 is right, but I thought it would be ~140,000 from calculations on #6

wcornwell commented 4 years ago

other than less data, it looks pretty good....

[Image: Screen Shot 2020-07-13 at 2 46 43 pm]

wcornwell commented 4 years ago

just pushed some changes. looking good. i'm still a bit worried we're dropping data somewhere, but even with what we have there are some cool results already

Josh-Lee1 commented 4 years ago

Great, thanks heaps Will. I am just trying to figure out what things mean and how to make it run.

Josh-Lee1 commented 4 years ago

On struggle street a bit with this step. I've been playing with the code and understanding parts, but I haven't been able to make other parts run for me. Will the results of the final step of the code you wrote make up our "before_after" variable for the GAM? Sorry to be slow! Maybe it's best to get my questions answered in our next meet.

coreytcallaghan commented 4 years ago

Happy to chat tomorrow morning (before 11), or at 1 tomorrow.


wcornwell commented 4 years ago

Will the results of the final step of the code you wrote make up our "before_after" variable for the GAM?

yeah, but not perfectly set up for that yet.

9:30 tomorrow?
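
As a rough illustration of what a "before_after" variable for the GAM could look like (not the code that was pushed): label each row by whether its date falls before a cutoff. The function name `add_before_after` is hypothetical, and the cutoff year in the example is an assumption, since the thread only says "dec15".

```r
# Label each observation relative to a cutoff date.
# `cutoff` is a hypothetical parameter; the thread only gives "dec15".
add_before_after <- function(dat, cutoff) {
  dat$before_after <- factor(
    ifelse(dat$OBSERVATION.DATE < cutoff, "before", "after"),
    levels = c("before", "after")
  )
  dat
}

# Toy data with dates on either side of the cutoff
toy <- data.frame(
  OBSERVATION.DATE = as.Date(c("2019-10-01", "2019-12-14",
                               "2019-12-15", "2020-02-01"))
)

# Assumed year for illustration only
labelled <- add_before_after(toy, cutoff = as.Date("2019-12-15"))
labelled
```

Storing the label as a factor with "before" as the first level makes "before" the reference level when the variable is used as a model term.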

Josh-Lee1 commented 4 years ago

Yeah 9.30 works for me. Thanks heaps.


coreytcallaghan commented 4 years ago

Ditto
