nc-minibbs / mbbs

A repository for the Mini-Bird Breeding Survey data
https://minibbs.us

Update stop level xls script #119

Closed bsaul closed 1 month ago

bsaul commented 1 month ago

@IJBG I refactored the stop-level Excel file import a bit, mostly just cleaning up and simplifying. You can test what I've done with:

devtools::load_all()

f <- list_stop_level_files()

process_stop_level_xls(f[1])

get_stop_level_xls_data()

IJBG commented 1 month ago

There's a 60-row difference between the output stored in my stop_level branch, https://github.com/nc-minibbs/mbbs/blob/5fac4b2d5dcd4998f2d23f7dbc1384417bf14045/inst/extdata/stop_level_hist_xls.csv, and the result of running get_stop_level_xls_data(), but every differing row is a Double-Crested Cormorant record with count 0 from three routes in 2002. I'm not sure why they're excluded as part of this process, but it's not a problem. The input data is stable, so there won't be other changes between the files in the future.
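
For reference, a row-level comparison like the one described above can be sketched in base R. This is only an illustration: the data frames and column names (`stored`, `fresh`, `year`, `route`, `common_name`, `count`) are hypothetical stand-ins, not the actual mbbs schema. In practice `stored` would be read from the CSV and `fresh` would come from get_stop_level_xls_data().

```r
# Hypothetical stand-ins for the stored CSV and the freshly generated output.
stored <- data.frame(
  year = c(2002, 2002, 2003),
  route = c("A", "A", "B"),
  common_name = c("Double-Crested Cormorant", "Northern Cardinal", "Northern Cardinal"),
  count = c(0, 4, 2),
  stringsAsFactors = FALSE
)
fresh <- stored[stored$common_name != "Double-Crested Cormorant", ]

# Collapse every row into a single string key so rows can be matched across tables.
row_key <- function(df) do.call(paste, c(unname(as.list(df)), sep = "|"))

# Rows present in the stored output but absent from the fresh output.
missing_from_fresh <- stored[!(row_key(stored) %in% row_key(fresh)), ]

# Tabulate which species account for the difference.
print(table(missing_from_fresh$common_name))
```

Swapping the two arguments in the `%in%` test gives the rows that exist only in the fresh output.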

bsaul commented 1 month ago

> There's a 60-row difference between the output I have stored in my stop_level branch,

I see why. I skip the first five rows (header, habitat, vehicles), but 3 files from 2002 don't have a row for vehicles, so the DCCO row gets skipped.
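
One way to avoid this kind of off-by-one is to drop metadata rows by label instead of by position. The sketch below is an assumption about how that could look, not the code in this PR: the sheet layout, the `label` column, the list of metadata labels, and `drop_metadata_rows()` are all hypothetical.

```r
# Hypothetical raw sheet: the first column holds row labels, the second a per-stop value.
with_vehicles <- data.frame(
  label = c("Route", "Date", "Stop", "Habitat", "Vehicles", "DCCO", "NOCA"),
  s1 = c("1", "2002-05-20", "1", "M", "3", "0", "2"),
  stringsAsFactors = FALSE
)
# Same sheet, but without the vehicles row (as described for three 2002 files).
without_vehicles <- with_vehicles[with_vehicles$label != "Vehicles", ]

# Labels treated as metadata rather than species rows (assumed for illustration).
metadata_labels <- c("route", "date", "stop", "habitat", "vehicles")

# Keep only rows whose label is not a known metadata label.
drop_metadata_rows <- function(sheet) {
  is_metadata <- tolower(trimws(sheet$label)) %in% metadata_labels
  sheet[!is_metadata, , drop = FALSE]
}

# A fixed skip of five rows loses DCCO when the vehicles row is absent.
positional <- without_vehicles[-(1:5), , drop = FALSE]

# Label-based filtering keeps DCCO in both layouts.
by_label_a <- drop_metadata_rows(with_vehicles)
by_label_b <- drop_metadata_rows(without_vehicles)
print(by_label_b$label)
```

The trade-off is that label matching depends on the labels being spelled consistently across files, which would need to be checked against the real inputs.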

On a related topic, are we extracting the habitat and vehicles from these files anywhere?

bsaul commented 1 month ago

> There's a 60-row difference between the output I have stored

OK, try again. Is there still a difference? I get 154,940 rows.

IJBG commented 1 month ago

> On a related topic, are we extracting the habitat and vehicles from these files anywhere?

Not yet!

IJBG commented 1 month ago

> Ok. Try again. Is there still a difference? I get 154,940 rows.

That fixed it. The 1,420 extra rows are all for "Accipiter species".

bsaul commented 1 month ago

Cool. I think I'm done with this PR, pending your approval.