CornellLabofOrnithology / ebird-best-practices

Best Practices for Using eBird Data
https://CornellLabOfOrnithology.github.io/ebird-best-practices/

Unzipping ebird data runs into an error #14

Closed igloopigloo closed 2 years ago

igloopigloo commented 2 years ago

Hi,

I am trying to do some occupancy modeling using eBird data, following the instructions in the best practices guide. I downloaded both the eBird data set and the sampling data set. Unzipping the .tar file works, but extracting the eBird data set from the resulting .txt.gz file fails with a data error.

Can anyone explain how to correct this situation? Will really appreciate it.

Thanks, Sprih

mstrimas commented 2 years ago

Can you provide a little extra information? What system are you on (e.g. Windows or Mac), which version of the EBD are you trying to unzip, and is it the full EBD or a subset downloaded via the custom download form?

igloopigloo commented 2 years ago

Hi,

This problem is happening with the full EBD. I downloaded the latest version (Mar2022) and I am using a Windows system.

I found a way around this by using a custom download and then following the instructions from that point on (eBird Best Practices: 2.6 EBD file size issues).

But now I am stuck at a different step. Both read_ebd() and auk_zerofill() are giving me the following error: Error in readr::read_delim(x, delim = sep, quote = "", na = "", col_types = col_types, : unused arguments (col_select = which(header != ""), name_repair = "minimal")

I am following the whole workshop material line by line. I updated my packages and tried the whole thing on another Windows system, but I am running into the same issue.

Can you please help me out? Will be really grateful.

Thanks.

mstrimas commented 2 years ago

The full EBD is absolutely massive, and errors during download can sometimes result in the file being corrupted, which is likely what's happened here. Unfortunately there's not much that can be done about this apart from what you did, i.e. downloading a smaller subset. I'm going to close this issue and address your other issue on the auk repo.
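
For anyone who wants to confirm that a downloaded .txt.gz really is corrupted before downloading it again, one option is to stream through the whole file and see whether decompression finishes cleanly. The following is a minimal sketch in Python using only the standard library. It is not part of auk or the best practices guide, and the function name `gzip_is_intact` is made up for illustration.

```python
import gzip
import zlib


def gzip_is_intact(path, chunk_size=1024 * 1024):
    """Return True if the whole gzip stream at `path` decompresses cleanly.

    Hypothetical helper for illustration only; not part of auk.
    """
    try:
        with gzip.open(path, "rb") as fh:
            # Read and discard in chunks so a very large file is never
            # held in memory all at once.
            while fh.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # OSError covers a bad gzip header or a CRC mismatch, EOFError a
        # truncated download, zlib.error a damaged compressed stream.
        return False
    return True
```

Calling `gzip_is_intact("path/to/downloaded_file.txt.gz")` (the path is a placeholder) returns False for a truncated or damaged file. Because the check has to read the entire archive, it may take a long time on the full EBD.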