Closed ateucher closed 9 years ago
Argh, except that only returns csv or xml
What's the difference between GET()
and ebird_GET()
?
ebird_GET
wraps GET
up with all of the common operations for getting stuff from the ebird API - makes it so there isn't so much code repeated in all of the functions that get data from the API.
So given that ebird_GET
is meant for dealing with eBird's API, then there's no need to use it in get_freq
?
I thought we could use to query the locations API endpoint to check locations
On Thu, Apr 23, 2015 at 1:45 PM, Sebastian Pardo notifications@github.com wrote:
So given that ebird_GET is meant for dealing with eBird's API, then there's no need to use it in get_freq?
— Reply to this email directly or view it on GitHub https://github.com/ateucher/rebird/issues/2#issuecomment-95714178.
We should write a new function that queries the locations endpoint of the eBird API, that can be called by get_freq to check the loc
parameter
I made a function that checks the API for matching locations, and it seems to be working solidly: https://github.com/ateucher/rebird/blob/frequency-data/R/eloc_check.R
I didn't use ebird_GET
as it relies on json, however LocationReference does not permit that format for some reason. Using GET
instead.
Still need to figure out a way to check hotspots, but I'm not sure if it's possible. I just checked what happens when you enter an invalid hotspot in rebird's ebirdhotspot
and it just returns an empty data frame, which is what get_freq
does as well.
That looks awesome - nice work! I made a few tweaks, hope they're ok with you
Regarding checking hotspots, what if we make the function throw a warning telling the user to double-check the hotspot code if nrow(freq) == 0 && loctype == 'hotspots'
?
I'm all for tweaking as I get to learn how to write better code by seeing your commits.
:+1:
Done warning if no observations returned when querying hotspot (7463d45ea5597b447b1434539ca4a9f8065a0a71). nrow(freq) == 1
instead of 0 as there is it still returns a row for Sample Size, but full of NAs.
In that case instead of nrow(freq) == 1
, what about all(is.na(freq))
. There could be a case where there is one row of valid data returned
I can change it to that. If there is one row of valid data then nrow = 2
. However, I can think of a case where this would fail: if a hotspot exists but there are no checklists in it, which could be tested by specifying a historical year span. I think this would return a data frame with one row full of zeros. Going to play with it now and change it to all(is.na(freq))
.
So even if querying a hotspot for a period with no checklists, you still get a row almost full of NAs (freq[1,1] = "Sample Size:"
)
For example, a hotspot for a period with a single observation you get two rows:
h1 <- get_freq("hotspots", "L196159", 1969, 1969, long = FALSE)
head(h1)[, 30:38]
> Aug-1 Aug-2 Aug-3 Aug-4 Sep-1 Sep-2 Sep-3 Sep-4 Oct-1
>1 0 0 0 0 1 0 0 0 0
>2 0 0 0 0 1 0 0 0 0
If you query the next year, nrow(freq) = 1
but it is still just NAs, even when the hotspot does exist:
h2 <- get_freq("hotspots", "L196159", 1970, 1970, long = FALSE)
head(h2)[, 30:38]
> Aug-1 Aug-2 Aug-3 Aug-4 Sep-1 Sep-2 Sep-3 Sep-4 Oct-1
>1 NA NA NA NA NA NA NA NA NA
We could use all(is.na(freq[,-1]))
, however I can't imagine there being a case where we wouldn't get the same answer this way or using nrow(freq)
. I do agree that the correct check is for NAs
instead of counting the number of rows, so I'm going to change it now.
Done (e68700f5634289b1116d0ec2bdef709d8d2d9764).
API reference: https://confluence.cornell.edu/display/CLOISAPI/eBird-1.1-LocationReference. Make sure we use
ebird_GET()
(https://github.com/ateucher/rebird/blob/master/R/zzz.R)