This addresses issue 25: unable to scrape minor league gids.
I fixed this by using tryCatch to check whether the "...inning_all" URL exists for each gid. If it does not, the individual innings are downloaded and parsed instead.
I added a nonMLB argument to scrape(). The default is FALSE; setting it to TRUE enables the fallback described above.
We could do the same thing without the extra argument, but I think running that additional tryCatch for every gid might be overkill and would hurt performance with a large number of gids (an entire season, for example).
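To illustrate the approach described above, here is a minimal sketch of the check-then-fall-back logic. It is not the code in scrape.R: fetch_innings(), its reader and max_innings arguments, and the inning_1.xml, inning_2.xml, ... file naming are assumptions made for illustration. A stub reader stands in for the real download so the example runs offline.

```r
# Hypothetical sketch; fetch_innings() and its arguments are illustrative
# names, not functions exported by pitchRx.
fetch_innings <- function(base_url, reader, nonMLB = FALSE, max_innings = 30L) {
  all_url <- paste0(base_url, "/inning_all.xml")
  # Default path: assume inning_all.xml exists and skip the extra check.
  if (!nonMLB) return(list(reader(all_url)))
  # nonMLB path: try inning_all.xml first; NULL signals that it is missing.
  all_doc <- tryCatch(reader(all_url), error = function(e) NULL)
  if (!is.null(all_doc)) return(list(all_doc))
  # Fall back to single-inning files (assumed naming) until one fails to load.
  docs <- list()
  for (i in seq_len(max_innings)) {
    url <- paste0(base_url, "/inning_", i, ".xml")
    doc <- tryCatch(reader(url), error = function(e) NULL)
    if (is.null(doc)) break
    docs[[length(docs) + 1L]] <- doc
  }
  docs
}

# Stub reader standing in for a real download: only innings 1-7 "exist".
available <- paste0("gid_demo/inning_", 1:7, ".xml")
stub_reader <- function(url) {
  if (!url %in% available) stop("not found: ", url)
  url
}

docs <- fetch_innings("gid_demo", stub_reader, nonMLB = TRUE)
length(docs)  # 7
```

Gating the fallback behind nonMLB keeps the default path to a single request per gid, which is the performance concern raised above.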
The XML obs require a different parsing strategy for single innings, so I split all the object parsing out into its own function, parseObs(). This function replaces lines 227-297 of scrape.R; the moved code now sits at the end of the file.
I have also updated the roxygen lines, manual, namespace, etc.
Tests
devtools::install_github("keberwein/pitchRx", force = TRUE)
library(pitchRx)
# Example from the documentation with run time.
start.time <- Sys.time()
data(nonMLBgids, package = "pitchRx")
aaa <- nonMLBgids[grepl("2014_06_02_[a-z]{3}aaa_[a-z]{3}aaa", nonMLBgids)]
dat <- scrape(game.ids = aaa)
end.time <- Sys.time()
end.time-start.time
# The first two gids have seven innings, the third has an inning_all.xml in the directory.
mixed_bag <- scrape(game.ids = c("gid_2010_06_01_lhvaaa_tolaaa_1",
                                 "gid_2010_06_02_albaaa_nasaaa_1",
                                 "gid_2014_06_02_srcaaa_freaaa_1"), nonMLB = TRUE)
# Traditional scrape is unchanged and works the same as before.
start.time <- Sys.time()
# `con` is assumed to be an existing database connection created beforehand.
scrape(start = "2016-07-23", end = "2016-07-24", connect = con)
end.time <- Sys.time()
end.time-start.time