Open kferris10 opened 9 years ago
I'm having what appears to be the exact same issue and have read through both #27 above and the referenced issue #22. I tried gc() as you suggested in #22, but it doesn't work on my machine, just as it doesn't work above. What is the solution? I can restart R, but I usually have to restart my machine for everything to run in a reasonable amount of time.
Related: I was trying to scrape data for all games from 03/01/2010 through the present by grabbing only one month at a time. R crashed midway through the games on 5/16/2012, so I restarted, loaded my packages, defined my connection, and then ran:
update_db(mysqlconnection, end="2012-05-20")
This starts getting the games from 5/17/2012 through 5/20, which obviously misses the remaining 5/16 games I didn't get to. How can I get the rest of the 5/16 games now without duplicating what I already have for that day?
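One possible way to approach this, sketched in base R under stated assumptions: compare the game ids scheduled for the interrupted day against the ids already stored, and scrape only the difference. The ids below are made-up placeholders, and the helper name missing_games is hypothetical. The commented lines assume that pitchRx ships a gids vector of game ids, that the stored tables carry a gameday_link column, and that scrape() accepts a game.ids argument; check each of these against your installed version before relying on them.

```r
# Hypothetical helper: ids that are scheduled but not yet stored.
missing_games <- function(scheduled, stored) {
  setdiff(scheduled, stored)
}

# Placeholder ids for illustration only. In practice `scheduled` might come from
#   data(gids, package = "pitchRx"); grep("2012_05_16", gids, value = TRUE)
# and `stored` from a query such as
#   DBI::dbGetQuery(con, "SELECT DISTINCT gameday_link FROM atbat")
# (both are assumptions about pitchRx and the database schema).
scheduled <- c("gid_2012_05_16_aaamlb_bbbmlb_1",
               "gid_2012_05_16_cccmlb_dddmlb_1",
               "gid_2012_05_16_eeemlb_fffmlb_1")
stored <- c("gid_2012_05_16_aaamlb_bbbmlb_1")

todo <- missing_games(scheduled, stored)
print(todo)

# Assumed follow-up call, not run here:
#   scrape(game.ids = todo, connect = mysqlconnection)
```

A game that was only partly written when R crashed would still show up as stored, so it may be worth checking the last stored game by hand.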
@colemanconley I've never had an issue with duplicating games when using update_db. My strategy is to first scrape one year of data. Then I can just run update_db one year at a time. Is that not working for you?
I have these memory issues on Windows but not on Mac. The only way I've found to free up the memory is to restart the R session. What I do is make a new SNOW cluster with one node to run the scrape method each time, which is the same as having a new R session each time.
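Some code I use:
library(snow)  # makeCluster(), clusterEvalQ(), clusterExport(), clusterCall()

# Yearly boundaries; iteration i scrapes from ll[i] to ll[i + 1].
ll <- seq(as.Date(start_date), as.Date(end_date), "1 year")
ntasks <- length(ll) - 1
for (i in seq_len(ntasks)) {
  # A fresh single-node cluster behaves like a brand-new R session.
  cl <- makeCluster(1, type = "SOCK", outfile = "")
  clusterEvalQ(cl, library(pitchRx))
  clusterEvalQ(cl, library(DBI))
  clusterEvalQ(cl, library(RSQLite))
  clusterEvalQ(cl, library(dplyr))
  clusterExport(cl, list = c("ll", "files", "dbpath"), envir = environment())
  clusterCall(cl, function(i) {
    db <- src_sqlite(dbpath, create = TRUE)
    scrape(start = ll[i], end = ll[i + 1], suffix = files, connect = db$con)
  }, i)
  # Stopping the worker releases its memory before the next chunk.
  stopCluster(cl)
}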
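A note on the date boundaries used above, with a small base-R sketch. seq() with a "1 year" step stops at the last whole-year boundary, so any days between that boundary and end_date fall outside every chunk, and consecutive chunks share a boundary date. If scrape() treats both start and end as inclusive (an assumption worth checking), that shared date would be requested twice. The helper name chunk_dates is hypothetical, and the dates are taken from the example earlier in the thread.

```r
# Split [start_date, end_date] into consecutive, non-overlapping chunks.
chunk_dates <- function(start_date, end_date, by = "1 year") {
  start_date <- as.Date(start_date)
  end_date <- as.Date(end_date)
  stopifnot(start_date <= end_date)
  starts <- seq(start_date, end_date, by = by)
  # Each chunk ends the day before the next one starts;
  # the final chunk runs through end_date so the tail is not dropped.
  ends <- c(starts[-1] - 1, end_date)
  data.frame(start = starts, end = ends)
}

chunks <- chunk_dates("2010-03-01", "2012-05-20")
print(chunks)
```

Each row could then supply the start and end for one worker, for example chunks$start[i] and chunks$end[i] in place of ll[i] and ll[i + 1].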
I am running into some errors trying to scrape large amounts of PITCHf/x data on my Windows 7 computer. Here are some screenshots to illustrate
I run this code to scrape several months of PITCHf/x data
appears to have no effect
P.S. Sorry if those numbers are impossible to see. Let me know if it would help to improve the quality of any of the screenshots.