Closed daaronr closed 6 years ago
The API calls can take several minutes each; it takes me about 20-30 minutes to download all the data. Or is it not completing at all? Either way, it might be worth adding some print statements so we know it's working.
You are right, it takes a long time. I'll keep it running.
Just pushed a commit that makes it print out what it's doing. Re-pull if you think it might be getting stuck.
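For anyone reading later, here is a minimal sketch of the kind of progress printing described above, in base R. It is not the code from the commit: `fetch_with_progress` and `fake_fetch` are hypothetical names used only for illustration, and the real fetch function (something like `get_fundraiser_donations`) is assumed to take one page name and return a small table.

```r
# Hypothetical helper: calls `fetch_fun` on each page name and reports progress.
fetch_with_progress <- function(page_names, fetch_fun) {
  n <- length(page_names)
  results <- vector("list", n)
  for (i in seq_along(page_names)) {
    # message() writes to stderr, keeping progress separate from printed results
    message(sprintf("[%s] Fetching page %d of %d: %s",
                    format(Sys.time(), "%H:%M:%S"), i, n, page_names[i]))
    results[[i]] <- fetch_fun(page_names[i])
  }
  results
}

# Stand-in for a real API call (assumption: one small data frame per page)
fake_fetch <- function(name) {
  data.frame(page = name, amount = 10, stringsAsFactors = FALSE)
}

out <- fetch_with_progress(c("page-a", "page-b"), fake_fetch)
```

If the last message printed stays the same for several minutes, that points at the page where the run is stuck.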
Thanks. Will give it another go. It seems to have gotten stuck here, maybe because of my internet connection flaking out:
> > donation_data <-
> + map(fundraising_page_data$pageShortName, get_fundraiser_donations) %>%
> + reduce(bind_rows) %>%
> + mutate(date_downloaded = S .... [TRUNCATED]
> [2018-01-09 23:17:15] [info] asio listen error: system:48 (Address already in use)
Yes it works -- There were a few warnings we could look into:
5: In bind_rows_(x, .id) : Unequal factor levels: coercing to character
8: In bind_rows_(x, .id) : Unequal factor levels: coercing to character
...but it seems to have worked brilliantly! Fantastic!
It took a long time, but I'm not sure how to check the timings (I'm new to R). When we share it publicly we should add some notes about how long it might take, and perhaps some tips on how to run it quicker for those who just want a (random?) selection of pages fairly quickly.
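One possible way to get a quick random selection of pages, sketched in base R. `sample_pages` is a hypothetical helper, and `all_pages` is a made-up stand-in for the real vector of page names (`fundraising_page_data$pageShortName` in the script).

```r
# Hypothetical helper: pick a random subset of page names for a quick trial run.
sample_pages <- function(page_names, n = 20, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)  # make the selection reproducible
  n <- min(n, length(page_names))     # never ask for more pages than exist
  page_names[sample.int(length(page_names), n)]
}

# Made-up stand-in for fundraising_page_data$pageShortName
all_pages <- paste0("page-", 1:500)
quick_pages <- sample_pages(all_pages, n = 20, seed = 1)
```

Passing `quick_pages` instead of the full vector to the download step should cut the run time roughly in proportion to the number of pages dropped, assuming each page takes a similar time to fetch.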
Ah great! And yes some timings would be good. It should be pretty easy to get it to print some estimates of how long it's going to take. Do you know roughly how long it took on your machine?
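A rough sketch of how an estimate could be produced: time the first few calls with `system.time()` and extrapolate. `estimate_runtime` and `slow_fetch` are hypothetical names for illustration, and the estimate assumes every call takes about as long as the trial calls, which may not hold for pages with many donations.

```r
# Hypothetical helper: time the first few calls and extrapolate to the full run.
estimate_runtime <- function(page_names, fetch_fun, n_trial = 3) {
  n_trial <- min(n_trial, length(page_names))
  elapsed <- system.time(
    for (name in page_names[seq_len(n_trial)]) fetch_fun(name)
  )[["elapsed"]]
  per_call <- elapsed / n_trial
  # Rough estimate only: assumes all calls take about as long as the trial calls
  list(per_call_secs = per_call,
       total_mins = per_call * length(page_names) / 60)
}

# Stand-in for an API call: sleeps briefly instead of hitting the network
slow_fetch <- function(name) {
  Sys.sleep(0.05)
  name
}

est <- estimate_runtime(paste0("page-", 1:100), slow_fetch)
message(sprintf("Estimated total: about %.1f minutes", est$total_mins))
```

To time a whole run after the fact, wrapping the full download step in `system.time()` gives the elapsed seconds directly.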
Regarding the warnings: they aren't an issue. The code produces one small table per API call, and sometimes it decides (usually incorrectly) that one or more of the columns in these tables is categorical, i.e. a factor. The bind_rows function stacks these tables together and complains if the factor levels in each table are different. To deal with the mismatched levels it (correctly) converts those columns to character columns. I'll look at stopping these warnings, or just documenting why they occur.
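To illustrate one way the warnings could be avoided, the sketch below converts factor columns to character before stacking, so there are no mismatched levels left to complain about. It uses base R `rbind` as a stand-in for `reduce(bind_rows)`, the data are made up, and `factors_to_character` is a hypothetical helper, not something in the script.

```r
# Two small tables, as might come back from separate API calls (made-up data).
a <- data.frame(status = factor(c("Accepted", "Pending")), amount = c(5, 10))
b <- data.frame(status = factor("Refunded"), amount = 20)

# Hypothetical helper: turn every factor column into a character column.
factors_to_character <- function(df) {
  is_fac <- vapply(df, is.factor, logical(1))
  df[is_fac] <- lapply(df[is_fac], as.character)
  df
}

# Convert first, then stack; base-R stand-in for reduce(bind_rows)
tables <- lapply(list(a, b), factors_to_character)
combined <- do.call(rbind, tables)
str(combined)
```

An alternative would be to stop the columns becoming factors in the first place, for example by passing `stringsAsFactors = FALSE` wherever the per-call tables are built, assuming they are created with `data.frame()`.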
If I am reading the file modification times correctly, it may have taken 33 minutes (home BT connection, run at midnight).
Script hangs up at (console output below; it freezes there for several minutes):
Ideas?