Closed rushgeo closed 2 years ago
We've gotten a number of error reports about this in the past week; my best guess is that the Census website is undergoing some maintenance or is having some issues. @loganpowell - do you have any thoughts on @rushgeo's suggestion?
Hi friends. If you're making a lot of calls to any Census address, there's a default policy that will block your IP. If you've been able to make successful GETs and then, all of a sudden, start getting errors and can't get through again after that first error, this is probably what's happening to you. I sometimes have to do heavy pulls using wget. When I do, I usually run them from a "throw-away" IP address (via VPN) and do everything in one sitting. Our Akamai caching layer will institute the block after some unknown time (within hours).
This doesn't sound like the scenario I'm experiencing. I'm having this happen from my first attempt on a new machine, and I'm also having intermittent success after previously having errors on another machine.
In that case, it's unrelated to the issue referenced. What are the addresses tigris accesses?
I've mostly been downloading tracts, which for 2010 come from https://www2.census.gov/geo/tiger/TIGER2010/TRACT/2010/ if the cartographic boundary files aren't requested instead. The code that builds the URL is here.
Sorry for the delayed response. Are you still experiencing this issue?
Not sure if this is the same problem, but I have recently had trouble downloading county subdivisions. The following fails:
ma_towns_sf <- county_subdivisions(state = "MA", cb = TRUE)
I get the following message:
Using FIPS code '25' for state 'MA'
error 1 in extracting from zip file
Cannot open layer cb_2019_25_cousub_500k
Error in CPL_read_ogr(dsn, layer, query, as.character(options), quiet, : Opening layer failed.
No problem accessing states. Just county subdivisions and smaller geographies, and sometimes it works. Using tigris version 1.4
@profLuna I just tested - it is working for me on my local version of R. I've also tested on my server version of R which took a little while to connect to the Census website but is working too. Are you running a server version of R? Downloads seem to fail more frequently there. I'd also always recommend using options(tigris_use_cache = TRUE) to build a local cache rather than relying on data downloads.
@walkerke Thanks for the quick response. I am running a local version of R. Tried doing with and without a VPN, but same response. Definitely will set local cache to TRUE, although I'm stuck at the moment. Still weird because states and tracts work without a problem. It just seems to be county_subdivisions.
Hi, I can confirm this same behavior and the issue is ongoing.
Specifically, the link specified, for instance, by block_groups is valid for downloading when pasted into a browser. However, from the R environment it fails to download.
This one's a little tricky to test as I can't reproduce the error; however I'm wondering if heavy use of tigris temporarily clogs certain datasets on the Census website. For example, if I run:
> httr:::default_ua()
[1] "libcurl/7.58.0 r-curl/4.3.1 httr/1.4.2"
It's possible then that many R users are sending the same user agent to the Census website and that the website is intermittently blocking it, given that this user agent will be identical across tigris users with those versions. I'll do some more research on this.
Can you email our admin, @.***, about this? Give her as much detail as possible.
I am having the same issue. I can get states and block groups, but zctas fail:
zctas()
Previous download failed. Re-download attempt 1 of 3...
Previous download failed. Re-download attempt 2 of 3...
Previous download failed. Re-download attempt 3 of 3...
Error: Download failed; check your internet connection or the status of the Census Bureau website
at http://www2.census.gov/geo/tiger/.
It's been several months since I used tigris. At first I got the following:
ZCTAs can take several minutes to download. To cache the data and avoid re-downloading in future R sessions, set `options(tigris_use_cache = TRUE)`
Error: Cannot open "/private/var/folders/5_/l71sk6kn29z17n011g8kld5m0000gp/T/Rtmp8guaZD"; The source could be corrupt or not supported. See `st_drivers()` for a list of supported formats.
In addition: Warning message:
In unzip(file_loc, exdir = tmp) : error 1 in extracting from zip file
I then removed tigris and reinstalled from github, and now get the download error.
EDIT:
I tried to get zctas again just a minute after posting this, and it worked.
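The "error 1 in extracting from zip file" warning above is consistent with a download that did not produce a usable archive. As a rough sketch only, a file can be checked for the standard zip signature before it is unzipped. The helper name looks_like_zip is hypothetical and not part of tigris, and the HTML written to the temporary file is a made-up stand-in for a failed download.

```r
# Hypothetical helper (not part of tigris): TRUE if `path` starts with the
# zip local-file-header signature "PK\x03\x04". Empty archives use a
# different signature and would be reported as FALSE.
looks_like_zip <- function(path) {
  if (!file.exists(path) || file.size(path) < 4) {
    return(FALSE)
  }
  magic <- readBin(path, what = "raw", n = 4)
  identical(magic, as.raw(c(0x50, 0x4b, 0x03, 0x04)))
}

# Stand-in for a failed download: an HTML page saved under a .zip name.
bad_download <- tempfile(fileext = ".zip")
writeLines("<html><body>error</body></html>", bad_download)

if (!looks_like_zip(bad_download)) {
  # Remove the bad file so that a later attempt starts clean.
  unlink(bad_download)
}
```

A check like this only tells you the file is not a zip; it does not explain why the download failed.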
I am dealing with the same zctas error mentioned above:
zctas <- tigris::zctas()
# error: Download failed; check your internet connection or the status of the Census Bureau website
Previous download failed. Re-download attempt 1 of 3...
Previous download failed. Re-download attempt 2 of 3...
Previous download failed. Re-download attempt 3 of 3...
Error: Download failed; check your internet connection or the status of the Census Bureau website
at http://www2.census.gov/geo/tiger/.
I've been experiencing it for about 24 hours, but am not sure if it takes more time for someone to be unblocked if they've made multiple requests. Like jzadra pointed out, zctas seems to be the only geometry affected by this error, but again, I'm not sure if that's because it is the geometry I've been querying most frequently.
I just ran zctas() successfully. I would strongly recommend using shapefile caching with options(tigris_use_cache = TRUE) if you are frequently requesting ZCTAs. This will store the shapefile on your computer and use the local cache instead of downloading from the Census website each time and risking this issue.
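A minimal sketch of how the caching option might sit at the top of a script. The call to tigris::zctas() is left commented out because it needs the tigris package and network access, and the variable name zcta_shapes is only illustrative.

```r
# Turn on shapefile caching for the current R session. Adding the same line
# to ~/.Rprofile would make it the default for future sessions.
options(tigris_use_cache = TRUE)

# Confirm the setting before starting a long-running script.
stopifnot(isTRUE(getOption("tigris_use_cache")))

# With the option set, the first call downloads the shapefile and later
# calls should read the local copy instead.
# zcta_shapes <- tigris::zctas()
```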
I'm definitely going to use options(tigris_use_cache = TRUE) in the future, but unfortunately, I didn't use that option when I was first scripting. Do you happen to know how long it usually takes for the issue to resolve itself?
I'm having intermittent problems downloading through tigris. Sometimes all three download attempts fail, and other times they succeed. When they fail, the output file in the cache directory will either be zero bytes, or a very short HTML error.
Inspired by the discussion here, I added a browser user agent to the downloads. Specifically, I added:
user_agent("Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:88.0) Gecko/20100101 Firefox/88.0")
in every GET() call in tigris:::load_tiger.
This seems to work every time, but I suppose I can't be 100% certain the user agent is doing the trick when there is still intermittent success without the patch.
Still, I wonder if it's worth either:
tigris
either all of the time, or after the first failed download attempt.
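The "after the first failed download attempt" idea could look something like the sketch below. This is not how tigris is implemented: fetch_httr and download_with_ua_fallback are hypothetical names, httr is assumed to be installed, and there is no guarantee that changing the user agent is what actually resolves the failures.

```r
# Browser user agent string quoted in the comment above.
browser_ua <- "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:88.0) Gecko/20100101 Firefox/88.0"

# Hypothetical fetcher built on httr. Writes the response body to `dest`
# and returns TRUE when the HTTP status is below 400.
fetch_httr <- function(url, dest, ua = NULL) {
  cfg <- if (is.null(ua)) httr::config() else httr::user_agent(ua)
  resp <- httr::GET(url, cfg, httr::write_disk(dest, overwrite = TRUE))
  !httr::http_error(resp)
}

# Hypothetical wrapper: the first attempt uses the default user agent,
# and every later attempt switches to the browser one.
download_with_ua_fallback <- function(url, dest, fetch = fetch_httr,
                                      max_tries = 3) {
  for (attempt in seq_len(max_tries)) {
    ua <- if (attempt == 1) NULL else browser_ua
    ok <- tryCatch(fetch(url, dest, ua), error = function(e) FALSE)
    if (isTRUE(ok)) {
      return(invisible(dest))
    }
  }
  stop("Download failed after ", max_tries, " attempts: ", url)
}
```

The fetch argument is there so the retry logic can be exercised with a stand-in function, without touching the Census website.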