yonghah / esri2sf

Scrape features from ArcGIS Server REST API and create simple features dataframe

esriUrl_isValid fails for ArcGIS Servers with Directory Browsing disabled and hidden feature services which require credentials #43

Closed farleyklotz closed 2 years ago

farleyklotz commented 2 years ago

Some server admins disable the ArcGIS Server Services Directory for a bit of security through obscurity and set credentials on services (and, if Services Directory is not disabled, may also hide folders from viewing unless the user has provided the proper credentials.)

esriUrl_isValid fails in this case with a 403 status. e.g.

Response [https://www.redacted_server_name.com/arcgis/rest/services/folder_name/service_name/FeatureServer/1]
  Date: 2022-01-21 05:10
  Status: 403
  Content-Type: text/html;charset=utf-8
  Size: 630 B
<html lang="en">
<head>
<title>
Error: Services Directory has been disabled.</title>
<link href="/arcgis/rest/static/main.css" rel="stylesheet" type="text/css"/>
</head>
<body>
<table width="100%" class="userTable">
<tr>
<td class="titlecell">ArcGIS REST Framework</td>

One solution: appending "?f=json" to the feature layer or feature service URL being checked returns a status_code of 200, and the body of the response is {'error': {'code': 499, 'message': 'Token Required', 'details': []}}. It also works if the URL is shortened back to the instance.

You would think that just adding the token (with paste0("?token=", token)) would solve this, but doing this still returns the 403 error.

The modification would be at the top of the function (can use a GET or POST, both seem to work fine against my server).

# Check that the URL succeeds (a URL without "?f=json" returns a 403 status for 'hidden' services/layers).
# A request that throws (for example, an unreachable host) is also treated as an error.
  urlError <- tryCatch({httr::http_error(httr::GET(paste0(url, '?f=json')))},
                        error = function(cond) {TRUE})
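
Since a "?f=json" request can come back with status 200 and still carry an error object in the body, a status check alone cannot tell "valid and open" from "valid but needs a token". Below is a minimal base-R sketch of that distinction. `classify_esri_reply` is a hypothetical helper, not part of esri2sf, and it assumes the JSON body has already been parsed into a list (for example with `httr::content(resp, as = "parsed")`). Only the 499 "Token Required" code comes from this thread; the other labels are illustrative.

```r
# Hypothetical helper (not part of esri2sf): interpret a "?f=json" reply.
# `status` is the HTTP status code; `body` is the parsed JSON body as a list.
classify_esri_reply <- function(status, body = NULL) {
  # A 4xx/5xx status is a plain HTTP failure (e.g. the 403 HTML page).
  if (status >= 400) return("http_error")

  # With f=json the server can answer 200 and put the error in the body.
  code <- if (is.list(body) && is.list(body$error)) body$error$code else NULL
  if (is.null(code)) return("ok")

  # 499 is the "Token Required" code reported in this issue.
  if (code == 499) return("token_required")
  "rest_error"
}

# The body shown in this issue, written as an R list:
token_body <- list(error = list(code = 499, message = "Token Required",
                                details = list()))
classify_esri_reply(200, token_body)
```

Under this sketch a "token_required" result would mean the URL points at a real, credentialed service, which is arguably different from an invalid URL.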

Results against my server, showing status code 200 at every level of the URL:

> httr::status_code(httr::GET(paste0(paste(myserver, 'arcgis/rest/services', folderName, serviceName, 'FeatureServer', '1', sep="/"), '?f=json')))
[1] 200
> httr::status_code(httr::GET(paste0(paste(myserver, 'arcgis/rest/services', folderName, serviceName, 'FeatureServer', sep="/"), '?f=json')))
[1] 200
> httr::status_code(httr::GET(paste0(paste(myserver, 'arcgis/rest/services', folderName, serviceName, sep="/"), '?f=json')))
[1] 200
> httr::status_code(httr::GET(paste0(paste(myserver, 'arcgis/rest/services', folderName, sep="/"), '?f=json')))
[1] 200
> httr::status_code(httr::GET(paste0(paste(myserver, 'arcgis/rest/services', sep="/"), '?f=json')))
[1] 200
> httr::status_code(httr::GET(paste0(paste(myserver, 'arcgis/rest', sep="/"), '?f=json')))
[1] 200
> httr::status_code(httr::GET(paste0(paste(myserver, 'arcgis', sep="/"), '?f=json')))
[1] 200
> httr::status_code(httr::GET(paste0(myserver, '?f=json')))
[1] 200
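
The repeated calls above walk the URL back one path segment at a time. As a sketch of how that list could be generated rather than typed out, here is a base-R helper. `url_ancestors` is a hypothetical name, and the server, folder, and service names are placeholders.

```r
# Hypothetical helper: return a URL followed by each of its parent paths,
# ending with the bare scheme://host.
url_ancestors <- function(url) {
  m <- regmatches(url, regexec("^(https?://[^/]+)(/.*)?$", url))[[1]]
  if (length(m) == 0) stop("not an http(s) URL: ", url)
  host <- m[2]

  # Split the path into non-empty segments.
  parts <- strsplit(sub("^/", "", m[3]), "/", fixed = TRUE)[[1]]
  parts <- parts[nzchar(parts)]

  # Longest path first, dropping one trailing segment each time.
  paths <- vapply(
    rev(seq_along(parts)),
    function(i) paste(c(host, parts[seq_len(i)]), collapse = "/"),
    character(1)
  )
  c(paths, host)
}

levels <- url_ancestors(
  "https://server.example.com/arcgis/rest/services/folder/service/FeatureServer/1"
)
length(levels)  # eight URLs, one per call in the console output above
```

Each element could then be checked with something like `httr::status_code(httr::GET(paste0(u, "?f=json")))`, which needs a reachable server and is therefore left out of the sketch.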

I have not checked this solution against Map Services or ArcGIS Online.

Note: I'm not an R programmer, and today is day 1 with building R packages... I'm a Python / GIS guy. We've been using esri2sf for the last couple of years to report from our ESRI feature services (although I usually connect to them in Python), and recently it broke, which led me to this solution. Our recent security hardening has caused us some pain with issues like this one.

jacpete commented 2 years ago

Hey @farleyklotz, just wanted to get on and let you know that I can confirm this is an issue and I had noticed it might be a problem a week or so ago in https://github.com/yonghah/esri2sf/issues/41#issuecomment-1010606527

Issue 2: Need to add token argument

I realized just recently that token arguments need to be added to basically every function in the package that pings an ESRI REST server, and this really proves the point. My problem is that I lack access to a server where I can test token verification. It is interesting that esriIndex can pick up a service that is locked behind authentication, though. This is part of an issue in the package, and I will create a separate issue to tackle it.

I think this hasn't been planned for in the package so far because I, and probably a good share of users, generally don't access credentialed servers, so the problem hasn't really shown up until recently. However, I do plan to work on adding this the next time I get a chance to work on the package. Thank you for all the great information on ways to handle this.
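
One way an optional token argument might be threaded through request URLs is sketched below in base R. `add_query` is a hypothetical helper, not an esri2sf or httr function, and the URL and token are placeholders. Whether a particular server accepts a token passed alongside `f=json` is an assumption that this thread does not confirm.

```r
# Hypothetical helper (not part of esri2sf): append named query parameters
# to a URL, skipping NULL values and using "&" if a query string exists.
add_query <- function(url, ...) {
  params <- list(...)
  params <- params[!vapply(params, is.null, logical(1))]
  if (length(params) == 0) return(url)

  # Percent-encode each value so tokens with reserved characters survive.
  values <- vapply(
    params,
    function(p) utils::URLencode(as.character(p), reserved = TRUE),
    character(1)
  )
  query <- paste0(names(params), "=", values, collapse = "&")

  sep <- if (grepl("?", url, fixed = TRUE)) "&" else "?"
  paste0(url, sep, query)
}

layer <- "https://server.example.com/arcgis/rest/services/folder/service/FeatureServer/1"
add_query(layer, f = "json", token = NULL)          # token omitted
add_query(layer, f = "json", token = "PLACEHOLDER") # token appended
```

Dropping `NULL` values lets callers pass `token = token` unconditionally, so functions that default to `token = NULL` would not need a separate code path.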

farleyklotz commented 2 years ago

Thanks @jacpete, if I get some more time to play with this I'll let you know. The rvest::html_element call will have the same challenges. Another solution would be to try the URL first and only run esriUrl_isValid on failure. That way, grabbing a token wouldn't require esriUrl_isValid to be run first.
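
The "try first, validate only on failure" idea could look roughly like the sketch below. `fetch_then_validate` is a hypothetical wrapper, and `fetch` and `validate` are stand-in functions supplied by the caller; in practice `validate` might be esriUrl_isValid, but that wiring is an assumption and is not shown here.

```r
# Hypothetical wrapper: run the real request first and only fall back to
# URL validation when that request throws an error.
fetch_then_validate <- function(url, fetch, validate) {
  tryCatch(
    fetch(url),
    error = function(cond) {
      # The request failed, so now check whether the URL itself is the problem.
      if (!isTRUE(validate(url))) {
        stop("URL failed validation: ", url, call. = FALSE)
      }
      # The URL looks valid, so surface the original failure instead.
      stop(cond)
    }
  )
}

# Stand-in functions for illustration only (no network access):
fake_fetch <- function(url) paste("features from", url)
never_called <- function(url) stop("validation should not run on success")
fetch_then_validate("https://server.example.com/layer", fake_fetch, never_called)
```

On the success path the validator is never called, which is the point of the suggestion: a working, token-protected URL is not rejected up front.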

jacpete commented 2 years ago

@farleyklotz I believe this was fixed with #40 and #47. Please feel free to reopen this if you are still having issues. I do not have access to a server with the HTML directory obfuscated to test this on, but I believe I understood your issue and added your suggested changes. Thanks for bringing this to our attention.