Open trillevine opened 9 years ago
Hey Trill,
The only issue I ever noticed was a month or so ago when I was getting hit with server errors. There was no indication that ArcGIS Online was experiencing any issues, but I was unable to replicate anything, even when going through my service directory and doing it directly in ArcGIS Online. I gave it another go the next day and everything went as normal.
I didn't see any error messages in the script you posted. Can you provide some of the messages you got when running your version and my version?
On Mon, Jul 13, 2015 at 5:30 AM, trillevine notifications@github.com wrote:
Hi Brian,
I've been implementing a variation of restservices.py for a while now without problems. I split the service replication and attachment downloading into two separate scripts (long story), but recently I've experienced problems with the service replication script. I just checked your script again and tried using it for replicating a service, but am having the same issues I had with mine. Something isn't working correctly with the JSON response, and I'm not sure what the problem is...I built a try-catch block into the get_response def to catch JSON response problems, and I'm getting an endless loop of exceptions. Have you experienced anything like this recently with this script? Please let me know if you have a minute -- I've pasted my variation in below. Thanks!
- Trill
Code:
import json, urllib, urllib2, urlparse
import os, shutil
import sys  # needed for sys.argv in get_response
import time, datetime
from datetime import date, timedelta
import csv
import logging

# ---- values to be changed accordingly ----- #
#
#############################################

today = date.today()
todayString = today.strftime("%Y.%m.%d")

# ---- Change service name_ as necessary ----
serviceTodayString = 'Forstmobil_' + todayString

# ----- Change destination as necessary -----
# change to directory where service_date folder will be created
os.chdir(r'G:\Dvkoord\GIS\TEMP\Tle\Scripts')
if os.path.exists(serviceTodayString):
    shutil.rmtree(serviceTodayString)

# make folder for today's download
os.mkdir(serviceTodayString)
yesterday = date.today() - timedelta(1)
yesterdayString = yesterday.strftime('%Y.%m.%d')

# ---- Change absolute path as necessary ----
serviceYesterdayString = r'G:\Dvkoord\GIS\TEMP\Tle\Scripts\Forstmobil_' + yesterdayString

# logging
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)

# create a file handler
handler = logging.FileHandler('service_forstmobil.log')
handler.setLevel(logging.INFO)

# create a logging format
formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')
handler.setFormatter(formatter)

# add the handlers to the logger
logger.addHandler(handler)
# REST FUNCTIONS

def check_service(service_url):
    url_parts = {"fs_url": None, "layer_url": None, "layer_id": None}
    components = os.path.split(service_url)
    if service_url == None:
        return True
    elif (components[1].isdigit() and
          os.path.split(components[0])[1] == "FeatureServer"):
        url_parts["fs_url"] = components[0]
        url_parts["layer_url"] = service_url
        url_parts["layer_id"] = str(components[1])
        return url_parts
    elif components[1] == "FeatureServer":
        url_parts["fs_url"] = service_url
        return url_parts
    else:
        return False

def get_service_name(service_url):
    components = os.path.split(service_url)
    if components[1] == "FeatureServer":
        return os.path.split(components[0])[1]
    else:
        return get_service_name(components[0])

def get_response(url, query='', get_json=True):
    opener = urllib2.build_opener(
        urllib2.HTTPHandler(),
        urllib2.HTTPSHandler(),
        urllib2.ProxyHandler(
            {'https': 'http://' + sys.argv[1] + ':' + sys.argv[2] + '@ip:port',
             'http': 'http://' + sys.argv[1] + ':' + sys.argv[2] + '@ip:port'}
        ))
    urllib2.install_opener(opener)
    encoded = urllib.urlencode(query)
    request = urllib2.Request(url, encoded)
    if get_json:
        try:
            json_response = json.loads(urllib2.urlopen(request).read())
            return json_response
        except:
            print 'exception'
            logger.info('JSON Error')
            RUN = App(INPUT_URL, TOKEN, DEST)
            RUN.pull_replica(REPLICA)
    return urllib2.urlopen(request).read()

def add_path(url, *args):
    for arg in args:
        url = urlparse.urljoin(url + "/", str(arg))
    return url

def login(username, password):
    CREDENTIALS['username'] = username
    CREDENTIALS['password'] = password
    response = get_response(TOKEN_URL, CREDENTIALS)
    return response['token']

def get_service_info(input_url, token):
    return get_response(input_url, {'f': 'json', 'token': token})
# QUERIES

CREDENTIALS = {
    'username': '',
    'password': '',
    'expiration': '300',
    'client': 'referer',
    'referer': 'www.arcgis.com',
    'f': 'json'
}

TOKEN_URL = "https://www.arcgis.com/sharing/rest/generateToken"

ATTACHMENTS = {
    'where': '1=1',
    'token': '',
    'f': 'json',
    'returnGeometry': 'false'
}

REPLICA = {
    "geometry": '',
    "geometryType": "esriGeometryEnvelope",
    "inSR": '',
    "layerQueries": '',
    "layers": '0',
    "replicaName": "read_only_rep",
    "returnAttachments": 'true',
    "returnAttachmentsDataByUrl": 'false',
    "transportType": "esriTransportTypeEmbedded",
    "async": 'false',
    "syncModel": "none",
    "dataFormat": "filegdb",
    "token": '',
    "replicaOptions": '',
    "f": "json"
}

UPDATES = {
    "f": "json",
    "features": '',
    "rollbackOnFailure": True
}
class App(object):
    ''' Class with methods to perform tasks with ESRI's REST API '''

    def __init__(self, input_url, token, destination):
        self.input_url = input_url
        self.token = token
        self.destination = destination
        self.layer_url = None
        self.layer_id = None
        self.fs_url = self.check_input_url()

    def check_input_url(self):
        url_parts = check_service(self.input_url)
        self.layer_url = url_parts["layer_url"]
        self.layer_id = url_parts["layer_id"]
        if not self.layer_url:
            self.layer_url = add_path(url_parts["fs_url"], "0")
        return url_parts["fs_url"]

    def get_root_name(self):
        return time.strftime("%Y_%m_%d_") + get_service_name(self.fs_url)

    def replicate(self, query):
        replica_url = add_path(self.fs_url, 'createReplica')
        zip_url = get_response(replica_url, query)['responseUrl']
        zip_file = get_response(zip_url, get_json=False)
        pull_to_local(zip_file, self.get_root_name(), self.destination, 'zip')

    def pull_replica(self, query):
        query['token'] = self.token
        layers = get_service_info(self.fs_url, self.token)['layers']
        if self.layer_id:
            query['layers'] = self.layer_id
            self.replicate(query)
        else:
            query['layers'] = [layer['id'] for layer in layers]
            self.replicate(query)

if __name__ == "__main__":

    ## Required
    TOKEN = login("export_fluggs_mobil", "forstways1")
    INPUT_URL = "http://services1.arcgis.com/0cr41EdkajvOA232/ArcGIS/rest/services/Forstmobil/FeatureServer"
    ## Required for Pull Attachments and Pull Replica
    DEST = r"G:\Dvkoord\GIS\TEMP\Tle\Scripts" + "\\" + serviceTodayString
    ## Required for Update Service
    UPDATE_TABLE = "<table to update service>"
    ## Optional field to label folders by attributes for Pull Attachments
    FIELD = ""
    ## To return attachments in the geodatabase for replicate uncomment the line as follows:
    ## REPLICA[returnAttachments] = true
    RUN = App(INPUT_URL, TOKEN, DEST)
    RUN.pull_replica(REPLICA)
— Reply to this email directly or view it on GitHub https://github.com/bgeomapping/arcgis-rest-toolbox/issues/8.
Hi Brian,
Thanks for getting back to me. Here's the traceback when I comment out my try-catch block and use your original code in get_response:
Message File Name Line Position
Traceback
<module> G:\Dvkoord\GIS\TEMP\Tle\Scripts\Neu\arcgis-rest-toolbox-master\restservices.py 204
pull_replica G:\Dvkoord\GIS\TEMP\Tle\Scripts\Neu\arcgis-rest-toolbox-master\restservices.py 194
replicate G:\Dvkoord\GIS\TEMP\Tle\Scripts\Neu\arcgis-rest-toolbox-master\restservices.py 181
get_response G:\Dvkoord\GIS\TEMP\Tle\Scripts\Neu\arcgis-rest-toolbox-master\restservices.py 86
urlopen C:\Python26\ArcGIS10.0\lib\urllib2.py 126
open C:\Python26\ArcGIS10.0\lib\urllib2.py 397
http_response C:\Python26\ArcGIS10.0\lib\urllib2.py 510
error C:\Python26\ArcGIS10.0\lib\urllib2.py 435
_call_chain C:\Python26\ArcGIS10.0\lib\urllib2.py 369
http_error_default C:\Python26\ArcGIS10.0\lib\urllib2.py 518
HTTPError: HTTP Error 500: Internal Server Error
The reason I believe this is a JSON error can be found in this thread, which I started on Stack Overflow:
I just tried manually creating a replica via the web interface, and that's not working either...if the script is working for you, then my guess is that there's something up with the server that our stuff is hosted on.
Thanks again...
Trill
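One way to tell a server-side failure apart from a malformed JSON body is to catch the two cases separately. The traceback above shows the HTTPError being raised inside urlopen, which happens before any JSON parsing, so a bare except around both steps would report it as a JSON problem. The following is only a sketch: fetch_json and its open_url argument are hypothetical names, not part of restservices.py, and open_url stands in for whatever function actually performs the request and returns the body.

```python
import json

try:
    from urllib2 import HTTPError  # Python 2, as used in this thread
except ImportError:
    from urllib.error import HTTPError  # Python 3


def fetch_json(open_url, url):
    """Return ("ok", data), ("http_error", code) or ("bad_json", body).

    open_url is a hypothetical callable that takes a URL and returns the
    response body as a string, raising HTTPError on a 4xx/5xx reply.
    """
    try:
        body = open_url(url)
    except HTTPError as err:
        # The server rejected the request (e.g. 500); nothing was parsed.
        return ("http_error", err.code)
    try:
        return ("ok", json.loads(body))
    except ValueError:
        # The server answered, but the body was not valid JSON.
        return ("bad_json", body)
```

With this split, an HTTP 500 would surface as ("http_error", 500) instead of triggering a retry that is likely to hit the same server error again.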
It looks like an error on ArcGIS Online's end. I just ran the script a few times and experienced no issues. I would recommend doing some basic investigation such as:
Let me know what you find out.
Hi Brian,
So I've been going back and forth with Esri tech support on this, and it turns out that there's a bug with replicating larger geodatabases with attachments. As a workaround, they suggested either setting returnAttachments in REPLICA to false (which isn't really an option for obvious reasons) or setting the async property to true. The issue I'm now having stems from that new async setting: responseUrl is no longer a returned parameter, so I get a key error when running the script. When I go through the REST web interface and set async to true and output to JSON, it gives me a parameter called statusUrl, which looks like this:
{
"statusUrl" : "http://services1.arcgis.com/0cr41EdkajvOA232/ArcGIS/rest/services/Forstmobil/FeatureServer/jobs/cbd843bf-a5da-40bb-9e28-cc5e9a14c92e"
}
According to the docs re: asynchronous operations (http://resources.arcgis.com/en/help/arcgis-rest-api/index.html#//02r3000000rt000000), once the replication is complete, it will give me a parameter called resultUrl, which I'm assuming I place in the definition of zip_url, like so:
zip_url = get_response(replica_url, query)['resultUrl']
When I do that, however, I still get a key error, so something's not working correctly. Based on the docs, I assume I have to build in some functionality to check the status of the replication process via statusUrl and then grab the resultUrl parameter when it's done, but I'm not sure how to do that. There's some more information here (http://resources.arcgis.com/en/help/arcgis-rest-api/index.html#/Create_Replica/02r3000000rp000000/), but I can't really make sense of it. Do you have any ideas? Any feedback is greatly appreciated as always. Thanks!
Trill
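The key error described above comes from indexing the createReplica response with a key that is only present in one mode. As a rough sketch based only on what this thread reports (responseUrl for a synchronous call, statusUrl for an asynchronous one), the response could be inspected before indexing. The function name replica_location is hypothetical and not part of restservices.py.

```python
def replica_location(response):
    """Classify a parsed createReplica response.

    Returns ("sync", url) when the zip location is returned directly,
    or ("async", url) when only a job status URL is returned.
    """
    if "responseUrl" in response:
        return ("sync", response["responseUrl"])
    if "statusUrl" in response:
        return ("async", response["statusUrl"])
    # Anything else (for example an error payload) is reported explicitly
    # instead of surfacing later as a bare KeyError.
    raise ValueError("unexpected createReplica response keys: %s"
                     % sorted(response.keys()))
```

In the async case the returned URL points at the job, not at the zip file, so it still has to be queried until the job reports a result.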
Hey Trill,
I'm slammed right now with work/personal life, but I will try to take a look later this week, on Thursday or Friday, and see what we can do to modify the toolbox. Essentially we will need to find the link to the zip file that is provided via the status URL and then copy that down. Modifications will need to be made to the toolbox to add a checkbox for whether to run async or not, and then some alterations to route that into the tool methods.
-Brian
On Mon, Aug 24, 2015 at 8:20 AM, trillevine notifications@github.com wrote:
Hi Brian,
So I've been going back and forth with Esri tech support on this, and it turns out that there's a bug with replicating larger geodatabases with attachments. As a workaround, they suggested either setting returnAttachments in REPLICA to false (which isn't really an option for obvious reasons) or setting the async property to true. The issue I'm now having stems from that new async setting: responseUrl is no longer a returned parameter, so I get a key error when running the script. When I go through the REST web interface and set async to true and output to JSON, it gives me a parameter called statusUrl, which looks like this:
[image: statusurl] https://cloud.githubusercontent.com/assets/7161139/9439810/8f9b537c-4a6a-11e5-8b92-5b8f97f4d0f2.JPG
So I'm assuming that I need to change the zip_url variable in the replicate def to:
zip_url = get_response(replica_url, query)['statusUrl']
or something along those lines. When I do that, however, I get an empty zip folder, so nothing is being downloaded. I've been combing over the API docs and am at a bit of a loss as to how else I would need to modify the replicate def to get this working with async = true. Do you have any ideas? Any feedback is obviously greatly appreciated. Thanks!
Trill
Hi Brian,
Sure, whenever you get to it. I'm actually just working with the script (restservices.py), maybe that will be simpler to modify than the toolbox. Thanks for getting back to me.
Trill
Hey Trill,
I've got a working implementation when I set the query to "Async" = True. See below:
def replicate(self, query):
    replica_url = add_path(self.fs_url, 'createReplica')
    if REPLICA["async"]:
        status_url = get_response(replica_url, query)["statusUrl"]
        query = {'f':'json', 'token':self.token}
        complete = get_response(status_url, query)['status']
        while complete != 'Completed':
            time.sleep(10)
            complete = get_response(status_url, query)['status']
        zip_url = get_response(status_url, query)['resultUrl']
    else:
        zip_url = get_response(replica_url, query)['responseUrl']
    zip_file = get_response(zip_url, get_json=False)
    pull_to_local(zip_file, self.get_root_name(), self.destination, 'zip')
This is a workable solution but by no means comprehensive (if the job fails, the loop never exits). The asynchronous operations section of the REST API docs (http://resources.arcgis.com/en/help/arcgis-rest-api/index.html#/Asynchronous_operations/02r3000000rt000000/) describes some logic that can be implemented to handle the various status responses. When I get a chance to look through them I will flesh this out and post it on GitHub. Hope this works for you in the meantime.
-Brian
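The infinite-loop caveat could be addressed by treating failure statuses as terminal too. The sketch below is an illustration only: wait_for_result_url, ReplicaJobError and FAILED_STATES are hypothetical names, and the failure status strings are an assumption, since only 'Completed' is confirmed in this thread. They should be checked against the asynchronous operations documentation before use.

```python
import time

# Assumed terminal failure states; only "Completed" appears in this thread.
FAILED_STATES = ("Failed", "CompletedWithErrors")


class ReplicaJobError(Exception):
    """Raised when the replica job ends in a failure state."""


def wait_for_result_url(fetch_status, interval=10, sleep=time.sleep):
    """Poll until the job completes, then return its resultUrl.

    fetch_status is a hypothetical zero-argument callable returning the
    parsed status JSON, e.g. lambda: get_response(status_url, query).
    """
    while True:
        job = fetch_status()
        status = job.get("status")
        if status == "Completed":
            return job["resultUrl"]
        if status in FAILED_STATES:
            # Stop polling instead of looping forever on a dead job.
            raise ReplicaJobError("replica job ended with status %r" % status)
        sleep(interval)
```

Separately, the REPLICA dict pasted earlier in this thread stores the flag as the string 'false', which is truthy in Python, so a check like `if REPLICA["async"]` takes the async branch for both 'true' and 'false'. Comparing against 'true' explicitly may be safer.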
Hi Brian,
Awesome, thanks for getting to it so fast. I'll check it out tomorrow and let you know how it works...I'll also try to beef it up a bit and let you know what I can add.
Trill
Hi Brian,
This works for me, thanks again. I'll spend some time pimping this out and forward my changes.
Trill
Hi Brian,
I added some extra if else clauses in here, just in case it's a really large service and it takes a while for the zip url to reach completed status:
def replicate(self, query):
    replica_url = add_path(self.fs_url, 'createReplica')
    if REPLICA["async"]:
        status_url = get_response(replica_url, query)["statusUrl"]
        query = {'f':'json', 'token':self.token}
        status = get_response(status_url, query)['status']
        if status != 'Completed':
            time.sleep(120)
            status = get_response(status_url, query)['status']
            if status != 'Completed':
                time.sleep(120)
                zip_url = get_response(status_url, query)['resultUrl']
            else:
                zip_url = get_response(status_url, query)['resultUrl']
        else:
            zip_url = get_response(status_url, query)['resultUrl']
    else:
        zip_url = get_response(replica_url, query)['responseUrl']
    zip_file = get_response(zip_url, get_json=False)
    pull_to_local(zip_file, self.get_root_name(), self.destination, 'zip')
I'm not sure how to prevent the infinite loop part, though.
I'll take a look sometime this week. The while loop will work; it just needs to be triggered off a change in state (i.e. the response is no longer pending) and then needs some logic to handle the various status messages (completed, failed and so forth). Pinging for a response every few minutes is not too taxing for anyone, and we could add a trigger to time out after some amount of time, although you have to consider that some of these services could be serious in size with attachments.
As I understand it, dealing with changes in state requires working with the threading module. I played around with it, but couldn't get it working....but yeah, having the script respond to a change in state from processing to completed would be ideal.
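For what it's worth, reacting to a change from processing to completed does not appear to need the threading module: a plain loop that re-queries the status URL and gives up after a deadline seems sufficient. This is a minimal sketch under that assumption. poll_with_timeout and ReplicaTimeout are hypothetical names, and the clock and sleep parameters exist only so the loop can be exercised without real waiting.

```python
import time


class ReplicaTimeout(Exception):
    """Raised when the job has not completed before the deadline."""


def poll_with_timeout(fetch_status, timeout=3600, interval=120,
                      clock=time.time, sleep=time.sleep):
    """Return the job's resultUrl, or raise ReplicaTimeout.

    fetch_status is a hypothetical zero-argument callable returning the
    parsed status JSON, e.g. lambda: get_response(status_url, query).
    timeout and interval are in seconds.
    """
    deadline = clock() + timeout
    while True:
        job = fetch_status()
        if job.get("status") == "Completed":
            return job["resultUrl"]
        if clock() >= deadline:
            raise ReplicaTimeout("no Completed status after %s seconds" % timeout)
        sleep(interval)
```

For a large service with attachments the timeout may need to be generous; the one-hour default here is arbitrary.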