crackernutter / EsriRESTScraper

A Python class that scrapes ESRI Rest Endpoints and exports data to a geodatabase
MIT License

Suggestion #2

Closed mmdolbow closed 8 years ago

mmdolbow commented 8 years ago

Hi, this is great. I have one suggestion, and sorry, I don't have time to put it into a pull request. I frequently find endpoints with some garbage in them: records posted with no geometry. I propose skipping those records entirely. Here's how I modded your script to do that, after line 227:

Original: geom = self.__getGeometry(feature['geometry'])

            try:
                geom = self.__getGeometry(feature['geometry'])
                attributes = []
                attributes.append(geom)
                for field in self.updateFields:
                    if 'shape' not in field['name'].lower():
                        if 'date' in field['type'].lower():
                            try:
                                if len(str(feature['attributes'][field['name']])) == 13:
                                    attributes.append(datetime.datetime.fromtimestamp(feature['attributes'][field['name']] / 1000))
                                else:
                                    attributes.append(datetime.datetime.fromtimestamp(feature['attributes'][field['name']]))
                            except (ValueError, TypeError):
                                attributes.append(None)
                        else:
                            # Very large integers raise "OverflowError: Python int too large to convert to C long", so cast them to float
                            # Some services omit fields from their results, hence the try/except block
                            try:
                                newAttribute = feature['attributes'][field['name']]
                                if type(newAttribute) is long:
                                    if type(int(newAttribute)) is long:
                                        attributes.append(float(newAttribute))
                                    else:
                                        attributes.append(newAttribute)
                                else:
                                    attributes.append(newAttribute)
                            except KeyError:
                                attributes.append(None)
                cursor.insertRow(attributes)
            except Exception:
                # most likely no geometry in this record
                print("no geometry in record, skipping")
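
The try/except above treats any failure while building a row as "no geometry". A narrower alternative is to check for geometry before building the row at all. The following is only a minimal sketch of that idea, not the change that was made in the repo: `has_geometry`, `split_by_geometry` and the sample records are hypothetical names and data, and it assumes each feature is a plain dict as returned by a REST query.

```python
def has_geometry(feature):
    """Return True when a feature dict carries a usable 'geometry' entry.

    Covers a missing key, an explicit null and an empty dict.
    """
    return bool(feature.get("geometry"))


def split_by_geometry(features):
    """Return (features that have geometry, count of features skipped)."""
    with_geom = [f for f in features if has_geometry(f)]
    skipped = len(features) - len(with_geom)
    return with_geom, skipped


# Hypothetical sample records shaped like an ESRI REST query response
sample = [
    {"attributes": {"OBJECTID": 1}, "geometry": {"x": -93.1, "y": 44.9}},
    {"attributes": {"OBJECTID": 2}},                    # no geometry key
    {"attributes": {"OBJECTID": 3}, "geometry": None},  # null geometry
]

kept, skipped = split_by_geometry(sample)
print("kept %d, skipped %d" % (len(kept), skipped))
```

Filtering up front keeps unrelated errors (a bad field value, a failed insert) from being reported as missing geometry.
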
crackernutter commented 8 years ago

Mike - glad you find this library useful! It's funny, I just came across this problem a week ago when I was investigating a new data source. Anyway, I made changes to handle null point geometry locally but didn't update the repo until this morning. I didn't exactly implement the solution you suggested, although that would work too. Everything should be working fine now. Thanks for the heads up!

mmdolbow commented 8 years ago

Awesome, I'm finding it very useful. I've come across another problem - REST endpoints where the field names have a "." in them, which get translated into underscores in the local FGDB, leading to a schema mismatch. I think I have a workaround and if I succeed, I'll do a PR. Awesome work, thank you for putting this together!
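
One way the field-name mismatch could be handled is with a lookup from the REST field names to the names they end up with locally. This is only a rough sketch and not the workaround mentioned above, which isn't shown here: `local_field_name`, `build_field_map` and the sample field names are hypothetical, and it assumes the only change is "." becoming "_".

```python
def local_field_name(rest_name):
    """Map a REST field name to its assumed local name (dots become underscores)."""
    return rest_name.replace(".", "_")


def build_field_map(rest_names):
    """Return a dict of REST field name -> assumed local FGDB field name."""
    return {name: local_field_name(name) for name in rest_names}


# Hypothetical field names, e.g. from a service that publishes a joined table
rest_fields = ["OBJECTID", "parcels.OWNER", "parcels.ACRES"]
field_map = build_field_map(rest_fields)

for rest_name in rest_fields:
    print("%s -> %s" % (rest_name, field_map[rest_name]))
```

With a map like this, attribute values can still be read from the response by their REST names while the insert uses the local names.
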