reyemtm / agol-cache

A NodeJS script to download all layers within a public or protected ArcGIS Online Feature or Map Service as GeoJSON.
https://www.npmjs.com/package/agol-cache
MIT License

Stream to output files #4

Closed jwoyame closed 2 years ago

jwoyame commented 2 years ago

Thanks for this project Malcolm. I actually found it before I realized it was you!

I was using this with a very large dataset (Ohio census blocks), with a quarter million polygon records. The size of the dataset was causing the final JSON.stringify to fail with the error Invalid string length, which seems to be related to the amount of memory available to the Node process.

I made an alteration to the code that uses Node's streams feature to write the JSON to the output file as it's being downloaded, instead of caching to memory and then stringifying and writing at the end.

Here is a test to show it now works:

// Load the modified library from the local checkout.
const cache = require('./lib/featureServiceToGeoJSON');

const url = 'https://services3.arcgis.com/gk1IzaxatNGyH1UK/ArcGIS/rest/services/Ohio_Districts/FeatureServer';
// Download every layer of the service into ./output, then report how many were written.
cache.featureServiceToGeoJSON(url, { folder: './output' }, (layers) => {
  console.log('done');
  console.log('output ' + layers.length + ' layers');
});
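
For readers who want to see the idea in isolation, here is a minimal sketch of writing a FeatureCollection incrementally instead of building one large string. It is not the code from this pull request: `createGeoJSONWriter` and its `pretty` argument are hypothetical names, and the only assumption about `stream` is that it has `write` and `end` methods, as the object returned by Node's `fs.createWriteStream` does.

```javascript
// Hypothetical helper (not part of agol-cache): wraps a writable stream and
// emits a GeoJSON FeatureCollection one feature at a time, so no single
// string ever has to hold the whole dataset.
function createGeoJSONWriter(stream, pretty = false) {
  let count = 0;
  // Open the collection and the features array up front.
  stream.write('{"type":"FeatureCollection","features":[');
  return {
    // Write one batch of features, e.g. the result of a single paged query.
    writeFeatures(features) {
      features.forEach((feature) => {
        const json = JSON.stringify(feature, null, pretty ? 2 : 0);
        // Every feature after the first is preceded by a comma.
        stream.write((count > 0 ? ',' : '') + (pretty ? '\n' : '') + json);
        count += 1;
      });
    },
    // Close the array and the collection, end the stream,
    // and return the number of features written.
    end(callback) {
      stream.end((pretty ? '\n' : '') + ']}', callback);
      return count;
    },
  };
}
```

With a real file this would be called as `createGeoJSONWriter(fs.createWriteStream('./output/layer.geojson'))`, where the path is only an example. The sketch ignores backpressure; a fuller version would probably pause when `write()` returns false and resume on the stream's `drain` event.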

I also made some other minor edits:

  • Make "Retry attempt X" text begin at 1 instead of 0.
  • The config documentation says that "debugging is now on be default", but the logic has debug as false by default, so I flipped the logic.
  • Add a pretty option that will make the output file human readable if desired (defaults to false).
  • Use .forEach instead of .map for looping over arrays.
  • The logger.error(error) calls were outputting "unknown", so I changed them to logger.error(error.toString()).

reyemtm commented 2 years ago

Looks good. Thanks for contributing. I haven't used this in a year or so but glad it is still of use.

reyemtm commented 2 years ago

That's awesome you found a use for this. I have since realized you can add the layer into QGIS and attempt to export it from there, but I have had mixed results.


reyemtm commented 2 years ago

Is there a way I can add you to my npm registry so you can publish a new update? Also, there are some critical updates that need to be incorporated - there are small but key...should be expanded to auto find - it appears there can be a gap of more than 1000 between one object ID and the next, and the app will just quit because it didn't find any features in that range - so you need to query ALL of the feature IDs, find the first and last ID, and force the app to continue downloading until it reaches the end.
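
The gap problem described above could be handled roughly as follows. This is a sketch under stated assumptions, not agol-cache code: `chunkObjectIds` and `batchToWhere` are hypothetical helpers, and the ID list is assumed to come from a layer query made with `returnIdsOnly=true`, which the ArcGIS REST API answers with an `objectIds` array.

```javascript
// Hypothetical helper: split the full list of object IDs into batches of at
// most pageSize IDs. Working from the real IDs, rather than stepping through
// fixed ID ranges, means a large gap between IDs never yields an empty page.
function chunkObjectIds(objectIds, pageSize = 1000) {
  // Sort a copy numerically so each batch covers a contiguous ID range.
  const sorted = [...objectIds].sort((a, b) => a - b);
  const batches = [];
  for (let i = 0; i < sorted.length; i += pageSize) {
    batches.push(sorted.slice(i, i + pageSize));
  }
  return batches;
}

// Hypothetical helper: build a where clause spanning the first and last ID
// of one batch. Because the batch was cut from the sorted full list, that
// range contains no more than pageSize features.
function batchToWhere(idField, batch) {
  return `${idField} >= ${batch[0]} AND ${idField} <= ${batch[batch.length - 1]}`;
}
```

Downloading would then loop over every batch until the last one is done, so the process ends only when the final ID has been fetched rather than at the first empty response.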

reyemtm commented 2 years ago

I'll hopefully be publishing a new version soon with the above changes...

jwoyame commented 2 years ago

I was thinking about the object IDs issue too, I can take a look at that. I think it should also check the limit in case someone configured the service to offer less than 1000 records at a time.

I'm etchjon on NPM.
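
As a sketch of the limit check: a layer's metadata JSON on an ArcGIS service normally carries a `maxRecordCount` property, so the page size could be read from there instead of assuming 1000. `resolvePageSize` is a hypothetical name, and the fallback of 1000 simply mirrors the number discussed in this thread.

```javascript
// Hypothetical helper: choose the page size from the layer metadata,
// falling back to a default when maxRecordCount is missing or unusable.
function resolvePageSize(layerInfo, fallback = 1000) {
  const max = Number(layerInfo && layerInfo.maxRecordCount);
  // Accept only positive integers; anything else uses the fallback.
  return Number.isInteger(max) && max > 0 ? max : fallback;
}
```

A service configured to return 500 records per request would then be paged in steps of 500 rather than silently losing the records past the limit.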

reyemtm commented 2 years ago

Good ideas. There are a ton of other things this could do, like pulling down the aliases for domain fields, and symbology...if you want to add ideas to the readme or issues you can.

