kiselev-dv / gazetteer

OSM ElasticSearch geocoder and addresses exporter
http://osm.me

fix object namings to follow GeoJSON standards #45

Closed by cordovapolymer 7 years ago

cordovapolymer commented 8 years ago

Hi, in the data extracts produced by Gazetteer, geometry objects don't follow GeoJSON naming because their type names are all lowercase.

Could you please fix that? The GeoJSON specification requires:

A geometry is a GeoJSON object where the type member's value is one of the following strings: "Point", "MultiPoint", "LineString", "MultiLineString", "Polygon", "MultiPolygon", or "GeometryCollection".

cordovapolymer commented 8 years ago

Any hope that it will be fixed?

kiselev-dv commented 7 years ago

Do you need a full GeoJSON compliant output? Or just geometry type keys?

cordovapolymer commented 7 years ago

Well, full_geometry will comply with the GeoJSON geometry object specification once the first-letter case of the geometry type names is fixed. What do you mean by full GeoJSON compliant output?

kiselev-dv commented 7 years ago

I mean, that the whole object itself should fit the schema:

{
  "type": "Feature",
  "geometry": {
  },
  "properties": {
  }
}

I thought you wanted to open it in some GIS or similar, in which case changing the case of the geometry types wouldn't be enough.
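For illustration, wrapping one record in that Feature envelope could look like the sketch below. It is not Gazetteer's own code. It assumes the geometry sits under the full_geometry key mentioned earlier in the thread and that every other field can go into properties; the id and name fields in the sample record are hypothetical.

```python
import json


def to_feature(record, geometry_key="full_geometry"):
    """Wrap a flat record in a GeoJSON Feature envelope.

    Assumption: the geometry is stored under `geometry_key`; all
    remaining fields are copied into "properties" unchanged.
    """
    properties = {k: v for k, v in record.items() if k != geometry_key}
    return {
        "type": "Feature",
        "geometry": record.get(geometry_key),  # None becomes JSON null
        "properties": properties,
    }


# Hypothetical record; only full_geometry is named in the thread.
record = {
    "id": "example-1",
    "name": "Example Street",
    "full_geometry": {"type": "Point", "coordinates": [30.0, 60.0]},
}
feature = to_feature(record)
print(json.dumps(feature, indent=2))
```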

cordovapolymer commented 7 years ago

@kiselev-dv it's not necessary for my current use case, but packing all the available data into a GeoJSON FeatureCollection is a very interesting idea. I'm importing gazetteer dumps into MongoDB, which supports GeoJSON objects for geospatial queries. So for now I have to preprocess the dumps before import, changing the geometry type names to valid GeoJSON types.
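A preprocessing step of this kind might look like the following minimal sketch. It is an assumption about how the fix could be done, not the script actually used in the thread. It maps the lowercase type names to the seven geometry type strings quoted from the specification above, and recurses into the geometries member of a GeometryCollection.

```python
# Lowercase name -> GeoJSON spelling, built from the spec's list of types.
GEOJSON_TYPES = {
    name.lower(): name
    for name in (
        "Point", "MultiPoint", "LineString", "MultiLineString",
        "Polygon", "MultiPolygon", "GeometryCollection",
    )
}


def fix_geometry_type(geometry):
    """Return a copy of `geometry` with a GeoJSON-cased "type" value."""
    if not isinstance(geometry, dict):
        return geometry
    fixed = dict(geometry)  # shallow copy, the input is not modified
    type_name = fixed.get("type")
    if isinstance(type_name, str):
        # Unknown type names are left untouched rather than guessed.
        fixed["type"] = GEOJSON_TYPES.get(type_name.lower(), type_name)
    if "geometries" in fixed:
        # A GeometryCollection nests further geometry objects.
        fixed["geometries"] = [fix_geometry_type(g) for g in fixed["geometries"]]
    return fixed


sample = {"type": "multipolygon", "coordinates": []}
fixed_sample = fix_geometry_type(sample)
print(fixed_sample["type"])
```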

kiselev-dv commented 7 years ago

I know about FeatureCollections, but the whole JSON would be gigabytes in size, and the only way to parse it would be a streaming parser, something like SAX (but for JSON, not XML). With the current format it's quite obvious how to parse the dump piece by piece.

But anyway, I'll take a look at how to make the output closer to the GeoJSON spec.
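As a rough sketch of piece-by-piece parsing: assuming the dump is a gzip file with one JSON object per line (an assumption; the thread does not spell out the layout), it can be read with constant memory as below. The sample file written here is made up for the demonstration.

```python
import gzip
import json
import os
import tempfile


def iter_records(path):
    """Yield one parsed JSON object per non-empty line of a gzip file.

    Assumption: the dump is line-delimited JSON, so only one record
    is held in memory at a time.
    """
    with gzip.open(path, "rt", encoding="utf-8") as dump:
        for line in dump:
            line = line.strip()
            if line:
                yield json.loads(line)


# Write a tiny stand-in dump so the example runs on its own.
sample_path = os.path.join(tempfile.mkdtemp(), "sample.json.gz")
with gzip.open(sample_path, "wt", encoding="utf-8") as out:
    out.write('{"id": 1}\n{"id": 2}\n')

ids = [record["id"] for record in iter_records(sample_path)]
print(ids)
```

A single FeatureCollection document would not allow this, since the closing bracket only arrives at the end of the file.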

cordovapolymer commented 7 years ago

@kiselev-dv well, I would just import it into mongodb and then process it in batches if required.

> But anyway, I'll take a look at how to make the output closer to the GeoJSON spec.

Thanks for your work, your project is indispensable.

kiselev-dv commented 7 years ago

Try this release https://github.com/kiselev-dv/gazetteer/releases/tag/Gazetteer-1.9rc1 with join --handlers out-gazetteer out=/tmp/osm.json.gz format_geojson=true

cordovapolymer commented 7 years ago

@kiselev-dv thanks for the rc. I just ran it with java -Xmx4096m -jar bin/Gazetteer.jar join --handlers out-gazetteer osm.json.gz format_geojson=true and got the linked-addr-obj for POI points that you added in this release, but object naming is still lowercase. For some reason the format_geojson option was not honored.

kiselev-dv commented 7 years ago

@cordovapolymer try Gazetteer-1.9rc2

cordovapolymer commented 7 years ago

@kiselev-dv Possibly there's a memory leak now: normally I run gazetteer with -Xmx4096, but now even 6500 is not enough.

++ java -Xmx6500m -jar bin/Gazetteer.jar join --handlers out-gazetteer osm.json.gz format_geojson=true
2017-04-12 18:56:41.923  INFO  m.osm.gazetter.join.JoinSliceRunable - stripe1914.gjson.gz done in 0:00:00.226. 65 left
Exception in thread "main" java.lang.StackOverflowError
    at me.osm.gazetter.join.util.MemorySupervizor.getAvaibleRAMMeg(MemorySupervizor.java:26)
    at me.osm.gazetter.join.JoinExecutor.tryToExecute(JoinExecutor.java:174)
    at me.osm.gazetter.join.JoinExecutor.tryToExecute(JoinExecutor.java:187)
    at me.osm.gazetter.join.JoinExecutor.tryToExecute(JoinExecutor.java:187)

kiselev-dv commented 7 years ago

Which country are you trying to process? I'll check that.

cordovapolymer commented 7 years ago

@kiselev-dv, I solved the memory issue by building gazetteer from the develop branch. If you build 1.9rc3, I can test whether the issue persists in your build. I'm using Java 8, so maybe it's a library compatibility problem. I can see osm_id and osm_type in the output, but object naming is still lowercase. I'm running it with java -Xmx4096m -jar bin/Gazetteer.jar join --handlers out-gazetteer osm.json.gz format_geojson=true

kiselev-dv commented 7 years ago

I ran this with java -Xmx4096m -jar bin/Gazetteer.jar join --handlers out-gazetteer out=osm.json.gz format_geojson=true (note the out= before the file name). The handler options parsing is quite ill-designed, so that might be the issue.

cordovapolymer commented 7 years ago

@kiselev-dv, thanks, it works with your command.