Closed ImreSamu closed 7 years ago
No need to close. Just write a comment here when it's ready for testing. It's looking good already, but I'll make a few inline comments with some hints.
@olt , Next proposal! PTAL.(Please Take Another Look)
It would be great if you can get the system tests running. I've just extended the documentation a bit (https://github.com/omniscale/imposm3/#system-tests). Let me know if you need help with that.
@olt
Let me know if you need help with that.
quick feedback: Not critical, but original repo 'test-unit' failed in my environment ( godep go test ./... ), probably 2 file is missing:
see log: https://gist.github.com/ImreSamu/7d6858cf9c68bb501a59
Ah, sorry. I pushed the missing files. Please fetch/merge with master.
@olt: Can you help me ?
One more file missing ( for "Benchmark test" ) : "parser/azores.osm.pbf"
status:
I am proposing the following new parameters :
tags:
# if the osm object has only one "created_by" tag, than this options will keep this otherwise drop !
# default: "false" ( drop )
keep_single_createdby_tag: true
parsemetadata:
# Parsing metadata and create new tag from "version" metadata;
# the new tag name will be the parameter value ("osm_version")
# default : empty string -> not parsing this metadata and not add new tag to the cache.
create_tag_from_version: osm_version
create_tag_from_timestamp: osm_timestamp
create_tag_from_changeset: osm_changeset
create_tag_from_uid: osm_uid
create_tag_from_user: osm_user
json version
"tags": {
"load_all": true,
"keep_single_createdby_tag": true,
"parsemetadata": {
"create_tag_from_version": "osm_version",
"create_tag_from_timestamp": "osm_timestamp",
"create_tag_from_changeset": "osm_changeset",
"create_tag_from_uid": "osm_uid",
"create_tag_from_user": "osm_user"
}
},
Thanks for your work.
I would not make the internal tag names configurable. addCommonKey
is really a hack and people will corrupt their caches when they change the mapping. I suggested _version_
, etc. in my first comments.
The parser package should not depend on the config package, as this makes it more difficult to split the parser package from Imposm later. Also, globals should be avoided. The checks for each tag makes the code complex. A single parseMetadata variable should be enough. Suggestion: Add a .WithMetadata(true) method to the parser and pass that bool down to the parser functions.
Regarding keep_single_createdby_tag: Do you need nodes with only created_by? You won't be able access the metadata from nodes without any tags with the current version. Is that right?
Thank you for your answer. First, I would like to apologize for miscommunication my requirements,
My typical use case is not rendering, mostly OSM data analysis, and my vision is creating a first class support for this, if possible.
I am using : osm2pgsql --hstore-all --extra-attributes , and loading all osm tags with osm metadata to the hstore field, like this:
select tags from planet_osm_point limit 1;
tags
-------------------------------------------------------------------------------------------------------------
"power"=>"tower", "osm_uid"=>"244754", "osm_user"=>"Janjko", "osm_version"=>"2", "osm_changeset"=>"7083233"
(1 row)
I have lot of sql scripts, hard coded with "osm_user".."osm_timestamp" variable names, like others : https://github.com/search?q=osm_timestamp&ref=searchresults&type=Code So I would like to re-create this osm2pgsql hstore functionality with imposm3.
My problem is now : I can not rename easily the user ... timestamp meta variable names at the hstore_tags field. :(
Sometimes I only need some selected metadata variables and not all. And as I know this is a typical problem, for example: only "uid" + "timestamp" information needs for old tiger import visualisation (ref.) And for minimal cache storage I would like storing only the relevant fields, and not any more.
My current ( and not perfect ) implemetation now support selecting+renaming metadata variables :
{
"tags": {
"load_all": true,
"parsemetadata": {
"create_tag_from_version": null,
"create_tag_from_timestamp": "osm_timestamp",
"create_tag_from_changeset": null,
"create_tag_from_uid": "osm_uid",
"create_tag_from_user": null
}
},
"tables": {
"parsemetadata": {
"fields": [
{
"type": "id",
"name": "osm_id",
"key": null
},
{
"type": "geometry",
"name": "geometry",
"key": null
},
{
"type": "hstore_tags",
"name": "tags",
"key": null
}
],
"type": "geometry",
"type_mappings": {
"linestrings": {
"highway": ["__any__"]
}
}
}
}
}
root@f082e809be9f:/go/src/github.com/omniscale# psql -A -F"," -c "select tags from osm_parsemetadata limit 5;" > _tags.txt
root@f082e809be9f:/go/src/github.com/omniscale# cat _tags.txt
tags
"name"=>"Boulevard d'Italie", "highway"=>"primary", "osm_uid"=>"69102", "osm_timestamp"=>"2015-03-26T16:32:39Z"
"name"=>"Avenue des Pins", "oneway"=>"no", "highway"=>"residential", "osm_uid"=>"331938", "osm_timestamp"=>"2014-03-23T20:20:01Z"
"name"=>"Avenue de Monte-Carlo", "oneway"=>"yes", "highway"=>"residential", "osm_uid"=>"378737", "osm_timestamp"=>"2012-05-01T15:06:17Z"
"name"=>"Boulevard Louis II", "highway"=>"primary", "osm_uid"=>"805403", "osm_timestamp"=>"2012-08-23T22:26:47Z"
"name"=>"Avenue de l'Annonciade", "oneway"=>"no", "highway"=>"secondary", "osm_uid"=>"852996", "maxspeed"=>"30", "osm_timestamp"=>"2013-01-10T21:14:02Z"
(5 rows)
or only a single version
"tags": {
"load_all": true,
"parsemetadata": {
"create_tag_from_version": "_version_",
}
},
+ psql -A -F, -c 'select tags from osm_parsemetadata limit 5;'
+ cat _tags.txt
tags
"name"=>"Boulevard d'Italie", "highway"=>"primary", "_version_"=>"11"
"name"=>"Avenue des Pins", "oneway"=>"no", "highway"=>"residential", "_version_"=>"7"
"name"=>"Avenue de Monte-Carlo", "oneway"=>"yes", "highway"=>"residential", "_version_"=>"11"
"name"=>"Boulevard Louis II", "highway"=>"primary", "_version_"=>"6"
"name"=>"Avenue de l'Annonciade", "oneway"=>"no", "highway"=>"secondary", "maxspeed"=>"30", "_version_"=>"7"
(5 rows)
in the future i would like to create an experimental "hstore_tags_drop", and "hstore_tags_keep" mappings, for example loading only the name:* tags.
"tags": {
"load_all": true,
"keep_single_createdby_tag": true,
"parsemetadata": {
"create_tag_from_version": "osm_version",
"create_tag_from_timestamp": "osm_timestamp",
"create_tag_from_changeset": null,
"create_tag_from_uid": null,
"create_tag_from_user": "osm_user"
}
},
....
{
"args": {
"drop_keys": [
"osm_timestamp",
"osm_user",
"osm_version"
]
},
"type": "hstore_tags_drop",
"name": "tags",
"key": null
},
{
"args": {
"keep_keys": [
"osm_timestamp",
"osm_user",
"osm_version"
]
},
"type": "hstore_tags_keep",
"name": "meta_tags",
"key": null
},
{
"args": {
"keep_keys": [
"name:*",
]
},
"type": "hstore_tags_keep",
"name": "name_tags",
"key": null
}
Regarding keep_single_createdby_tag: Do you need nodes with only created_by?
imho:
At the master branch implementation:
You won't be able access the metadata from nodes without any tags with the current version. Is that right?
yes, This functionality is never needed for me.
I would not make the internal tag names configurable. addCommonKey is really a hack and people will corrupt their caches when they change the mapping. I suggested version, etc. in my first comments.
I agree that compatibility and easy support is very important, so I will rethink my design and requirements ...
Some ideas ( quick & dirty hacks) :
The parser package should not depend on the config package, as this makes it more difficult to split >the parser package from Imposm later. Also, globals should be avoided.
ok, I will try to refactor.
The checks for each tag makes the code complex.
My reason was to storing only the selected metadata fields, and this functionality add extra complexity.
I will rethink my requirements ....
These ideas seem very good ! I'm waiting for the merge, as I'm the author of #58. Thanks a lot for your work. Unfortunetaly, I can't help for now with Go.
About the created_by field, this key is dispearing slowly because this key is deprecated on OSM object. Are you talking about created_by on the changeset ?
About the created_by field, this key is dispearing slowly because this key is deprecated on OSM object.
very slowly ... see taginfo key : http://taginfo.openstreetmap.org/keys/created_by#overview
Yes, but you won't find new editors in this list : Potlatch2 (which has been released a long time ago), ID, Mapzen ... So it's less accurate every day.
Yes, but you won't find new editors in this list ... So it's less accurate every day.
I am doing data quality analytics and I have a limited knowledge about OSM editors.
For example yesterday (timestamp="2015-12-07T17:38:22Z" ) one of the user created this new (v1) node
web: http://www.openstreetmap.org/node/3882156807
xml: http://www.openstreetmap.org/api/0.6/node/3882156807
with only one "created_by"="JOSM"
tag
<osm version="0.6" generator="CGImap 0.4.0 (8709 thorn-02.openstreetmap.org)" copyright="OpenStreetMap and contributors" attribution="http://www.openstreetmap.org/copyright" license="http://opendatacommons.org/licenses/odbl/1-0/">
<node id="3882156807" visible="true" version="1" changeset="35811285" timestamp="2015-12-07T17:38:22Z" user="coloredstone" uid="1815671" lat="46.1396752" lon="14.3995777">
<tag k="created_by" v="JOSM"/>
</node>
</osm>
if I import this data to Postgis, then as I know :
imposm3 import
-> don't loadimposm3 diff
-> loadosm2pgsql
-> loadso when I compare
osm2pgsql
output with imposm3 import
-> not equal imposm3 import
output with imposm3 diff
-> not equal [1][1] I have started working on an experimental integration test script - for comparing imposm3 import
with imposm3 diff
, but I removed, because this pull request is already to big and I have some problems with comparing geometries. ( https://github.com/ImreSamu/imposm3/commit/6b51919527bdabc614f963fadd660d974b58bbdb )
For example yesterday (timestamp="2015-12-07T17:38:22Z" ) one of the user created this new (v1) node web: http://www.openstreetmap.org/node/3882156807 xml: http://www.openstreetmap.org/api/0.6/node/3882156807 with only one "created_by"="JOSM" tag
I don't understand how it happened. The node has been updated using Mapzen
and not JOSM
according to the changeset. I guess it's a copy/paste tags in Mapzen.
According to the wiki :
As of November 2012 many nodes, ways or relations still have a created_by tag, but these are slowly disappearing since this key is silently dropped by OSM editors like JOSM, iD, Potlatch, Potlatch2 when objects get modified.
It works, I just moved the previous node for a few centimeters. The tag has been removed by JOSM, I even couldn't see the tag in JOSM when I downloaded the node. http://www.openstreetmap.org/node/3882156807/history
Is-it possible to have a status about this PR ? Is there a list of tasks that need to be done ? (we may help if we can)
Hi @Gustry !
Is-it possible to have a status about this PR ?
imho: It is working, but the code is not optimal. ( I am learning Golang )
in the future:
Is there a list of tasks that need to be done ? (we may help if we can)
imho: the code ready for testing. If you have time - please check the curent version. :)
It seems great news. I'm gonna try soon ! Thanks !
Just for information, it could be nice if you provide a mapping file like the main one https://github.com/omniscale/imposm3/blob/master/example-mapping.yml with the metadata. I'm going to try again, I had some problem the last time I tried.
@Gustry
, it could be nice if you provide a mapping file like the main one
I have created some mapping files for testing. See in the /test/parsemetadata*.json :
Is this feature available already?
@vicgarvey
Is this feature available already?
Sorry, not yet ... this is only a working prototype.
Why this PR has been closed? Even if it's not very active, I think it's still better to have it opens, it has a better visibility
Why this PR has been closed?
imho:
Maybe the refactoring is the first step:
And after that
Even if it's not very active, I think it's still better to have it opens, it has a better visibility
Your issue is still open ( https://github.com/omniscale/imposm3/issues/58 )
And I will keep my branch 'ImreSamu:osm_metadata', so if somebody wants to reuse or examine some part of the code it is possible.
NOT YET FINAL ... need more work ... so I closed.