roelderickx / ogr2osm

A tool for converting ogr-readable files like shapefiles into .pbf or .osm data
https://pypi.org/project/ogr2osm/
MIT License
59 stars, 14 forks

Lack of important attributes when converting files #45

Closed: Vectorial1024 closed this issue 11 months ago

Vectorial1024 commented 12 months ago

I was recently working on some GeoJSON files generated by HOTOSM and trying to convert them into OSM files for further processing.

I then realized a few problems:

roelderickx commented 12 months ago

I was recently working on some GeoJSON files generated by HOTOSM and trying to convert them into OSM files for further processing.

I then realized a few problems:

* HOTOSM surely preserved the OSM object IDs in its GeoJSON data, but currently I cannot reliably apply those IDs back to the OSM file

I searched for a random geojson on HOTOSM and indeed, the osm_id is exported:

{
    "type": "Feature",
    "geometry": {
        "type":"Point",
        "coordinates":[113.5526893,22.2019276]
    },
    "properties": {
        "osm_id": 9558442378, 
        ...
    }
}

After conversion to OSM using ogr2osm the osm_id is available as a tag:

<node visible="true" id="-1" lat="22.2019276" lon="113.5526893">
    <tag k="osm_id" v="9558442378"/>
    ...
</node>

That's the best we can do I'm afraid, as mentioned before in #24:

The data in shapefiles and OSM files do not correspond 1:1, there are no id values to copy over. However, there may be a solution for your requirement using a translation file, see https://github.com/pnorman/ogr2osm/issues/57.

You can replace shapefiles with any other OGR data source. The input contains points, linestrings and polygons, while the OSM output contains nodes, ways and relations. Linestrings are translated into ways and nodes, but the tags are only added to the ways.
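To illustrate why only ways carry the tags, and why the nodes of a way end up with generated negative ids, here is a simplified model in Python. It is not ogr2osm's actual code: the function name `linestring_to_osm` and the dict layout are invented for this sketch, and it makes no attempt to merge nodes that share coordinates.

```python
from itertools import count


def linestring_to_osm(feature, ids=None):
    """Turn one GeoJSON LineString feature into (nodes, way) dicts.

    Simplified model: every coordinate becomes an untagged node with a
    freshly generated negative id, and the feature's properties end up
    as tags on the way only.
    """
    if ids is None:
        ids = count(-1, -1)  # -1, -2, -3, ... (placeholder ids)

    nodes = []
    for lon, lat in feature["geometry"]["coordinates"]:
        # Nodes have no counterpart in the input, so there is no id to copy.
        nodes.append({"id": next(ids), "lat": lat, "lon": lon, "tags": {}})

    way = {
        "id": next(ids),
        "refs": [node["id"] for node in nodes],
        # Properties such as osm_id are kept, but only as tags on the way.
        "tags": {key: str(value) for key, value in feature["properties"].items()},
    }
    return nodes, way
```

In this model a feature with `"osm_id": 123` yields a way tagged `osm_id=123`, while its nodes have neither tags nor a source id to fall back on.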

* Some important fields, e.g. "version", are missing, and the resulting OSM file therefore cannot be read by e.g. JOSM

There are undocumented parameters --add-version and --add-timestamp which you may find interesting.

roelderickx commented 11 months ago

I just stumbled upon this file in the OSWDataPipeline project where the id is set in the process_feature_post method of the translation file.
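A translation file along those lines might look like the sketch below. It is not the file from the OSWDataPipeline project. It assumes that `process_feature_post` receives the OSM geometry as its first argument, and that this object exposes a writable `id` and a dict-like `tags` whose values may be strings or lists of strings. The class name `KeepOsmIdTranslation` and the helper `osm_id_from_tags` are made up for illustration.

```python
try:
    import ogr2osm
    _Base = ogr2osm.TranslationBase
except Exception:
    # Fallback only so this sketch can be loaded without ogr2osm/GDAL;
    # a real translation file would subclass ogr2osm.TranslationBase directly.
    _Base = object


def osm_id_from_tags(tags, key="osm_id"):
    """Return the integer id stored under `key`, or None if absent or invalid."""
    value = tags.get(key)
    if isinstance(value, (list, tuple)):  # assumption: values may be lists
        value = value[0] if value else None
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


class KeepOsmIdTranslation(_Base):
    def process_feature_post(self, osmgeometry, ogrfeature, ogrgeometry):
        # Assumption: osmgeometry has a writable `id` and a dict-like `tags`.
        if osmgeometry is None:
            return
        osm_id = osm_id_from_tags(osmgeometry.tags)
        if osm_id is not None:
            # Replace the generated id with the one carried in the osm_id tag.
            osmgeometry.id = osm_id
```

Saved as, say, `keep_osm_id.py` (a hypothetical name), it would be passed with `ogr2osm -t keep_osm_id.py input.geojson`. Geometries without an `osm_id` tag keep their generated id.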

Vectorial1024 commented 11 months ago

I had not noticed process_feature_post. Indeed, that method can reliably transfer the OSM id to the OSM output.

But I noticed that some of the output data still contains negative OSM ids. I have reviewed the HOTOSM GeoJSON data source and discovered that e.g. their MultiPolygon can implicitly refer to nodes without even writing down their OSM node ids. So naturally, the output contains negative node ids, and then I cannot proceed.

So I guess not much can be done on this side, unless we are talking about assigning new OSM ids based on the max ids that are read from the OSM/GeoJSON file.

roelderickx commented 11 months ago

So I guess not much can be done on this side, unless we are talking about assigning new OSM ids based on the max ids that are read from the OSM/GeoJSON file.

There is another undocumented parameter, --positive-id, to obtain positive ids. In combination with --idfile you may be able to achieve what you want. The only remaining issue is to acquire the max id from the GeoJSON and write it to a file beforehand.
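Acquiring the max id could be done with a few lines of Python, sketched below. It assumes the GeoJSON is a FeatureCollection whose features carry an integer `osm_id` property, as in the HOTOSM sample above, and that the file given to --idfile holds a single integer on its first line. The function names and the `max_id.txt` file name are placeholders.

```python
import json


def max_osm_id(geojson, key="osm_id"):
    """Return the largest integer `key` found in the features, or 0 if none."""
    ids = []
    for feature in geojson.get("features", []):
        value = (feature.get("properties") or {}).get(key)
        if isinstance(value, bool):
            continue  # bool is a subclass of int; never a valid id
        if isinstance(value, int):
            ids.append(value)
        elif isinstance(value, str) and value.isdigit():
            ids.append(int(value))
    return max(ids, default=0)


def write_id_file(geojson_path, id_path):
    """Write the highest osm_id of a GeoJSON file to `id_path` and return it."""
    with open(geojson_path, encoding="utf-8") as source:
        highest = max_osm_id(json.load(source))
    # Assumption: the id file is a single integer on the first line.
    with open(id_path, "w", encoding="utf-8") as target:
        target.write(f"{highest}\n")
    return highest
```

After `write_id_file("input.geojson", "max_id.txt")`, the conversion would be run as `ogr2osm --positive-id --idfile max_id.txt input.geojson`. The highest id in one export is not the highest id in the OSM database, so generated ids may still collide with real objects.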