marklit / osm_split

Feature-specific, named OpenStreetMap (OSM) GeoPackage files
https://tech.marksblogg.com/extracting-osm-features.html
MIT License

Only capturing small amount of features in the split #3

Closed stefankinell closed 8 months ago

stefankinell commented 9 months ago

Really nice idea for a script, but I can't get it working as expected when trying it for Sweden.

I ran it first for a small part, Gotland. If I am not mistaken, only a small portion of the features in the pbf-file end up in the gpkg-files.

One example - this is a picture from Visby.

The brown background is the multipolygon layer in the pbf. The colored polygons on top are the gpkg-files from the output folders.

image

One simple example: almost no buildings are captured in the output. Do I need to go through and set up the layers to capture myself?

This is the area on OSM https://www.openstreetmap.org/relation/2640780#map=16/57.6308/18.3001

marklit commented 8 months ago

Thanks for the kind words.

The other buildings are probably in separate GPKG files. For any city I would expect 25 or more building files. They also might be in the lines folder, not just the multipolygon folder.

In the example in the README of this project, there is a whole bunch of building types in the lines folder for Tokyo. You'd need to drag the whole lines/building folder in for those and then see if there are any in the other geometry folders as well.
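To get a quick overview of which geometry folders contain building files, something like the following could help. This is only a minimal sketch: it assumes the output layout is `<geometry>/building/<type>.gpkg`, as in `lines/building/house.gpkg`, and the function name and `root` argument are made up for illustration and are not part of osm_split.

```python
from collections import defaultdict
from pathlib import Path


def building_files_by_geometry(root="."):
    """Group building GPKG file names by their top-level geometry folder."""
    found = defaultdict(list)
    # Assumed layout: <root>/<geometry>/building/<type>.gpkg
    for path in sorted(Path(root).glob("*/building/*.gpkg")):
        geometry = path.parent.parent.name  # e.g. "lines"
        found[geometry].append(path.name)
    return dict(found)


if __name__ == "__main__":
    for geometry, names in building_files_by_geometry().items():
        print(f"{geometry}: {len(names)} building files")
```

Run from the directory the script wrote its output to, this prints one line per geometry folder that has a `building` subfolder.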

Let me know if that fixes your issue and if not, I'll download the Swedish OSM data and see what's going on.

Also, if the building footprints are of value and not the OSM metadata, Overture might be easier to use. I used it for Tokyo in my https://tech.marksblogg.com/tokyo-walking-tour-guide.html post.

marklit commented 8 months ago

I ran the script for Gotland and in the lines/building folder there are 61 GPKG files. The script is still running but I expect there will be a bunch of building types in the other geometry folders as well.

$ wget https://download.geofabrik.de/europe/sweden-latest.osm.pbf
$ python ~/Desktop/osm_split/main.py \
    --only-h3=820897fffffffff \
    sweden-latest.osm.pbf
$ ls -lhS lines/building
1.5M Feb 11 18:22 house.gpkg
1.1M Feb 11 18:23 residential.gpkg
343K Feb 11 18:23 garage.gpkg
296K Feb 11 18:22 detached.gpkg
290K Feb 11 18:23 terrace.gpkg
193K Feb 11 18:22 industrial.gpkg
178K Feb 11 18:23 farm_auxiliary.gpkg
151K Feb 11 18:23 bungalow.gpkg
132K Feb 11 18:22 apartments.gpkg
 53K Feb 11 18:22 church.gpkg
 47K Feb 11 18:22 garages.gpkg
 42K Feb 11 18:22 retail.gpkg
 38K Feb 11 18:22 school.gpkg
 24K Feb 11 18:23 commercial.gpkg
 16K Feb 11 18:23 cabin.gpkg
 13K Feb 11 18:22 historic.gpkg
 11K Feb 11 18:22 hotel.gpkg
 11K Feb 11 18:23 service.gpkg
9.8K Feb 11 18:23 semidetached_house.gpkg
9.4K Feb 11 18:23 kindergarten.gpkg
9.1K Feb 11 18:23 hospital.gpkg
8.2K Feb 11 18:23 boathouse.gpkg
6.3K Feb 11 18:23 greenhouse.gpkg
5.5K Feb 11 18:23 public.gpkg
4.7K Feb 11 18:23 manufacture.gpkg
4.7K Feb 11 18:23 windmill.gpkg
3.7K Feb 11 18:22 office.gpkg
2.9K Feb 11 18:23 chapel.gpkg
2.5K Feb 11 18:24 ruins.gpkg
2.4K Feb 11 18:23 toilets.gpkg
2.2K Feb 11 18:24 bunker.gpkg
2.0K Feb 11 18:23 stable.gpkg
1.9K Feb 11 18:22 sports_hall.gpkg
1.9K Feb 11 18:23 carport.gpkg
1.8K Feb 11 18:22 train_station.gpkg
1.2K Feb 11 18:24 castle.gpkg
1.2K Feb 11 18:23 roundhouse.gpkg
1.2K Feb 11 18:22 hangar.gpkg
1.2K Feb 11 18:23 lighthouse.gpkg
1.1K Feb 11 18:23 civic.gpkg
1.1K Feb 11 18:23 warehouse.gpkg
856B Feb 11 18:24 storage_tank.gpkg
819B Feb 11 18:24 supermarket.gpkg
805B Feb 11 18:24 tower.gpkg
795B Feb 11 18:23 slurry_tank.gpkg
766B Feb 11 18:24 government.gpkg
760B Feb 11 18:23 dormitory.gpkg
722B Feb 11 18:24 pavilion.gpkg
665B Feb 11 18:23 kiosk.gpkg
663B Feb 11 18:23 static_caravan.gpkg
659B Feb 11 18:24 gatehouse.gpkg
618B Feb 11 18:24 water_tower.gpkg
558B Feb 11 18:24 college.gpkg
468B Feb 11 18:22 parking.gpkg
464B Feb 11 18:23 sauna.gpkg
418B Feb 11 18:24 transportation.gpkg
408B Feb 11 18:24 religious.gpkg
407B Feb 11 18:23 houseboat.gpkg
406B Feb 11 18:23 collapsed.gpkg
404B Feb 11 18:24 livestock.gpkg
398B Feb 11 18:23 power.gpkg

Screenshot 2024-02-11 at 18 27 45

stefankinell commented 8 months ago

Hi, thanks for answering and checking. I was expecting that objects in the multipolygon layer of the pbf would end up in the multipolygon output files. This is where I did my comparisons.

I will check when I get back to the computer, but I fear that buildings that are polygons with an opening inside, for example, will be difficult for me to process in later stages if they are exported as lines.

E.g. this building https://www.openstreetmap.org/relation/14979587

Is there a reason for saving multipolygons as lines?

marklit commented 8 months ago

The geometry is in the same form as what is saved in the OSM files. The OSM project uses lines for the vast majority of buildings. The building you linked to shouldn't be in the lines/ folder as it's a multi-polygon.

It's possible to convert the lines into polygons.

$ mkdir -p lines_as_polygons

$ for FILENAME in lines/building/*.gpkg; do
    # Keep only the file name (e.g. house.gpkg) for the output path
    OUT=$(basename "$FILENAME")

    # Assumes the DuckDB Spatial extension is already loaded (e.g. via ~/.duckdbrc)
    echo "
        COPY (
            SELECT * EXCLUDE(geom),
                   ST_MakePolygon(geom) geom
            FROM ST_READ('$FILENAME')
        ) TO 'lines_as_polygons/$OUT'
          WITH (FORMAT GDAL,
                DRIVER 'GPKG',
                LAYER_CREATION_OPTIONS 'WRITE_BBOX=YES');" \
        | duckdb
  done

They appear to render just as they did before.
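One caveat with this kind of conversion: a polygon ring has to be closed, so it should only be expected to work for lines whose first and last points match. Below is a minimal sketch of that check in plain Python, assuming each line is a list of `(x, y)` tuples; the function names are illustrative and are not part of osm_split or DuckDB.

```python
def is_closed_ring(coords):
    """True when a line string can serve as a polygon's exterior ring."""
    # A ring needs at least 4 positions: a triangle plus the repeated start
    return len(coords) >= 4 and coords[0] == coords[-1]


def split_convertible(lines):
    """Partition line strings into (closed rings, open lines)."""
    closed = [line for line in lines if is_closed_ring(line)]
    open_lines = [line for line in lines if not is_closed_ring(line)]
    return closed, open_lines


if __name__ == "__main__":
    square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
    wall = [(0, 0), (1, 0), (2, 0)]
    closed, open_lines = split_convertible([square, wall])
    print(len(closed), "closed,", len(open_lines), "open")
```

Anything that lands in the open group would need separate handling rather than a direct line-to-polygon conversion.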

Screenshot 2024-02-11 at 19 14 44

The DuckDB Spatial extension might get support for wildcards at some point so the above could be a single SQL statement.

I could also look into adding a flag to convert lines to polygons in the code at a later point. #5

marklit commented 8 months ago

Sorry, I was just looking at that hotel example you gave. There is no hotel.gpkg file in the multipolygon folder. It appears they built it out of three ways that are made up of 36 points altogether. I'm not exactly sure where the geometry for this building is.

I'm also not sure if a building like that should be constructed on OSM using points and way relations. I haven't come across this sort of geometric setup for buildings. If it is valid to outline buildings on OSM like this, I'll need to look into the relations later and see if a building like this could be converted into a multi-polygon. #4

marklit commented 8 months ago

I added support for the other_relations layer type in the OSM files #4.

The geom records for buildings in other_relations are feature collections of line strings. I've added a --polygon-buildings flag that will convert any geometry that is a building into a multi-polygon. I still have some issues with this #5. I'll have to pick it up when I'm back from holiday.
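For anyone curious how separate line strings can become polygon rings, here is a rough sketch of the general idea: join lines that share endpoints until each chain closes. This is not how --polygon-buildings is implemented; it is a simplified, greedy illustration in plain Python that assumes each line is a list of `(x, y)` tuples and that rings do not share nodes. The function name is hypothetical.

```python
def stitch_rings(lines):
    """Join line strings that share endpoints into closed rings.

    Returns (rings, leftovers); leftovers are chains that never closed.
    """
    remaining = [list(line) for line in lines if len(line) >= 2]
    rings, leftovers = [], []
    while remaining:
        chain = remaining.pop()
        extended = True
        # Keep appending lines that start or end at the chain's tail
        while chain[0] != chain[-1] and extended:
            extended = False
            for i, other in enumerate(remaining):
                if other[0] == chain[-1]:
                    chain.extend(other[1:])
                elif other[-1] == chain[-1]:
                    # Line runs the other way, so append it reversed
                    chain.extend(reversed(other[:-1]))
                else:
                    continue
                del remaining[i]
                extended = True
                break
        if len(chain) >= 4 and chain[0] == chain[-1]:
            rings.append(chain)
        else:
            leftovers.append(chain)
    return rings, leftovers


if __name__ == "__main__":
    ways = [[(0, 0), (1, 0)], [(1, 1), (1, 0)], [(1, 1), (0, 0)]]
    rings, leftovers = stitch_rings(ways)
    print(len(rings), "ring(s),", len(leftovers), "leftover chain(s)")
```

Deciding which rings are outer shells and which are holes is a separate step that this sketch leaves out.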

I'll close this ticket for now. Please do raise tickets for any new issues you encounter.