atorger / nvdb2osm

The Unlicense

New Lastkajen #25

Closed matthiasfeist closed 2 years ago

matthiasfeist commented 3 years ago

So the new Lastkajen is out: https://lastkajen2-p.ea.trafikverket.se/ and it seems that you can't get the data by kommun anymore but only by län...

atorger commented 3 years ago

That was a bummer... I've sent an email to Trafikverket asking about it; maybe they just haven't added them yet.

Even if we need to split further later to make smaller work packages, if we start with Län we would need to split the shape files before the script runs, which will probably be much more work to implement.

I looked into one of the NVDB zip files and some of the contents have changed too, but it seems to be very minor so shouldn't be a problem.

atorger commented 3 years ago

I got an answer from Trafikverket: it's set in stone and they won't do any municipality splits this time, so eventually we'll need to make our own split. I may look into that later, though it's unclear when, as I've started working on some other projects.

(edit: actually I have started to look into it, but as a background task)

atorger commented 3 years ago

Seems like they have broken the STARTAVST/SLUTAVST and some other real number fields in the shape file data; geopandas can no longer parse the values:

WARNING - Value '0,625685351085803.000000000000000' of field Västerbottens_län_ShapeNVDB_DK_O_38_FunkVagklass.STARTAVST parsed incompletely to real 0.
2021-04-14 21:45:38,980 - WARNING - Value '0,818811758280823.000000000000000' of field Västerbottens_län_ShapeNVDB_DK_O_38_FunkVagklass.SLUTAVST parsed incompletely to real 0.

Seems like the problem is that there is both , and . in the number. I've sent a new email to Trafikverket about that issue.
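A minimal sketch of how such a value could be salvaged while waiting for a fix. It rests on an unconfirmed assumption: that the digits before the first `.` are the real value written with a decimal comma, and that the trailing `.000...` is an export artifact. The function name `parse_broken_real` is hypothetical and not part of nvdb2osm.

```python
def parse_broken_real(text: str) -> float:
    """Recover a float from a value like '0,625685351085803.000000000000000'.

    Assumption (not confirmed by Trafikverket): the part before the first '.'
    is the real value with a decimal comma; the '.000...' suffix is junk.
    """
    head = text.split(".", 1)[0]           # drop the spurious '.000...' suffix
    return float(head.replace(",", "."))   # decimal comma -> decimal point


print(parse_broken_real("0,625685351085803.000000000000000"))
```

Values without a comma, such as `12.000000`, pass through as plain numbers, so the same function could be applied to every real field.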

atorger commented 3 years ago

Oh, if I had actually read the Lastkajen website I would have seen this:

Problems with shape files

2021-04-09 We are currently having problems with orders in the shape format. This affects all feature types with a linear extent and attribute values that are floating-point numbers. Floating-point numbers are converted to integers.

matthiasfeist commented 3 years ago

Haha oh god. This seems to be a bigger topic. I guess if we can find some outlines of the municipalities then we can probably divide the files ourselves. There are for sure some commandline tools that can cut the shapefiles. Alternatively we can process the whole län and then use osmosis to generate smaller files from a large OSM file...

atorger commented 2 years ago

I have started the work to update the script to handle the data from the new Lastkajen. There will be a split script that splits the län files into kommun-sized files, and the nvdb2osm script will be updated to handle the minor changes in names etc.

The split script is almost complete; the remaining problem is some junk geometry at the borders, which I need to look into. I've done a successful pass on the new data for one municipality with the updated nvdb2osm, but need to do some more verification.

matthiasfeist commented 2 years ago

Awesome! How are you doing the split into kommun-sized files? There was some discussion in the Facebook group that it might make sense to split the län into even smaller chunks, so it's easier to do one chunk in one go. I think maybe a simple rectangular raster would be fine as well?

atorger commented 2 years ago

I run split in a separate script that reads a Län shape.zip archive and then writes new zip archives one for each kommun. The borders are defined in a Lantmäteriet 1:100000 data file which I will upload as a part of this script, as it is important that the exact borders are kept the same between runs.
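As an illustration only, here is a sketch of how a line feature could be assigned to a kommun from border polygons. It is not the split script's actual method: it uses a plain ray-casting point-in-polygon test on the line's middle vertex, assumes borders are simple rings without holes, and uses made-up square borders rather than Lantmäteriet data. The names `point_in_polygon` and `assign_kommun` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
Polygon = List[Point]  # simple ring, no holes (assumption for this sketch)


def point_in_polygon(p: Point, ring: Polygon) -> bool:
    """Ray casting: count edges crossed by a ray going in +x from p."""
    x, y = p
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_kommun(line: List[Point], borders: Dict[str, Polygon]) -> Optional[str]:
    """Return the kommun whose border contains the line's middle vertex."""
    mid = line[len(line) // 2]
    for name, ring in borders.items():
        if point_in_polygon(mid, ring):
            return name
    return None


# Hypothetical square "kommuner" side by side, not real border data.
borders = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "B": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
road = [(8, 5), (9, 5), (12, 5)]  # crosses the A/B border
print(assign_kommun(road, borders))
```

This sketch assigns a border-crossing line to exactly one side and does no clipping, so it says nothing about how cleanly layers meet at the border.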

Then the kommun shape files are processed normally with the old script, or almost normally: one needs to tell it which kommun it is processing so it can keep track of the borders. Otherwise the geometry that crosses borders will not be clean, as different layers extend different distances outside the border.

I think it would be nice to have kommun as the upper unit, and then split each kommun more using a raster or some other fixed geometry. It would be possible to make custom splits as long as the geometry is made part of the script. One idea is to make an automated split based on number of segments (ie smaller areas in big cities and bigger in rural areas), and then record that split once as a fixed geometry that is then used as reference by the split script.
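The idea of an automated split by segment count can be sketched as a quadtree-style subdivision. This is only one possible reading of the idea, under assumptions: each segment is represented by a single point (for example its midpoint), cells are axis-aligned rectangles, and the names `split_by_count`, `max_points` and `max_depth` are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def split_by_count(points: List[Point], box: Box, max_points: int,
                   max_depth: int = 8) -> List[Box]:
    """Subdivide box into quadrants until each cell holds <= max_points."""
    if len(points) <= max_points or max_depth == 0:
        return [box]
    min_x, min_y, max_x, max_y = box
    cx, cy = (min_x + max_x) / 2, (min_y + max_y) / 2
    # Bucket points by quadrant; a point on a centre line goes right/top.
    quads: Dict[Tuple[bool, bool], List[Point]] = {
        (False, False): [], (False, True): [], (True, False): [], (True, True): [],
    }
    for x, y in points:
        quads[(x >= cx, y >= cy)].append((x, y))
    cells: List[Box] = []
    for (right, top), pts in quads.items():
        sub = (cx if right else min_x, cy if top else min_y,
               max_x if right else cx, max_y if top else cy)
        cells.extend(split_by_count(pts, sub, max_points, max_depth - 1))
    return cells


# Made-up points: a dense cluster in one corner and one isolated point.
points = [(1, 1), (1, 2), (2, 1), (2, 2), (9, 9)]
cells = split_by_count(points, (0, 0, 16, 16), max_points=2)
print(len(cells), "cells")
```

Dense areas end up with small cells and sparse areas with large ones. Writing the resulting list of boxes to a file once, and reusing it afterwards, would give the fixed reference geometry described above.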

I think I can upload the first version tonight. I just want to check that a big municipality passes processing.

atorger commented 2 years ago

I have now pushed a patch that makes it possible to generate OSM from the new lastkajen data. However one needs to run the split script first to generate new zip files, example:

split_nvdb_data.py --lanskod_filter 25 Norrbottens_län_Shape.zip output

That will put zip files (Luleå.zip, Boden.zip ... etc) with shape files in the directory 'output'.

Then you run nvdb2osm normally, with a municipality filter set:

nvdb2osm.py -v --municipality_filter Luleå --railway_file=Järnvägsnät_grundegenskaper.zip output/Luleå.zip luleå.osm

There are likely a few bugs left (I just got a crash when converting Stockholm...), as there have been some minor changes to NVDB data here and there and I have probably not fixed them all, so there will be some follow-on patches. A cloud batch run of all municipalities like we did before would be great, so I can see which municipalities fail and fix those issues specifically.
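A batch run could be driven by a small wrapper like the sketch below. It only uses the nvdb2osm.py flags shown in the example above. It assumes that the zip file name (for example `Luleå.zip`) matches the name expected by `--municipality_filter`, and that the output file is the lowercased name with an `.osm` suffix. The helper names are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def nvdb2osm_command(kommun_zip: Path, railway_zip: str, out_dir: Path) -> List[str]:
    """Build the nvdb2osm.py command line for one per-kommun zip file."""
    kommun = kommun_zip.stem  # assumption: 'Luleå.zip' -> filter name 'Luleå'
    out_file = out_dir / f"{kommun.lower()}.osm"
    return [
        "nvdb2osm.py", "-v",
        "--municipality_filter", kommun,
        f"--railway_file={railway_zip}",
        str(kommun_zip), str(out_file),
    ]


def run_batch(split_dir: Path, railway_zip: str, out_dir: Path) -> List[str]:
    """Run nvdb2osm.py on every zip in split_dir; return the kommuner that failed."""
    failed = []
    for kommun_zip in sorted(split_dir.glob("*.zip")):
        cmd = nvdb2osm_command(kommun_zip, railway_zip, out_dir)
        if subprocess.run(cmd).returncode != 0:
            failed.append(kommun_zip.stem)
    return failed


# Show the command for one kommun without running anything.
print(nvdb2osm_command(Path("output/Luleå.zip"),
                       "Järnvägsnät_grundegenskaper.zip", Path(".")))
```

Calling `run_batch(Path("output"), "Järnvägsnät_grundegenskaper.zip", Path("."))` would then return the list of municipalities whose conversion exited with a non-zero status.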

atorger commented 2 years ago

Stockholm is now passing as well. Closing this to make clear that the new Lastkajen is now working. Open new issues for any follow-up problems.