atorger / nvdb2osm

The Unlicense

Changing format for Lastkajen data #48

Open matthiasfeist opened 2 years ago

matthiasfeist commented 2 years ago

Hi. I just got an email from Trafikverket to all Lastkajen users. It seems that they are discontinuing the shape files and will only provide GeoDB or Geopkg files by the end of the year...

atorger commented 2 years ago

Sigh... they have been really good at keeping us busy with their frequent changes. This is a pretty big change, but in the long term it's probably for the better. Shapefiles are a pretty old-school format. I don't remember why I chose that format when I started making the script in the first place, but I think it was the good availability of open source tools/libraries.

Anyway, there's a few months left of shape files so I'll probably look into this some time after shape files actually disappear, this autumn or winter. Or if I get bored during summer. I'm really bad at planning :-).

matthiasfeist commented 2 years ago

Haha yes I know. Well, I just wanted to bring this to your attention. Maybe a good step could be to use a tool like ogr2ogr to convert the files to shapefiles. But I'd assume even then there will be quite a lot of changes to be made in the script.

atorger commented 2 years ago

Yes, I'll look into what ogr2ogr can do as a first step, it may be the easiest way out. Could maybe call it automatically from the script so it would be transparent to the user. I would expect some issues with key/value names and content, but I'll see. If there are a lot of issues it's probably better to try using the GeoDB format directly.

Geopkg is a format with all layers already merged, I think, which is kind of good as much of my script and its issues are around merging. However, there is an advantage to doing the merging of layers oneself, as there is some content-aware "smartness" there to take care of corner cases, so I think I'll stick with the GeoDB and separate-layers approach despite the challenges.
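For reference, the ogr2ogr conversion idea could be sketched like this; the wrapper functions, paths, and layer name are hypothetical, and it assumes GDAL's ogr2ogr binary is available on the PATH:

```python
import subprocess

def build_ogr2ogr_cmd(gpkg_path, layer_name, out_dir):
    # ogr2ogr takes the destination before the source:
    #   ogr2ogr -f "ESRI Shapefile" <out_dir> <src.gpkg> <layer>
    return ["ogr2ogr", "-f", "ESRI Shapefile", out_dir, gpkg_path, layer_name]

def gpkg_layer_to_shapefile(gpkg_path, layer_name, out_dir):
    # Requires GDAL's ogr2ogr binary on the PATH; raises on failure.
    subprocess.run(build_ogr2ogr_cmd(gpkg_path, layer_name, out_dir), check=True)

# Hypothetical usage:
# gpkg_layer_to_shapefile("Vasternorrlands_lan.gpkg", "NVDB_DK_O_10_Barighet", "shp_out")
```

Calling this from the split script, once per layer, would keep the rest of the shapefile-based pipeline untouched.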

atorger commented 2 years ago

Geopackage is not all layers merged, so I went for that one. Really easy to implement, very little code change as geopandas takes care of most things. Patch coming soon. Most time seems to be spent waiting on my very slow computer... it takes time to process these huge files.

atorger commented 2 years ago

I ran into an issue though which may be a bit tricky to fix. It seems like NVDB stores some dates in a format that Fiona/Geopandas cannot parse; they then get replaced with None, messing up the dates. I'll see if I can work around that in some way.

atorger commented 2 years ago

The split script now accepts both shapefile and geopackage, and will always output geopackage, which nvdb2osm then works with.

There are some date-parsing log messages during splitting, but one can ignore those; it's just a lot of zero dates which fiona complains about. I had to add some more "undefined" values, as empty strings turned up instead of None here and there.

So now it should work with geopackage, but there could be some quirks left.
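The "more values of undefined" workaround mentioned above might look something like this sketch; the sentinel set and function names are assumptions, not the script's actual code:

```python
# Hypothetical set of "undefined" encodings seen in NVDB attributes: fiona
# turns zero dates into None, and empty strings sometimes appear instead.
UNDEFINED = {None, "", "0", "00000000"}

def normalize_value(value):
    # Map every "undefined" encoding to a single canonical None.
    if isinstance(value, str):
        value = value.strip()
    return None if value in UNDEFINED else value

def normalize_record(record):
    # Apply the normalization to one feature's attribute dict.
    return {key: normalize_value(val) for key, val in record.items()}
```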

Trafikverket also seems to be changing some column names later on which they have not done yet, so there will have to be some adjustment later too.

matthiasfeist commented 2 years ago

cool. I'll take a look if I could adjust the pipeline to download the geopackages

matthiasfeist commented 2 years ago

I just tried it (with the latest version on master) but I get this error and then the splitting script halts:

2022-07-03 13:40:34,010 - split_nvdb_data - INFO - Saving NVDB-Antal_korfalt2.gkpg for Ockelbo
2022-07-03 13:40:34,046 - split_nvdb_data - INFO - Saving NVDB-Antal_korfalt2.gkpg for Nordanstig
2022-07-03 13:40:34,123 - split_nvdb_data - INFO - Saving NVDB-Antal_korfalt2.gkpg for Hofors
2022-07-03 13:40:34,158 - split_nvdb_data - INFO - Reading layer NVDB*Barighet (6 of 47 layers) from ../download/21.zip
2022-07-03 13:40:34,159 - split_nvdb_data - INFO - Reading layer Gävleborgs_län_Shape_NVDB_DK_O_10_Barighet from file zip://../download/21.zip!Gävleborgs_län.gpkg
Traceback (most recent call last):
  File "fiona/ogrext.pyx", line 271, in fiona.ogrext.FeatureBuilder.build
ValueError: year 0 is out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/matthiasf/src/nvdb2osm-pipeline/workdir/nvdb2osm/split_nvdb_data.py", line 295, in <module>
    main()
  File "/Users/matthiasf/src/nvdb2osm-pipeline/workdir/nvdb2osm/split_nvdb_data.py", line 240, in main
    gdf = read_gpkg_layer(geometry_file, ln)
  File "/Users/matthiasf/src/nvdb2osm-pipeline/workdir/nvdb2osm/split_nvdb_data.py", line 73, in read_gpkg_layer
    gdf = geopandas.read_file(filename, encoding='cp1252', layer=layer_name)
  File "/Users/matthiasf/src/nvdb2osm-pipeline/python-env/lib/python3.9/site-packages/geopandas/io/file.py", line 139, in _read_file
    return GeoDataFrame.from_features(
  File "/Users/matthiasf/src/nvdb2osm-pipeline/python-env/lib/python3.9/site-packages/geopandas/geodataframe.py", line 422, in from_features
    for feature in features_lst:
  File "fiona/ogrext.pyx", line 1515, in fiona.ogrext.Iterator.__next__
  File "fiona/ogrext.pyx", line 277, in fiona.ogrext.FeatureBuilder.build
  File "/usr/local/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/logging/__init__.py", line 1481, in exception
    self.error(msg, *args, exc_info=exc_info, **kwargs)
  File "/usr/local/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/logging/__init__.py", line 1475, in error
    self._log(ERROR, msg, args, **kwargs)
  File "/usr/local/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/logging/__init__.py", line 1589, in _log
    self.handle(record)
  File "/usr/local/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/logging/__init__.py", line 1598, in handle
    if (not self.disabled) and self.filter(record):
  File "/usr/local/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/logging/__init__.py", line 806, in filter
    result = f.filter(record)
  File "/Users/matthiasf/src/nvdb2osm-pipeline/python-env/lib/python3.9/site-packages/fiona/logutils.py", line 18, in filter
    if getattr(record, 'msg', "").startswith("Skipping field"):
AttributeError: 'ValueError' object has no attribute 'startswith'
atorger commented 2 years ago

ValueError: year 0 is out of range is normal, because NVDB has lots of dates set to zero and the fiona library that reads geopackage files doesn't like that. However, it should be okay to ignore those errors. What seems to go wrong here is something in logutils.py handling that error... strange that I didn't get that, probably because I have a different version of the fiona library.

Could you perhaps see if you can upgrade the fiona package?
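For what it's worth, one way to quiet those "year 0" complaints (a workaround sketch using the standard logging API, not something the script actually does) is to filter them out at the handlers:

```python
import logging

class ZeroDateFilter(logging.Filter):
    """Drop the noisy 'year 0 is out of range' records that fiona emits for
    NVDB's zero dates. A workaround sketch, not the script's actual code."""

    def filter(self, record):
        # record.getMessage() stringifies record.msg, so it is safe even when
        # the message is an exception object rather than a string (the case
        # that tripped up fiona's own logutils filter in the traceback above).
        return "year 0 is out of range" not in record.getMessage()

# Filters attached to a logger do not apply to its child loggers (such as
# fiona.ogrext), so attach the filter to the handlers that emit the records.
for handler in logging.getLogger().handlers:
    handler.addFilter(ZeroDateFilter())
```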

atorger commented 2 years ago

There are a few ways to fix the ValueError problem so it does not show up at all: 1) make Trafikverket provide sane date values, 2) make the fiona maintainers accept 0 as a valid date, or 3) write our own geopackage reader (which actually is not that hard, as geopackage files are pure sqlite3 databases).

However as it works for me despite those ugly logs I have ignored it for now...

(I think fiona is part of the geopandas package so if fiona is not installed separately you can try to upgrade geopandas)
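Option 3 is less work than it sounds. Since a GeoPackage is a plain sqlite3 database, listing layers and pulling raw attribute rows takes only the standard library. A sketch with hypothetical function names; geometry blob decoding is not shown:

```python
import sqlite3

def list_gpkg_layers(path):
    # A GeoPackage lists its feature layers in the standard gpkg_contents table.
    con = sqlite3.connect(path)
    try:
        rows = con.execute(
            "SELECT table_name FROM gpkg_contents WHERE data_type = 'features'"
        ).fetchall()
        return [r[0] for r in rows]
    finally:
        con.close()

def read_gpkg_attributes(path, layer):
    # Fetch raw attribute rows (zero dates included) without a GDAL-based
    # parser getting in the way; column names come straight from the table.
    con = sqlite3.connect(path)
    con.row_factory = sqlite3.Row
    try:
        return [dict(r) for r in con.execute(f'SELECT * FROM "{layer}"')]
    finally:
        con.close()
```

With the raw rows in hand, the zero dates could be handled however the script likes instead of fiona deciding for it.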

matthiasfeist commented 2 years ago

Hmmm... I don't install anything myself but only use the requirements.txt file in your repo to install dependencies. Does that mean you need to update this file to reflect the newest versions?

atorger commented 2 years ago

Hmm, yeah, maybe. Actually it wasn't me who made the requirements.txt; that was some Python guru who contributed some patches early on. I installed my packages manually before that, and I have 0.11.0 installed. The requirements file says 0.8.2, maybe that's bad...

atorger commented 2 years ago

I pushed a new requirements file.

If already installed it seems like this works:

pip install --upgrade -r requirements.txt

Otherwise a fresh install should work; then it should hopefully get the latest geopandas.

matthiasfeist commented 2 years ago

Now I get a new error that I haven't seen before. I completely installed all dependencies from scratch with your new requirements file. I think it would be best to specify the exact library versions that you use there as well (currently you are only specifying that e.g. geopandas should be at least a certain version, not the exact version you want).

here's the error:

python split_nvdb_data.py --lanskod_filter 17 ../download/17.zip output/
2022-07-04 08:55:15,141 - split_nvdb_data - INFO - args are Namespace(geo_file=PosixPath('../download/17.zip'), output_dir=PosixPath('output'), lanskod_filter='17', loglevel=30)
2022-07-04 08:55:15,141 - split_nvdb_data - INFO - Checksum for each script file (to be replaced with single version number when script is stable):
2022-07-04 08:55:15,142 - split_nvdb_data - INFO -   split_nvdb_data.py     MD5: e2eeaa228f028e6769ea92bb631e944e
2022-07-04 08:55:15,142 - split_nvdb_data - INFO - Loading municipality borders
2022-07-04 08:55:15,144 - split_nvdb_data - INFO - Reading file ak_riks.shp
Traceback (most recent call last):
  File "/Users/matthiasf/src/nvdb2osm-pipeline/workdir/nvdb2osm/split_nvdb_data.py", line 295, in <module>
    main()
  File "/Users/matthiasf/src/nvdb2osm-pipeline/workdir/nvdb2osm/split_nvdb_data.py", line 224, in main
    municipalities = load_municipalities(lanskod)
  File "/Users/matthiasf/src/nvdb2osm-pipeline/workdir/nvdb2osm/split_nvdb_data.py", line 129, in load_municipalities
    gdf = read_epsg_shapefile("data/ak_riks.zip", "ak_riks")
  File "/Users/matthiasf/src/nvdb2osm-pipeline/workdir/nvdb2osm/split_nvdb_data.py", line 109, in read_epsg_shapefile
    gdf = geopandas.read_file(gdf_filename, encoding='cp1252')
  File "/Users/matthiasf/src/nvdb2osm-pipeline/python-env/lib/python3.9/site-packages/geopandas/io/file.py", line 253, in _read_file
    return _read_file_fiona(
  File "/Users/matthiasf/src/nvdb2osm-pipeline/python-env/lib/python3.9/site-packages/geopandas/io/file.py", line 340, in _read_file_fiona
    df = GeoDataFrame.from_features(
  File "/Users/matthiasf/src/nvdb2osm-pipeline/python-env/lib/python3.9/site-packages/geopandas/geodataframe.py", line 652, in from_features
    return cls(rows, columns=columns, crs=crs)
  File "/Users/matthiasf/src/nvdb2osm-pipeline/python-env/lib/python3.9/site-packages/geopandas/geodataframe.py", line 170, in __init__
    self["geometry"] = _ensure_geometry(self["geometry"].values, crs)
  File "/Users/matthiasf/src/nvdb2osm-pipeline/python-env/lib/python3.9/site-packages/geopandas/geodataframe.py", line 62, in _ensure_geometry
    out = from_shapely(data, crs=crs)
  File "/Users/matthiasf/src/nvdb2osm-pipeline/python-env/lib/python3.9/site-packages/geopandas/array.py", line 150, in from_shapely
    return GeometryArray(vectorized.from_shapely(data), crs=crs)
  File "/Users/matthiasf/src/nvdb2osm-pipeline/python-env/lib/python3.9/site-packages/geopandas/_vectorized.py", line 146, in from_shapely
    aout[:] = out
  File "/Users/matthiasf/src/nvdb2osm-pipeline/python-env/lib/python3.9/site-packages/shapely/geometry/polygon.py", line 300, in __array_interface__
    raise NotImplementedError(
NotImplementedError: A polygon does not itself provide the array interface. Its rings do.
atorger commented 2 years ago

Try with 0.11.0, or if you can't install that, some earlier version. Normally one wouldn't want to require too-new versions and would just run with the latest one can get, but if the API keeps changing it's a mess...

matthiasfeist commented 2 years ago

Same error with 0.11.0... I wonder if it's maybe a different library that needs bumping?

atorger commented 2 years ago

Oh, "site-packages/shapely/geometry/polygon.py", seems to point to Shapely. Maybe try shapely 1.8.2

matthiasfeist commented 2 years ago

can you maybe update the requirements.txt file with the exact versions that you use locally on your machine?

atorger commented 1 year ago

Requirement already satisfied: Shapely>=1.7.1 in /home/anders/.local/lib/python3.10/site-packages (from -r requirements.txt (line 1)) (1.8.2)
Requirement already satisfied: sortedcontainers>=2.1.0 in /usr/lib/python3/dist-packages (from -r requirements.txt (line 2)) (2.4.0)
Requirement already satisfied: geopandas>=0.8.2 in /usr/lib/python3/dist-packages (from -r requirements.txt (line 3)) (0.11.0)
Requirement already satisfied: scanf>=1.5.2 in /home/anders/.local/lib/python3.10/site-packages (from -r requirements.txt (line 4)) (1.5.2)
Requirement already satisfied: pyproj>=3.0.0 in /usr/lib/python3/dist-packages (from -r requirements.txt (line 5)) (3.3.1)

The versions in parentheses at the end are the ones I'm using. I run a bleeding-edge system and get new versions all the time; I don't really want to put higher version numbers there than actually required. It could be some other Python environment issue too.

atorger commented 1 year ago

That is:

Shapely==1.8.2
sortedcontainers==2.4.0
geopandas==0.11.0
scanf==1.5.2
pyproj==3.3.1

I don't think requirements includes "default" dependencies though. I run with Python version 3.10.5

atorger commented 1 year ago

I guess a python expert could figure this out quite easily, but I don't really know much about dependency issues unfortunately, only from a generic standpoint that it can be hell in any system :-(

matthiasfeist commented 1 year ago

OK that worked now (at least the split script). So maybe commit these dependencies into the repo?

atorger commented 1 year ago

Yeah I pushed it now

marsip commented 1 year ago

I have updated my Java build environment on a Windows 10 PC and can now successfully build osm files. Before the update my old environment (for shp files) didn't work.

I followed instructions from: https://geoffboeing.com/2014/09/using-geopandas-windows/

Package Version

attrs 21.4.0
certifi 2021.10.8
click 8.0.3
click-plugins 1.1.1
cligj 0.7.2
colorama 0.4.4
Fiona 1.8.21
GDAL 3.4.3
geopandas 0.12.1
munch 2.5.0
numpy 1.22.2
packaging 21.3
pandas 1.4.0
pip 22.3
pyparsing 3.0.9
pyproj 3.3.1
python-dateutil 2.8.2
pytz 2021.3
Rtree 1.0.0
scanf 1.5.2
Shapely 1.8.2
six 1.16.0
sortedcontainers 2.4.0

java version "1.8.0_321"
Java(TM) SE Runtime Environment (build 1.8.0_321-b07)
Java HotSpot(TM) 64-Bit Server VM (build 25.321-b07, mixed mode)

matthiasfeist commented 1 year ago

Thanks. In the cloud environment which runs nvdb2osm, I automatically create the python environment using the requirements.txt file. So we need to make sure that this file reflects all the correct versions of all dependencies. Have a look here about how to generate a complete file: https://learnpython.com/blog/python-requirements-file/

atorger commented 1 year ago

My pip3 freeze list outputs more than 200 packages, so I don't think it makes that much sense. I have a very rich Python environment, as lots of the system uses Python, not just this script. I don't know much about setting up Python environments from a configuration-management perspective, as for me it has so far "always worked". I use Debian Linux, where the installed Python packages are a mix of system-wide installation and per-user local installation.

atorger commented 1 year ago

For reference, here is my pip3 freeze list; it could be useful if you want to know the version of a specific package. However, most of these packages (90+%) are not used by the nvdb2osm script. I don't know how to extract a list that is just for nvdb2osm.

appdirs==1.4.4 apt-xapian-index==0.49 arandr==0.1.10 astroid==2.12.10 asttokens==2.0.5 attrs==22.1.0 autobahn==22.7.1 Automat==20.2.0 Babel==2.10.3 backcall==0.2.0 base58==1.0.3 bcrypt==3.2.0 beautifulsoup4==4.11.1 beniget==0.4.1 Bottleneck==1.3.2 Brotli==1.0.9 cbor==1.0.0 certifi==2022.6.15 cffi==1.15.1 chardet==4.0.0 charset-normalizer==2.0.6 chrome-gnome-shell==0.0.0 click==8.0.3 click-plugins==1.1.1 cligj==0.7.2 colorama==0.4.5 constantly==15.1.0 crit==3.17.1 cryptography==3.4.8 cssselect==1.1.0 cupshelpers==1.0 cycler==0.11.0 Cython==0.29.32 dbus-python==1.3.2 decorator==5.1.1 defcon==0.10.1 defusedxml==0.7.1 deprecation==2.0.7 descartes==1.1.0 devscripts==2.22.2 dill==0.3.4 distro==1.8.0 distro-info==1.1 ecdsa==0.18.0 et-xmlfile==1.0.1 executing==0.8.0 ExifRead==3.0.0 fail2ban==0.11.2 feedparser==6.0.8 Fiona==1.8.21 flatbuffers==2.0.6+dfsg1.3 fontPens==0.2.4 fonttools==4.37.1 fpdf2==2.5.5 fs==2.4.16 gast==0.5.2 gbp==0.9.29 GDAL==3.5.2 geographiclib==2.0 geopandas==0.11.1 geopy==2.2.0 gitinspector==0.4.4 gpg==1.18.0 gpxpy==1.5.0 gyp==0.1 html2text==2020.1.16 html5lib==1.1 hyperlink==21.0.0 idna==3.3 imageio==2.4.1 img2pdf==0.4.4 importlib-metadata==4.12.0 incremental==21.3.0 iniconfig==1.1.1 iotop==0.6 ipython==8.5.0 isort==5.6.4 jdcal==1.0 jedi==0.18.0 Jinja2==3.0.3 kiwisolver==1.3.2 Lasagne==0.2.dev1 lazy-object-proxy==1.7.1 libevdev==0.5 logilab-common==1.8.2 lxml==4.9.1 lz4==4.0.2+dfsg Mako==1.2.2 mapnik==3.0.23 MarkupSafe==2.1.1 matplotlib==3.5.2 matplotlib-inline==0.1.6 mccabe==0.6.1 mercurial==6.2.3 mnemonic==0.19 modernize==0.7 more-itertools==8.10.0 mpi4py==3.1.3 mpmath==0.0.0 msgpack==1.0.3 munch==2.5.0 mypy-extensions==0.4.3 Nik4==1.7.0 nose==1.3.7 ntpsec==1.2.1 numexpr==2.8.3 numpy==1.21.5 odfpy==1.4.2 olefile==0.46 openpyxl==3.0.9 packaging==21.3 pandas==1.3.5 parameterized==0.8.1 parso==0.8.1 passlib==1.7.4 pdfarranger==1.9.1 perf==0.1 pexpect==4.8.0 pickleshare==0.7.5 pikepdf==6.0.0+dfsg Pillow==9.2.0 Pivy==0.6.7 platformdirs==2.5.2 
pluggy==1.0.0+repack ply==3.11 progressbar==2.5 prompt-toolkit==3.0.31 protobuf==3.12.4 psycopg2==2.9.4 ptyprocess==0.7.0 pure-eval==0.0.0 py==1.10.0 py-gnuplot==1.1.8 py-ubjson==0.16.1 pyasn1==0.4.8 pyasn1-modules==0.2.8 pycairo==1.20.1 pycparser==2.21 pycups==2.0.1 pycurl==7.45.1 pydot==1.4.2 Pygments==2.12.0 PyGObject==3.42.2 pygpu==0.7.6 PyHamcrest==2.0.3 pyinotify==0.9.6 pyjson5==1.6.1 pylint==2.15.3 PyNaCl==1.5.0 pyOpenSSL==21.0.0 pyparsing==3.0.7 pypng==0.0.20 pyproj==3.4.0 PyQRCode==1.2.1 PyQt5==5.15.7 PyQt5-sip==12.11.0 pysmbc==1.0.23 pytest==7.1.2 python-apt==2.3.0+b2 python-dateutil==2.8.1 python-debian==0.1.48 python-magic==0.4.26 python-snappy==0.5.3 pythran==0.11.0 PyTrie==0.4.0 pytz==2022.4 pyudev==0.22.0 pyxattr==0.7.2 pyxdg==0.27 PyYAML==5.4.1 rawkit==0.6.0 rdp==0.8 requests==2.27.1 Rtree==1.0.1 scanf==1.5.2 scipy==1.8.1 scour==0.38.2 service-identity==18.1.0 sgmllib3k==1.0.0 Shapely==1.8.2 six==1.16.0 sortedcontainers==2.4.0 soupsieve==2.3.2 stack-data==0.0.0 sympy==1.10.1 systemd-python==235 tables==3.7.0 Theano==1.0.5 toml==0.10.2 tomli==2.0.1 tomlkit==0.11.5 traitlets==5.4.0 Twisted==22.4.0 txaio==21.2.1 typing_extensions==4.3.0 u-msgpack-python==2.3.0 ujson==5.5.0 unattended-upgrades==0.1 unicodedata2==15.0.0 unidiff==0.7.3 urllib3==1.26.12 vboxapi==1.0 watchdog==2.1.9 wcwidth==0.2.5 webencodings==0.5.1 Willow==1.4 wrapt==1.14.1 wsaccel==0.6.3 xdg==5 xlwt==1.3.0 youtube-dl==2021.12.17 zipp==1.0.0 zope.interface==5.4.0

matthiasfeist commented 1 year ago

Haha, ok, that's a long list, and I agree that it doesn't make sense to include all of this in the requirements file. But somehow I need a way to reliably replicate the environment you're running, with some sort of machine-readable specification you provide.

matthiasfeist commented 1 year ago

I know a little bit about Python programming, and generally it's best practice to have a dedicated virtual environment, to avoid the situation you have right now where it's unclear which dependencies the script really has and which other installed packages have an unknown effect on the code.

See here for example: https://blog.inedo.com/python-environment-management-best-practices

marsip commented 1 year ago

The list of dependencies I showed are the only ones I have in my Python environment (besides any hidden dependencies the basic Python environment may have).

It should be enough to have these for a successful build.

matthiasfeist commented 1 year ago

Can you try maybe something like this: https://github.com/bndr/pipreqs

the thing is that it's really important that the requirements.txt file is correct and up-to-date if we want this project to work reliably on other people's computers and in the cloud.

anders-xarepo commented 1 year ago

Did that and committed/pushed the new requirements.txt, this was the result:

Fiona==1.8.21
geopandas==0.11.1
pyproj==3.4.0
scanf==1.5.2
Shapely==1.8.5
sortedcontainers==2.4.0

Seems like also pipreqs assumes the existence of some basic Python environment.

(logged in with my work user too lazy to change...)

matthiasfeist commented 1 year ago

So I finally had time to look at this. It was not easy to just test it, as Trafikverket had again changed their API, so I needed to adjust my scripts as well to be able to download the files.

I've processed now Östergötlands län on the cloud infrastructure (random choice, I know 😆) with the new requirements.txt file that is commited in the repo.

It seems to me that the splitting of the files is now randomly breaking. Here's the split logfile for the run: https://nvdb-osm-map-data.s3.eu-north-1.amazonaws.com/split/split_5.log There are a lot of random warnings from Fiona...

marsip commented 1 year ago

I also get a lot of errors on splitting with my local build environment (Windows 10), but the splitting doesn't abort/exit and the result files are ok. But I have only tested Västernorrlands län.

matthiasfeist commented 1 year ago

I'm currently testing whether the amount of errors generates too-large log files. I'm saving all the log output so it's possible to debug it. But since Fiona seems to log like crazy, some logfiles are getting to about 100 MB or larger...

elindberg-snapcode commented 1 year ago

I'm bringing this thread back to life since there seem to be some issues with splitting the geopackage file (split_nvdb_data). Hopefully I'm missing something, but I've tried everything I can think of, so the next step is to try a post here :) I can successfully split the old shapefiles (I had an old one lying around), but when trying the new geopackage format I end up with an RLID error of some kind.

"geometry with RLID {row.RLID} not contained nor intersecting with any municipality"

The entire error message is as follows:

2023-04-27 12:17:59,851 - split_nvdb_data - INFO - Reading layer NVDB_DK_O_38_FunkVagklass from file zip://../Vasternorrland_GeoPackage.zip!Västernorrlands_län.gpkg
2023-04-27 12:18:21,935 - split_nvdb_data - INFO - done (93779 segments)
Traceback (most recent call last):
  File "./split_nvdb_data.py", line 295, in <module>
    main()
  File "./split_nvdb_data.py", line 269, in main
    _log.info(f"geometry with RLID {row.RLID} not contained nor intersecting with any municipality")
  File "/home/likj8457/.local/lib/python3.8/site-packages/pandas/core/generic.py", line 5989, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'Series' object has no attribute 'RLID'

Since Västernorrland was working 5 months ago, I tried to find out if something has happened to the format. And sure enough, it has. I can't say this is the problem... but for someone with insight into the scripts it might be a starting point? (screenshot attached)

Any Ideas?
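As a side note, the AttributeError in the traceback comes from the log line itself dereferencing row.RLID on a row that no longer has that column. A hypothetical defensive version of that message builder (not the script's actual code) could look like:

```python
def describe_unmatched_feature(row, columns):
    """Build the 'not contained in any municipality' log message without
    crashing when the RLID column has been renamed away. Hypothetical helper."""
    if "RLID" in columns and getattr(row, "RLID", None) is not None:
        return f"geometry with RLID {row.RLID} not contained nor intersecting with any municipality"
    return "geometry (RLID column missing) not contained nor intersecting with any municipality"
```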

atorger commented 1 year ago

I'll probably need to change the code again. Trafikverket has had quite a number of format changes over the past couple of years, some small, some large, so nvdb2osm breaks now and then. It's usually quite easy for me to adapt the code to the new format, but I need to find time to look into it, and at the moment I have a bit much to do.

atorger commented 1 year ago

Started to look at it. It seems like Trafikverket has changed the names of basically all tags... I guess it's good in the long term though, as now they use proper timestamps and tag names that aren't truncated. But there's a lot of stuff that needs to be changed in the huge script. I'll probably take the easier route first and just translate new tags back to old names; less risk of introducing errors. Then when things settle down a bit, one can ditch support for the old abbreviated tag names.

atorger commented 1 year ago

I have now pushed some patches, so it should work again on new NVDB dumps, while still working for the old ones. It's more than 500 lines of new code, but mostly just tag translations... a few hours just going through all the databases. To make minimum impact on existing code and minimize the risk of introducing errors, I'm converting new key names back to the old ones, but in the long term it would be nice to convert all code to use the new full-length names, as it makes the code easier to read and understand. One example: the old key name was "VAEPROILVA" and the new one is "Vagprofil_tvar_kurva"; the longer key names are a little bit easier to understand :-)

Apart from changing the names of all keys in all databases, they have also changed some value types, so I needed to patch a bit extra for that. Let me know if you run into any problems; I have only done some brief tests on Västernorrland data.
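The translation layer described here presumably boils down to a per-column rename table. Only the VAEPROILVA / Vagprofil_tvar_kurva pair is documented in this thread; the rest of the sketch is hypothetical:

```python
# New full-length NVDB column names mapped back to the old abbreviated ones.
# Only the first entry comes from this thread; a real table would have one
# entry per renamed column in every layer.
NEW_TO_OLD = {
    "Vagprofil_tvar_kurva": "VAEPROILVA",
    # ...
}

def translate_columns(record):
    # Rename keys of one feature's attribute dict, leaving unknown keys as-is,
    # so the rest of the script can keep using the old abbreviated names.
    return {NEW_TO_OLD.get(key, key): value for key, value in record.items()}
```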

elindberg-snapcode commented 1 year ago

Seems like it did the trick! I just ran Södermanland without any errors, thanks!