zerebubuth / planet-dump-ng

Converts an OpenStreetMap database dump into planet files.
BSD 2-Clause "Simplified" License

update readme to the current state #21

Closed matkoniecz closed 3 years ago

matkoniecz commented 5 years ago

I may be mistaken but from what I see now this repo is current.

BTW, is the following section of https://wiki.openstreetmap.org/wiki/Planet.osm now outdated?

Note that planet downloads have ways that reference nodes that are not in the same file. Due to performance reasons it isn't possible to get a fully consistent snapshot of the database. Although the dump is run in a transaction, the isolation level required for a "snapshot"-style dump dramatically increases the running time.

I am basing this guess on:

The previous version of this tool required the database server to keep a consistent transaction context open for the duration of the dump, which would usually be several days.

matkoniecz commented 3 years ago

ping?

zerebubuth commented 3 years ago

I'm very sorry! It has been so long - apologies for the delay.

You are right, the section in the wiki that you quoted should only apply to archived planet files generated with old software. The current version of the software should output planet files where all the nodes referenced by ways are present in the file.

I think I understand the change you made to the README. The way that bit is worded is confusing, and I've used some jargon. Perhaps I can explain it better...

What I was trying to explain in the README is that planet-dump-ng doesn't generate the planet file from the PostgreSQL tables of the most recent non-deleted elements (which I've called the "current" tables, after their names in the database schema: current_nodes, current_ways, etc.). Instead, it generates the "historical" planet file, which is a planet file containing all versions of all elements, even deleted ones, and then filters the "historical" planet file to extract only the most recent non-deleted versions of elements. (It's not actually a 2-step process; it happens in a streaming fashion.)
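
To make the filtering idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the actual planet-dump-ng code: the `ElementVersion` record, its field names and the `current_only` function are hypothetical, and the sketch assumes the history arrives sorted by element id and then by version, so that only one pending element has to be held at a time.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass(frozen=True)
class ElementVersion:
    # Hypothetical record; the field names are illustrative only.
    id: int
    version: int
    visible: bool  # False means this version marks the element as deleted


def current_only(history: Iterable[ElementVersion]) -> Iterator[ElementVersion]:
    """Yield the latest version of each element, skipping deleted elements.

    Assumes `history` is sorted by (id, version). The filter is streaming:
    it remembers only the most recent version seen for the current id.
    """
    pending: Optional[ElementVersion] = None
    for element in history:
        if pending is not None and element.id != pending.id:
            # Every version of the previous id has now been seen, so
            # `pending` is its latest version. Emit it unless deleted.
            if pending.visible:
                yield pending
        pending = element
    # Flush the last element in the stream.
    if pending is not None and pending.visible:
        yield pending


# A tiny history: element 1 was edited once, element 2 was deleted,
# element 3 has a single version.
history = [
    ElementVersion(1, 1, True),
    ElementVersion(1, 2, True),
    ElementVersion(2, 1, True),
    ElementVersion(2, 2, False),
    ElementVersion(3, 1, True),
]
current = list(current_only(history))
```

In this sketch the history output would simply be every record in `history`, while `current` keeps version 2 of element 1 and version 1 of element 3, and drops element 2 entirely because its latest version is flagged as deleted.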

In that section, when I say that there's an adjustment to how the "current" planet file is written, I meant the planet file with only the most recent, non-deleted elements, rather than a planet file written recently in time.

Do you think it would make it clearer to replace that paragraph with:

In order that the system can output a planet file or a history planet file in the same run, both are generated from the history tables. The history planet file contains all these versions, but the non-history (sometimes called "current") planet file does not. This requires a minor adjustment to how the non-history planet is written, with a filter which only keeps the most recent version of each element and does not output any elements which are flagged as deleted.

?

matkoniecz commented 3 years ago

Do you think it would make it clearer to replace that paragraph with:

Yes, I think it is clearer.

Maybe changing

but the non-history (sometimes called "current") planet file

to

but the planet file without history data ("current")

would also be an improvement? (If not, feel free to ignore it.)

matkoniecz commented 3 years ago

replaced by #22