Closed kochis closed 1 year ago
Your input files are broken, or maybe they are just not intended to be merged. Numerous objects with the same ID and version number but different data are in both files. You will have to take this up with the Daylight people.
For instance these, first from the admin file, second from the roads file:
n11655434464 v1 dV c1000000000 t2023-02-04T00:00:00Z i9848585 urapidassist T x39.7447377 y64.5523915
n11655434464 v1 dV c1000000000 t2023-02-04T00:00:00Z i9848585 urapidassist T x-56.6986477 y-30.3889966
This is something that should never happen in normal use of OSM data, so Osmium doesn't detect that case but happily creates broken data from the broken input data.
I would expect renumbered files to be slightly smaller due to the more efficient packing of IDs. And no, this operation should never lose data (unless your data is broken to begin with, in which case all bets are off). You can check with osmium fileinfo -e whether you have the same number of objects in both files.
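To illustrate the kind of conflict shown in the two OPL lines above, here is a small plain-Python sketch (not using Osmium itself; the OPL parsing is deliberately minimal and assumes the whitespace-separated field layout of the example lines, with the ID first and the version second) that flags objects sharing a type, ID, and version but carrying different data:

```python
# Minimal sketch: detect objects that share type+id+version but differ in
# their remaining data. The OPL parsing here is deliberately simplified;
# real OPL files have more cases (quoting, missing fields, etc.).

def parse_opl(line):
    fields = line.split()
    key = (fields[0], fields[1])   # ("n<id>", "v<version>")
    data = tuple(fields[2:])       # everything else, order-sensitive
    return key, data

def find_conflicts(lines):
    seen = {}
    conflicts = []
    for line in lines:
        key, data = parse_opl(line)
        if key in seen and seen[key] != data:
            conflicts.append(key)
        seen.setdefault(key, data)
    return conflicts

# The two example lines quoted above (admin file, then roads file):
admin = "n11655434464 v1 dV c1000000000 t2023-02-04T00:00:00Z i9848585 urapidassist T x39.7447377 y64.5523915"
roads = "n11655434464 v1 dV c1000000000 t2023-02-04T00:00:00Z i9848585 urapidassist T x-56.6986477 y-30.3889966"
print(find_conflicts([admin, roads]))  # -> [('n11655434464', 'v1')]
```

Same node ID, same version, different coordinates: exactly the situation Osmium does not check for, because it should never occur in normal OSM data.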
Thanks, that's sort of what I was afraid of (I think fb-ml-roads being the main culprit here).
Out of curiosity, how did you find the overlapping nodes in the file?
Out of curiosity, how did you find the overlapping nodes in the file?
osmium fileinfo -e shows minimum and maximum IDs. I could see there was an overlap, so I wrote a small pyosmium script that read out all objects in the overlapping range. I then used osmium getid to find those in the other file.
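The range-overlap part of that workflow can be sketched in plain Python (the real workflow used pyosmium and osmium getid; the ID values below are made up for illustration):

```python
# Sketch of the workflow described above:
# 1. take the min/max node IDs reported per file (osmium fileinfo -e),
# 2. compute the overlapping ID range,
# 3. list the node IDs from one file that fall into that range.
# The ranges and IDs below are hypothetical.

def id_overlap(range_a, range_b):
    """Return the (lo, hi) overlap of two inclusive ID ranges, or None."""
    lo = max(range_a[0], range_b[0])
    hi = min(range_a[1], range_b[1])
    return (lo, hi) if lo <= hi else None

def ids_in_range(node_ids, id_range):
    lo, hi = id_range
    return sorted(i for i in node_ids if lo <= i <= hi)

admin_range = (11655000000, 11655999999)   # hypothetical min/max IDs
roads_range = (11655434000, 11656500000)
overlap = id_overlap(admin_range, roads_range)
print(overlap)                              # (11655434000, 11655999999)
suspects = ids_in_range([11655434464, 11656600000], overlap)
print(suspects)                             # [11655434464]
```

The IDs collected this way are the candidates to feed into osmium getid against the other file.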
Out of curiosity, how did you find the overlapping nodes in the file?
I saw @joto's script: https://github.com/osmcode/osmium-tool/issues/197#issuecomment-694242042
osmium cat test_at1_sorted.pbf -f opl | cut -d' ' -f1 | uniq -c | grep -v ' 1 '
What version of osmium-tool are you using?
What operating system version are you using?
macOS Ventura 13.1
Tell us something about your system
Apple M1 Max 64GB RAM
What did you do exactly?
Merged two Daylight OSM changesets into a single file.
Then renumbered them, starting with the largest ID found in the planet dataset.
What did you expect to happen?
Expected the output file to contain the additional data included in the change files.
What did happen instead?
Got an error indicating duplicate Node IDs (and stopped processing the file).
What did you do to try analyzing the problem?
It's likely my misunderstanding of how merge-changes should work, but I assumed any items with duplicate IDs across the files would be merged into the latest occurrence. Is it possible for duplicate IDs across nodes to still exist after merging? I guess I'm also having trouble understanding the difference between sort, merge, and merge-changes, and how they should be used together for applying changesets. I'm also sometimes noticing smaller file sizes after running the renumber command, which I assume is due to using smaller IDs, but I just wanted to verify that it's not a destructive command (it wouldn't drop nodes in any scenario?).
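The "merged into the latest occurrence" assumption can be sketched as follows. This is an illustration of that assumption only, not what osmium actually implements internally (osmium assumes inputs never contain conflicting objects with the same ID and version):

```python
# Sketch of the assumed "keep the latest occurrence" merge semantics.
# Objects are (id, version, data) tuples; for each id we keep the highest
# version, with a later input file winning ties. This is NOT osmium's
# actual algorithm, just the behavior assumed in the question.

def merge_latest(*files):
    merged = {}
    for objs in files:
        for obj_id, version, data in objs:
            current = merged.get(obj_id)
            if current is None or version >= current[0]:
                merged[obj_id] = (version, data)
    return {i: d for i, (v, d) in sorted(merged.items())}

a = [(1, 1, "old"), (2, 1, "a")]
b = [(1, 2, "new"), (2, 1, "b")]   # id 2: same version, different data
print(merge_latest(a, b))          # {1: 'new', 2: 'b'}
```

Note that for id 2 this silently keeps whichever file came last, which is exactly the kind of same-ID-same-version conflict that, per the discussion above, should never occur in normal OSM data in the first place.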
Apologies if this is more of a question than an issue, feel free to close and redirect to the correct place if needed. Thanks.