osmcode / osmium-tool

Command line tool for working with OpenStreetMap data based on the Osmium library.
https://osmcode.org/osmium-tool/
GNU General Public License v3.0

file size change with apply-changes #247

Closed dieterdreist closed 2 years ago

dieterdreist commented 2 years ago

What version of osmium-tool are you using?

osmium version 1.14.0
libosmium version 2.18.0
Supported PBF compression types: none zlib lz4

What operating system version are you using?

macOS 12.3.1 (21E258)

Tell us something about your system

What did you do exactly?

osmium apply-changes planet-latest.osm.pbf ${CHANGES} -o ${PREFIX}-planet.osm.pbf

where ${CHANGES} is a string listing 14 .osc.gz files, totalling 1.1 GB
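
For anyone scripting this step, here is a minimal Python sketch of how such an argument list might be assembled. The `diffs` directory, the output file name, and both helper functions are hypothetical; only the `osmium apply-changes INPUT CHANGES... -o OUTPUT` shape is taken from the command above. The files are sorted by name purely to make the resulting command reproducible.

```python
from pathlib import Path


def find_change_files(directory):
    """List the .osc.gz files in a directory (assumed layout), sorted by name."""
    return sorted(Path(directory).glob("*.osc.gz"))


def build_apply_changes_cmd(planet, change_files, output):
    """Return the argv list for `osmium apply-changes` (hypothetical helper)."""
    # Same shape as the command in the report: input, change files, -o output.
    return ["osmium", "apply-changes", str(planet),
            *map(str, change_files), "-o", str(output)]


if __name__ == "__main__":
    changes = find_change_files("diffs")  # assumed location of the daily diffs
    cmd = build_apply_changes_cmd(
        "planet-latest.osm.pbf", changes, "updated-planet.osm.pbf"
    )
    # Print only; pass `cmd` to subprocess.run(cmd, check=True) to execute it.
    print(" ".join(cmd))
```

Building a list rather than one long string avoids shell quoting problems if a file name ever contains spaces.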

What did you expect to happen?

A moderate increase in file size.

What did happen instead?

The planet file grew from 63 GB to 66 GB.

Maybe this is expected?
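
To put the reported figures in proportion, a short arithmetic sketch using only the numbers stated above. The helper name `relative_growth` is made up for illustration. Note that the diffs are gzip-compressed XML while the planet is PBF, so the last ratio is only a rough indication, not a like-for-like comparison.

```python
def relative_growth(before, after):
    """Growth of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


# Figures from the report above, in GB as stated there.
before_gb, after_gb, diffs_gb = 63.0, 66.0, 1.1

print(f"absolute growth: {after_gb - before_gb:.1f} GB")
print(f"relative growth: {relative_growth(before_gb, after_gb):.1f} %")
print(f"growth relative to compressed diff size: {(after_gb - before_gb) / diffs_gb:.1f}x")
```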

dieterdreist commented 2 years ago

osmium fileinfo -e 220513-planet.osm.pbf
File:
  Name: 220513-planet.osm.pbf
  Format: PBF
  Compression: none
  Size: 70854841060
Header:
  Bounding boxes:
  With history: no
  Options:
    generator=osmium/1.14.0
    pbf_dense_nodes=true
Data:
  Bounding box: (-180,-90,180,90)
  Timestamps:
    First: 2005-05-03T13:27:18Z
    Last: 2022-05-12T23:59:22Z
  Objects ordered (by type and id): yes
  Multiple versions of same object: no
  CRC32: not calculated (use --crc/-c to enable)
  Number of changesets: 0
  Number of nodes: 7681064403
  Number of ways: 858426049
  Number of relations: 9901076
  Smallest changeset ID: 0
  Smallest node ID: 1
  Smallest way ID: 37
  Smallest relation ID: 11
  Largest changeset ID: 0
  Largest node ID: 9736075728
  Largest way ID: 1059705347
  Largest relation ID: 14134073
  Number of buffers: 12098575 (avg 706 objects per buffer)
  Sum of buffer sizes: 760890019528 (725.641 GB)
  Sum of buffer capacities: 793243811840 (756.496 GB, 96% full)
Metadata:
  All objects have following metadata attributes: version+timestamp+changeset
  Some objects have following metadata attributes: all
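
A few of the figures in this output can be cross-checked with simple arithmetic. The sketch below uses only values copied from the output above. One observation, offered as an assumption rather than a confirmed fact: the reported size of 70854841060 bytes is about 65.99 GiB, so the "66GB" quoted in the report appears to be in binary units.

```python
GIB = 1024 ** 3

# Values copied from the fileinfo output above.
size_bytes = 70854841060
nodes, ways, relations = 7681064403, 858426049, 9901076
buffers = 12098575
buffer_sizes, buffer_capacities = 760890019528, 793243811840

size_gib = size_bytes / GIB
objects = nodes + ways + relations
avg_per_buffer = objects // buffers  # integer average, as in the output
fill_percent = round(buffer_sizes / buffer_capacities * 100)

print(f"file size: {size_gib:.2f} GiB ({size_bytes / 1e9:.2f} GB decimal)")
print(f"objects: {objects}, avg per buffer: {avg_per_buffer}")
print(f"buffers {fill_percent}% full")
```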

joto commented 2 years ago

The planet file you can download from planet.osm.org is generated by a program that doesn't use libosmium. It encodes OSM data slightly differently from how libosmium does. Also, we don't know what those changes are; they might well cause some objects to move into a different block inside the PBF, or something like that, making the whole file larger. So, without looking at the details, it is quite possible that the file you get is somewhat larger than expected. I wouldn't consider this a bug.

dieterdreist commented 2 years ago

The diffs are the daily diffs from OpenStreetMap, from 2/5 until today.

dieterdreist commented 2 years ago

After continuing to apply updates to the file, I have found that the size is now stable after the initial increase (it is no longer growing unexpectedly). This more or less confirms that the size difference is likely due to a different way of writing the PBF and is not a problem.