systemed / tilemaker

Make OpenStreetMap vector tiles without the stack
https://tilemaker.org/
Other
1.44k stars 229 forks source link

mbtiles file bigger than bare folders? #734

Open dieterdreist opened 1 month ago

dieterdreist commented 1 month ago

I noticed something strange, when creating an mbtiles file for a small area, I get a file with 6.4MB, but when I create files in a folderstructure, du -hc says it is 5.3MB. Is there some overhead with mbtiles files, i.e. is this expected? I didn't touch other settings (compression etc.)

systemed commented 1 month ago

Seems odd. Could you provide steps to reproduce (i.e. command line and URL of extract)?

dieterdreist commented 1 month ago

I have checked it, and the sizes are the same regardless of the creation process (extraction from mbtiles or direct creation of the files into a directory with tilemaker). you can find the file here: http://img.23m.it/corviale.mbtiles it has 6488064 Bytes. I extracted the tiles like this: mb-util --image_format=pbf corviale.mbtiles tilesExtracted then I do du -hc tilesExtracted to get the total size and it says 5.3M

Comparing blocksize, the mbtiles file is 14464 blocks and the grand total of the extracted tiles is 10768 blocks.

ImreSamu commented 1 month ago

@dieterdreist

IMHO:
The approximately 20% overhead is likely due to fixed page sizes, indexes, and metadata. You may be able to optimize this by using a different page size.

$ sqlite3_analyzer corviale.mbtiles 
/** Disk-Space Utilization Report For corviale.mbtiles

Page size in bytes................................ 65536     
Pages in the whole file (measured)................ 99        
Pages in the whole file (calculated).............. 99        
Pages that store data............................. 99         100.0% 
Pages on the freelist (per header)................ 0            0.0% 
Pages on the freelist (calculated)................ 0            0.0% 
Pages of auto-vacuum overhead..................... 0            0.0% 
Number of tables in the database.................. 3         
Number of indices................................. 2         
Number of defined indices......................... 1         
Number of implied indices......................... 1         
Size of the file in bytes......................... 6488064   
Bytes of user payload stored...................... 5278996     81.4% 

*** Page counts for all tables with their indices *****************************

TILES............................................. 96          97.0% 
METADATA.......................................... 2            2.0% 
SQLITE_SCHEMA..................................... 1            1.0% 

....

*** Definitions ***************************************************************

Page size in bytes

    The number of bytes in a single page of the database file.  
    Usually 1024.

Number of pages in the whole file

    The number of 65536-byte pages that go into forming the complete
    database

Pages that store data

    The number of pages that store data, either as primary B*Tree pages or
    as overflow pages.  The number at the right is the data pages divided by
    the total number of pages in the file.

Pages on the freelist

    The number of pages that are not currently in use but are reserved for
    future use.  The percentage at the right is the number of freelist pages
    divided by the total number of pages in the file.

Pages of auto-vacuum overhead

    The number of pages that store data used by the database to facilitate
    auto-vacuum. This is zero for databases that do not support auto-vacuum.

Number of tables in the database

    The number of tables in the database, including the SQLITE_SCHEMA table
    used to store schema information.

Number of indices

    The total number of indices in the database.

Number of defined indices

    The number of indices created using an explicit CREATE INDEX statement.

Number of implied indices

    The number of indices used to implement PRIMARY KEY or UNIQUE constraints
    on tables.

Size of the file in bytes

    The total amount of disk space used by the entire database files.

Bytes of user payload stored

    The total number of bytes of user payload stored in the database. The
    schema information in the SQLITE_SCHEMA table is not counted when
    computing this number.  The percentage at the right shows the payload
    divided by the total file size.

...

$ sqlite3 corviale.mbtiles
SQLite version 3.46.0 2024-05-23 13:25:27
Enter ".help" for usage hints.

sqlite> SELECT name, sum(pgsize) AS size FROM dbstat GROUP BY name
  ORDER BY size DESC;
tiles|6225920
tile_index|65536
sqlite_schema|65536
sqlite_autoindex_metadata_1|65536
metadata|65536

sqlite> .schema
CREATE TABLE metadata (name text, value text, UNIQUE (name));
CREATE TABLE tiles (zoom_level integer, tile_column integer, tile_row integer, tile_data blob);
CREATE UNIQUE INDEX tile_index on tiles (zoom_level, tile_column, tile_row);
sqlite>