valhalla / docker


'This file has an incorrect size for type' #42

Closed missinglink closed 5 years ago

missinglink commented 5 years ago

hi team,

I ran into this issue today when trying to generate tiles for a recent planet file (planet-180730.osm.pbf):

+ docker run --rm -i -v /data/valhalla:/tmp/valhalla -v /data/osm.extract.pbf:/tmp/osm.extract.pbf:ro valhalla/docker bash
2018/08/08 16:04:02.251598 [INFO] Parsing files: /tmp/osm.extract.pbf
2018/08/08 16:04:02.252481 [INFO] Parsing ways...
2018/08/08 20:33:27.309283 [INFO] Finished with 122090597 routable ways containing 1341441487 nodes
2018/08/08 20:33:27.310718 [INFO] Sorting osm access tags by way id...
2018/08/08 20:33:27.312495 [INFO] Parsing relations...
2018/08/08 21:14:42.702024 [INFO] Finished with 824396 simple restrictions
2018/08/08 21:14:42.702053 [INFO] Finished with 5 lane connections
2018/08/08 21:14:42.708017 [INFO] Sorting complex restrictions by from id...
2018/08/08 21:14:42.729968 [INFO] Sorting osm way node references by node id...
terminate called after throwing an instance of 'std::runtime_error'
  what():  This file has an incorrect size for type
bash: line 1:     6 Aborted                 (core dumped) valhalla_build_tiles -c /tmp/valhalla/valhalla.json /tmp/osm.extract.pbf

I confirmed the checksum is correct using md5sum so that's not the issue, any ideas?

I've also managed to successfully complete the same exact workflow (literally only substituting a smaller PBF file) with a us-northeast extract from geofabrik, so I suspect it's something to do with the size of the extract.

This is running in docker using the command at the top of the output I pasted, on an i3.large in AWS, which has 15.25 GiB RAM and more than enough SSD.

I'll try running it again on a machine with more RAM to see if that helps.

kevinkreiser commented 5 years ago

This message means that one of the *.bin files we wrote out had the wrong size. Did you by chance run out of disk thereby causing the file to be truncated? oh i see you said more than enough ssd... i think we need something like 100 or maybe even 150 gb of disk to do the planet. do you remember @gknisely ?

missinglink commented 5 years ago

I really doubt it's disk, the machine comes with 475GB of SSD, although it's possible.

That disk is holding the original planet download at 42GB but nothing else.

The script went on to perform two additional steps (not related to Valhalla) which produced another ~10GB of files, so I would have expected those steps to also fail if the disk was full.

It's possible that docker is only allowing a certain amount of disk to be available, but I'm using the default settings and I've not heard of that being an issue before.

What's a normal peak memory usage for a full planet import? Could this error be triggered due to lack of RAM or only by disk errors?

missinglink commented 5 years ago

I've upgraded the machine to i3.xlarge | 4 core | 30.5 GiB | 0.950 TB SSD and I'm re-running it now.

missinglink commented 5 years ago

oh you know what it might be? /var/lib/docker is on a smaller disk (on the host machine) and it's possibly filled that disk up.

would the docker image write to /tmp in the container?
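
One way to answer this from inside the container is to ask df which filesystem backs each path. The sketch below is only an illustration: mount_of is a hypothetical helper name, and it assumes mount points contain no spaces. With Docker's default settings, paths that are not bind-mounted (such as /tmp) typically live in the container's writable layer under /var/lib/docker on the host, while /tmp/valhalla is the host directory mounted with -v.

```shell
# Hypothetical helper: print the mount point of the filesystem holding path $1.
# Assumes POSIX df output (-P) and mount points without spaces.
mount_of() {
  df -P "$1" | awk 'NR == 2 { print $6 }'
}

# Compare the container's /tmp, the mounted data directory and the CWD.
for path in /tmp /tmp/valhalla .; do
  [ -e "$path" ] || continue   # skip paths that do not exist here
  printf '%s is on %s\n' "$path" "$(mount_of "$path")"
done
```

If /tmp and the current directory report a different mount point than /tmp/valhalla, anything written there is not going to the large mounted disk.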

missinglink commented 5 years ago

here is my complete script:

#!/bin/bash
set -x

# pull the docker image
docker pull "${valhalla_docker_image}"

# ensure data dir and tile dir exist
mkdir -p "${valhalla_data_path}"
mkdir -p "${valhalla_data_path}/tiles"

# generate valhalla config
docker run --rm -i -v "${valhalla_data_path}:/tmp/valhalla" ${valhalla_docker_image} bash <<-EOF
  valhalla_build_config \
    --mjolnir-tile-dir /tmp/valhalla/tiles \
    --mjolnir-tile-extract /tmp/valhalla/valhalla.tar \
    --mjolnir-timezone /tmp/valhalla/timezones.sqlite \
    --mjolnir-admin /tmp/valhalla/admins.sqlite > /tmp/valhalla/valhalla.json
EOF

# generate valhalla tiles
docker run --rm -i -v "${valhalla_data_path}:/tmp/valhalla" -v "${osm_pbf_local_path}:/tmp/osm.extract.pbf:ro" valhalla/docker bash <<-EOF
  valhalla_build_tiles -c /tmp/valhalla/valhalla.json /tmp/osm.extract.pbf
EOF

# generate polylines file
docker run --rm -i -v "${valhalla_data_path}:/tmp/valhalla" valhalla/docker bash <<-EOF
  valhalla_export_edges --config /tmp/valhalla/valhalla.json > /tmp/valhalla/${valhalla_polyline_extract_filename}
EOF

I'm mounting a directory from the host machine and making it available to the container as /tmp/valhalla.

This directory should be sufficiently large to hold all the data, but I'm assuming that no other directory is being written to. That's probably not correct, huh?

missinglink commented 5 years ago

If something is writing to /tmp I can fix it pretty easily by shifting TMP/TEMP/TMPDIR/TEMPDIR to the large disk with a workaround such as:

mkdir -p ${valhalla_data_path}/tmp

export TMP=${valhalla_data_path}/tmp

valhalla_build_tiles -c /tmp/valhalla/valhalla.json /tmp/osm.extract.pbf

kevinkreiser commented 5 years ago

So the process currently writes these *.bin files to the current working directory. Is there enough space there?
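
A quick way to check is to ask df how much space is free in the directory the build will run from. This is a minimal sketch: free_kib and has_free_gib are hypothetical helper names, and the 150 GiB threshold is an assumption taken from the rough planet estimate earlier in this thread, not a documented requirement.

```shell
# Hypothetical helper: print the free space, in KiB, on the filesystem
# that holds directory $1. Assumes POSIX df output (-P, 1024-byte blocks).
free_kib() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Hypothetical helper: succeed only if directory $1 has at least $2 GiB free.
has_free_gib() {
  [ "$(free_kib "$1")" -ge $(( $2 * 1024 * 1024 )) ]
}

# Assumed threshold: 150 GiB, the upper end of the estimate quoted in this thread.
if has_free_gib . 150; then
  echo "current directory has at least 150 GiB free"
else
  echo "current directory has less than 150 GiB free" >&2
fi
```

Running this from the directory where valhalla_build_tiles will be started shows whether the intermediate files have room before a multi-hour build begins.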

missinglink commented 5 years ago

In the current working dir? ugh, probably not but I can cd :)

missinglink commented 5 years ago

Is there possibly an ENV var such as TMP or a config var I can use to control which locations Valhalla is able to use for temp files / bin files?

kevinkreiser commented 5 years ago

there should be, but currently there isn't... it just puts them in the CWD, wherever the program is being run from. :(

missinglink commented 5 years ago

ok cool, that's super helpful and will probably fix my issue, I will open a separate issue in the main repo that describes this issue more succinctly :)

kevinkreiser commented 5 years ago

if we want this feature we should:

missinglink commented 5 years ago

For posterity, the issue was indeed with files being generated in the current working directory:

total 93G
drwxr-xr-x 2 root root 4.0K Aug 10 15:09 .
drwxr-xr-x 4 root root 4.0K Aug  9 21:34 ..
-rw-r--r-- 1 root root  97M Aug 10 02:42 access.bin
-rw-r--r-- 1 root root  13M Aug 10 02:42 complex_restrictions.bin
-rw-r--r-- 1 root root   68 Aug 10 08:45 duplicateways.txt
-rw-r--r-- 1 root root 5.9G Aug 10 06:42 edges.bin
-rw-r--r-- 1 root root  31K Aug 10 02:00 loop_ways.txt
-rw-r--r-- 1 root root 3.5G Aug 10 15:12 new_nodes_to_old_nodes.bin
-rw-r--r-- 1 root root  15G Aug 10 06:30 nodes.bin
-rw-r--r-- 1 root root 7.5G Aug 10 15:14 old_nodes_to_new_nodes.bin
-rw-r--r-- 1 root root  50G Aug 10 06:16 way_nodes.bin
-rw-r--r-- 1 root root  12G Aug 10 02:42 ways.bin

once I moved to a larger drive with cd before executing valhalla_build_tiles, the build worked fine.

it was not required to set any environment variables.
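
The fix described above can be sketched as follows, assuming the host directory mounted at /tmp/valhalla is on the large disk. container_script is a hypothetical helper that just prints the commands to run inside the container; the only change from the original script is the cd before valhalla_build_tiles, so the intermediate *.bin files land on the mounted volume.

```shell
# Hypothetical helper: print the commands to run inside the container.
container_script() {
  cat <<'EOF'
set -e
# Assumption: /tmp/valhalla is the bind-mounted directory on the large disk.
cd /tmp/valhalla
valhalla_build_tiles -c /tmp/valhalla/valhalla.json /tmp/osm.extract.pbf
EOF
}

# Usage (not run here), mirroring the docker invocation from the script above:
# container_script | docker run --rm -i \
#   -v "${valhalla_data_path}:/tmp/valhalla" \
#   -v "${osm_pbf_local_path}:/tmp/osm.extract.pbf:ro" \
#   valhalla/docker bash
```

docker run also accepts -w (--workdir) to set the working directory, which should have the same effect as the explicit cd.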