tum-gis / cesium-terrain-builder-docker

Dockerfile for the geo-data/cesium-terrain-builder app with quantized mesh support.
Apache License 2.0

can we automate handling large datasets? #23

Open mhaberler opened 1 year ago

mhaberler commented 1 year ago

you outlined the overflow issues at https://github.com/tum-gis/cesium-terrain-builder-docker#handling-large-datasets

do you see an automatic or scripted solution to this issue - even if it runs rather long?

manual trial and error is just pretty tedious

BWibo commented 1 year ago

Hey there, I actually have a more or less complete script to do that. I'll check next week if I can find it. I never added this here, because it's rather hacky and I don't have time to maintain this. In general this is not very hard to do.

mhaberler commented 1 year ago

If you find it I'd appreciate a copy nevertheless

BWibo commented 1 year ago

Here it is. With some small adaptations this should work. I have not tested it recently, so use with caution. If you get it running and tested, please post your copy here. I might add it to the repo then.

#!/usr/bin/env bash

# CTB for large datasets ------------------------------------------------------
STARTLEVEL=18
ENDLEVEL=0

INPUTFILE="*.tif"
INPUTFOLDER=_gtif_wgs84
OUTPUTFOLDER_TERRAIN=terrain
CTBOPTS="-f Mesh -C -N"
TEMPDIR=temp
OUTPUTFOLDER_GTIF="$TEMPDIR/gtif_tiles"

# functions -------------------------------------------------------------------
makeGDALTileset() {
    # $1    Zoom level
    # $2    Input raster
    # $3    Output folder
    printf "\nCreating GDAL tileset for level $1 from $2...\n"
    local startTime=$(($(date +%s%N)))
    ctb-tile -t 65 -f GTiff -o "$3" -s "$1" -e "$1" "$2"
    local endTime=$(($(date +%s%N)))
    # date +%s%N (GNU date) yields nanoseconds; divide by 1000000 for milliseconds
    local elapsedTime=$(( (endTime - startTime) / 1000000 ))
    printf "Creating GDAL tileset for level $1 from $2...done!\t$elapsedTime ms\n"
}

makeTerrainTiles() {
    # $1    Zoom level
    # $2    Input raster
    # $3    Output folder
    # CTB options are read from the global $CTBOPTS, not passed as an argument

    printf "\nCreating terrain tileset for zoom level $1 from $2...\n"
    local startTime=$(($(date +%s%N)))
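    # $CTBOPTS is intentionally unquoted so it splits into separate options
    ctb-tile $CTBOPTS -o "$3" -s "$1" -e "$1" "$2"
    local endTime=$(($(date +%s%N)))
    # nanoseconds to milliseconds
    local elapsedTime=$(( (endTime - startTime) / 1000000 ))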
    printf "Creating terrain tileset for zoom level $1 from $2...done!\t$elapsedTime ms\n"
}

makeTerrainTilesLayerJSON() {
    # $1    input raster
    printf "\nCreate terrain tileset layer JSON from $1 for zoom level $ENDLEVEL to $STARTLEVEL.\n"
    ctb-tile $CTBOPTS -l -o "$OUTPUTFOLDER_TERRAIN" -s $STARTLEVEL -e $ENDLEVEL "$1"
}

makeVRTFromFileList() {
    # $1    input file list
    # $2    output vrt

    printf "\nCreating virtual raster $2 from $1...\n"
    local startTime=$(($(date +%s%N)))
    gdalbuildvrt "${2}" -input_file_list "${1}"
    local endTime=$(($(date +%s%N)))
    # nanoseconds to milliseconds
    local elapsedTime=$(( (endTime - startTime) / 1000000 ))
    printf "Creating virtual raster $2 from $1...done!\t$elapsedTime ms\n"
}

makeFileList() {
    # $1        Input folder
    # $2        Input file or file pattern
    # $3        output file

    find "$1" -type f -name "$2" -print > "$3"
}

cleanup() {
    printf "\nCleaning up...\n"

    # cleanup temp folder
    if [ -d  "$TEMPDIR" ]; then
        temp=$( du -a "$TEMPDIR" | wc -l )
        rm -rfv "$TEMPDIR" | pv -l -s "$temp" > /dev/null
    fi

    if [ -d  "$OUTPUTFOLDER_TERRAIN" ]; then
        temp=$( du -a "$OUTPUTFOLDER_TERRAIN" | wc -l )
        rm -rfv "$OUTPUTFOLDER_TERRAIN" | pv -l -s "$temp" > /dev/null
    fi

    printf "\nCleaning up...done!\n"
}

# main ------------------------------------------------------------------------

# cleanup and preparation
cleanup

# create folder structure
mkdir -p "$TEMPDIR" "$OUTPUTFOLDER_GTIF" "$OUTPUTFOLDER_TERRAIN"

# start timer
startTime=$(($(date +%s%N)))
printf "\nCreating Quantized Mesh tiles...\n"

# create STARTLEVEL terrain tileset from original input data
# create input file list and vrt
inputFileList="$TEMPDIR/input_files.txt"
inputVRT="$TEMPDIR/input.vrt"
makeFileList "$INPUTFOLDER" "$INPUTFILE" "$inputFileList"
makeVRTFromFileList "$inputFileList" "$inputVRT"

# create layer.json
# makeTerrainTilesLayerJSON "$inputVRT"

# create start level file list, vrt, gdal tiles, cesium tiles
makeGDALTileset $STARTLEVEL "$inputVRT" "$OUTPUTFOLDER_GTIF"
makeFileList "$OUTPUTFOLDER_GTIF/$STARTLEVEL" "*.tif" "$TEMPDIR/${STARTLEVEL}_filelist.txt"
makeVRTFromFileList "$TEMPDIR/${STARTLEVEL}_filelist.txt" "$TEMPDIR/$STARTLEVEL.vrt"
makeTerrainTiles $STARTLEVEL "$TEMPDIR/$STARTLEVEL.vrt" "$OUTPUTFOLDER_TERRAIN"
# makeTerrainTiles $STARTLEVEL "$inputVRT" "$OUTPUTFOLDER_TERRAIN"

# create additional levels using GDAL tilesets
STARTLEVEL=$(($STARTLEVEL -1))
for i in $(seq $STARTLEVEL -1 $ENDLEVEL);
do
    curLvl=$i
    lastLvl=$(($i + 1 ))

    printf "\n\nCurrent Level $curLvl, lastlevel $lastLvl ---------------------\n\n"

    # Build VRT for current level
    curFileList="$TEMPDIR/${curLvl}_filelist.txt"
    curVRT="$TEMPDIR/$curLvl.vrt"
    lastVRT="$TEMPDIR/$lastLvl.vrt"

    # Create GDAL tileset for current level from last level tile set
    makeGDALTileset $curLvl "$lastVRT" "$OUTPUTFOLDER_GTIF"
    #makeGDALTileset $curLvl "$inputVRT" "$OUTPUTFOLDER_GTIF"

    makeFileList "$OUTPUTFOLDER_GTIF/$curLvl" "*.tif" "$curFileList"
    makeVRTFromFileList "$curFileList" "$curVRT"

    # Construct the terrain of the current level using VRT of the last level: 
    makeTerrainTiles $curLvl "$lastVRT" "$OUTPUTFOLDER_TERRAIN"
done

# calc and print time elapsed
endTime=$(($(date +%s%N)))
elapsedTime=$(( ($endTime - $startTime) / 1000000 ))

printf "\n\nCreating Quantized Mesh tiles...done!\t$elapsedTime ms\n"

# archive terrain (plain tar; the gzip variant is commented out below)
printf "\nCompressing Quantized Mesh tiles..."
startTime=$(($(date +%s%N)))

tar cf "terrain.tar" "$OUTPUTFOLDER_TERRAIN"
#tar czf "terrain.tar.gz" "$OUTPUTFOLDER_TERRAIN"

endTime=$(($(date +%s%N)))
# nanoseconds to milliseconds
elapsedTime=$(( (endTime - startTime) / 1000000 ))
printf "\nCompressing Quantized Mesh tiles...done!\t$elapsedTime ms\n\n"

exit 0

mhaberler commented 1 year ago

thanks!

kakadiyaAnkit commented 1 month ago

Hey @BWibo 🙋‍♂️

Can you please tell me the logic behind selecting STARTLEVEL=18? Does this number come from a calculation, or did you just happen to know the max zoom level for your TIFF?

BWibo commented 1 month ago

Hey there, I did this long ago. As far as I remember there is no real logic behind that choice. Level 18 offers acceptable resolution and is still relatively quick to compute. Set the value depending on your dataset and use case.
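
One rough way to derive a value from the data instead: the sketch below estimates the zoom level at which the tile resolution matches the raster's pixel size. It assumes a geodetic tiling scheme with two root tiles, so that one pixel at zoom z spans 180 / (tileSize * 2^z) degrees, and the 65-pixel tiles from `-t 65` in the script above. `estimateZoom` is a hypothetical helper, not part of CTB. Rounding up is a choice made here, and none of this has been checked against what ctb-tile does internally.

```shell
#!/usr/bin/env bash

# Hypothetical helper, not part of the script above or of CTB.
# Assumption: geodetic grid with two root tiles, so one pixel at zoom z
# spans 180 / (tileSize * 2^z) degrees.
estimateZoom() {
    # $1    Pixel size in degrees (e.g. the "Pixel Size" reported by gdalinfo)
    # $2    Tile size in pixels (optional, defaults to 65)
    awk -v res="$1" -v ts="${2:-65}" 'BEGIN {
        z = log(180 / (ts * res)) / log(2)   # zoom where tile pixel size == raster pixel size
        zi = int(z)
        if (zi < z) zi++                     # round up so tiles are at least as fine as the source
        if (zi < 0) zi = 0                   # clamp very coarse rasters to level 0
        print zi
    }'
}

# Example: a raster with a pixel size of 0.00001 degrees
estimateZoom 0.00001 65    # prints 19
```

Treat the result as an upper bound to start from; a lower STARTLEVEL computes faster and may still be fine for your use case.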