Heyyyyy neat: creating a virtual dataset (VRT) and then using gdaladdo (1) worked!
Here's the west FL Peninsula rendered w/ nearest-neighbor, downsampled by a factor of 2048.
The gdaladdo run took about 40 minutes:
[root@thing2 west_fl_pen]# time gdaladdo -ro wv2_classif_2019-08-15.vrt 2048
0...10...20...30...40...50...60...70...80...90...100 - done.
real 41m34.946s
user 28m5.781s
sys 1m45.954s
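For reference, the .vrt itself was built beforehand with gdalbuildvrt from the individual classification GeoTIFFs; roughly like this (the input path/glob is a placeholder, following the same pattern used for the circe mosaic further down):
gdalbuildvrt wv2_classif_2019-08-15.vrt /path/to/wv2_classif_geotiffs/*.tif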
gdaladdo only gets you to the grayscale GeoTIFF, though.
To get the pseudocolor PNG above I loaded the file into a QGIS project and then did a "snapshot" w/ the CLI:
tylar@tylardesk:~/tmp$ qgis --noplugins --snapshot test.png --project ./testwv.qgz
I'm not yet sure how to automate the creation of a QGIS project.
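One option might be a standalone PyQGIS script that builds a minimal project headlessly. A rough sketch (assumes QGIS 3 Python bindings; the prefix path is environment-specific and symbology is left out):
# sketch: build a minimal .qgz project for the VRT without opening the QGIS GUI
from qgis.core import QgsApplication, QgsProject, QgsRasterLayer

QgsApplication.setPrefixPath("/usr", True)  # adjust to the local QGIS install
qgs = QgsApplication([], False)             # False = headless, no GUI
qgs.initQgis()

layer = QgsRasterLayer("wv2_classif_2019-08-15.vrt", "wv2_classif")
assert layer.isValid(), "could not load raster"

project = QgsProject.instance()
project.addMapLayer(layer)
project.write("testwv.qgz")  # project file for the --snapshot call above

qgs.exitQgis()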
Perhaps better than using QGIS for this step would be to use rasterio to colorize the GeoTIFF directly, as in the geotiff example.
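A rough sketch of what that could look like with rasterio (the class-value-to-RGB palette below is made up for illustration; swap in the real colormap):
# sketch: decimate the VRT and paint class values with an RGB palette, writing a small GeoTIFF
import numpy as np
import rasterio
from rasterio.enums import Resampling

FACTOR = 2048
PALETTE = {0: (0, 0, 0), 20: (34, 139, 34), 50: (210, 180, 140), 65: (0, 105, 148)}  # hypothetical

with rasterio.open("wv2_classif_2019-08-15.vrt") as src:
    out_h, out_w = src.height // FACTOR, src.width // FACTOR
    # nearest-neighbor decimation, same idea as the gdaladdo overview above
    classes = src.read(1, out_shape=(out_h, out_w), resampling=Resampling.nearest)
    transform = src.transform * src.transform.scale(src.width / out_w, src.height / out_h)
    crs = src.crs

rgb = np.zeros((3, out_h, out_w), dtype=np.uint8)
for value, color in PALETTE.items():
    rgb[:, classes == value] = np.array(color, dtype=np.uint8)[:, None]

with rasterio.open(
    "wv2_classif_2019-08-15_rgb.tif", "w", driver="GTiff",
    width=out_w, height=out_h, count=3, dtype="uint8",
    crs=crs, transform=transform,
) as dst:
    dst.write(rgb)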
Note also that gdal2tiles is the perfect utility for generating RGB PNG tiles from one large GeoTIFF.
Just for fun let's try making a giant mosaic:
[root@thing2 west_fl_pen]# time gdaladdo -ro -r average wv2_classif_2019-08-15.vrt 1
ERROR 6: The overview file would be larger than 4GB
but this is the largest size a TIFF can be, and BigTIFF is unavailable.
Creation failed.
Overview building failed.
real 0m0.070s
user 0m0.031s
sys 0m0.012s
[root@thing2 west_fl_pen]# time gdaladdo -ro -r average wv2_classif_2019-08-15.vrt 2
ERROR 6: The overview file would be larger than 4GB
but this is the largest size a TIFF can be, and BigTIFF is unavailable.
Creation failed.
Overview building failed.
real 0m0.039s
user 0m0.028s
sys 0m0.010s
[root@thing2 west_fl_pen]# time gdaladdo -ro -r average wv2_classif_2019-08-15.vrt 4
0...10...20...30...40...50...60...70...80...90...100 - done.
real 80m3.317s
user 61m59.031s
sys 4m2.996s
[root@thing2 ~]# ls -lh /thing2/imars-objects/west_fl_pen/wv2_classif_2019-08-15.vrt.ovr
-rw-r--r--. 1 root root 2.8G Aug 16 22:32 /thing2/imars-objects/west_fl_pen/wv2_classif_2019-08-15.vrt.ovr
Neat!
Example tile generation:
[root@thing2 west_fl_pen]# time gdal2tiles.py -p raster wv2_classif_2019-08-15.vrt.ovr wv2_classif_tiles
Generating Base Tiles:
0...10...20...30...40...50...60...70...80...90...100 - done.
Generating Overview Tiles:
0...10...20...30...40...50...60...70...80...90...100 - done.
real 8m33.702s
user 6m26.070s
sys 0m30.779s
[root@thing2 west_fl_pen]# du -sh wv2_classif_tiles/
318M wv2_classif_tiles/
This is all going surprisingly smoothly. The only remaining hurdle is to enable BigTIFF so we can mosaic at 1:1 scale.
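If external overviews honor the BIGTIFF_OVERVIEW configuration option (I believe they do, but this is untested here), the factor-1 run above might get past the 4GB limit like so:
gdaladdo -ro --config BIGTIFF_OVERVIEW YES -r average wv2_classif_2019-08-15.vrt 1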
Creating a mosaic for the circe images:
[root@cozumel imars_objects]# time gdalbuildvrt circe-ingest_tx_2019-10.vrt ./ftp-ingest/circe_ingest/*.tif
0...10...20...30ERROR 4: `./ftp-ingest/circe_ingest/WV02_20120902174937_103001001B8B5700_12SEP02174937-M1BS-500076325020_01_P017_NSF_SWTX_Map_filt_2_Cloudy.tif' not recognised as a supported file format.
Warning 1: Can't open ./ftp-ingest/circe_ingest/WV02_20120902174937_103001001B8B5700_12SEP02174937-M1BS-500076325020_01_P017_NSF_SWTX_Map_filt_2_Cloudy.tif. Skipping it
ERROR 4: `./ftp-ingest/circe_ingest/WV02_20120902174939_103001001B8B5700_12SEP02174939-M1BS-500076325020_01_P019_NSF_SWTX_Map_filt_2_Cloudy.tif' not recognised as a supported file format.
Warning 1: Can't open ./ftp-ingest/circe_ingest/WV02_20120902174939_103001001B8B5700_12SEP02174939-M1BS-500076325020_01_P019_NSF_SWTX_Map_filt_2_Cloudy.tif. Skipping it
...40...50...60...70...80...90...100 - done.
real 0m0.303s
user 0m0.209s
sys 0m0.057s
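The two "not recognised" inputs above just get skipped; a quick way to flag bad inputs up front could be a sketch like this (assumes the GDAL Python bindings are installed):
# sketch: list inputs that GDAL cannot open before building the VRT
import glob
from osgeo import gdal

gdal.UseExceptions()  # make gdal.Open raise instead of returning None
for path in sorted(glob.glob("./ftp-ingest/circe_ingest/*.tif")):
    try:
        gdal.Open(path)
    except RuntimeError as err:
        print(f"unreadable: {path} ({err})")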
[root@cozumel imars_objects]# time gdaladdo -ro circe-ingest_tx_2019-10.vrt 1
0...10...20...30...40...50...60...70...80...90...100 - done.
real 7m0.876s
user 6m36.489s
sys 0m14.799s
The resulting file is 16GB.
A quick glance at it in QGIS and it looks pretty good, although I haven't updated my colormap to match the latest. :+1:
Using gdal_merge seems to work as well, although the resulting file gets created at out.tif (gdal_merge's default when -o is omitted), and something else (what?) gets created at circe-ingest_tx_2019-10_merge.tif.
[root@cozumel imars_objects]# time gdal_merge.py -pct -init 0 circe-ingest_tx_2019-10_merge.tif ./ftp-ingest/circe_ingest/*.tif
ERROR 4: `./ftp-ingest/circe_ingest/WV02_20120902174937_103001001B8B5700_12SEP02174937-M1BS-500076325020_01_P017_NSF_SWTX_Map_filt_2_Cloudy.tif' not recognised as a supported file format.
ERROR 4: `./ftp-ingest/circe_ingest/WV02_20120902174939_103001001B8B5700_12SEP02174939-M1BS-500076325020_01_P019_NSF_SWTX_Map_filt_2_Cloudy.tif' not recognised as a supported file format.
0...10...20...30...40...50...60...70...80...90...100 - done.
real 11m51.958s
user 7m5.857s
sys 2m30.509s
Doh: my arguments weren't right last time. This is better:
[root@cozumel imars_objects]# time gdal_merge.py -pct -init 0 -of tif -ot Byte -o circe-ingest_tx_2019-10_merge_byte.jpg ./ftp-ingest/circe_ingest/*.tif
I'm trying to optimize the output image so it isn't too big to load into ArcMap; the raw BigTIFF is 18G.
I tried forcing the pixel depth with -ot Byte, expecting it might be smaller, but nope.
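Since the identify output below shows only a dozen distinct values in the image, GeoTIFF compression via creation options might shrink it dramatically; a hedged guess at the command (untested here):
gdal_merge.py -o circe-ingest_tx_2019-10_merge_byte.tif -of GTiff -ot Byte -init 0 -pct -co COMPRESS=DEFLATE -co BIGTIFF=YES ./ftp-ingest/circe_ingest/*.tif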
I tried outputting to PNG, but got an error:
Format driver png does not support creation and piecewise writing.
Please select a format that does, such as GTiff (the default) or HFA (Erdas Imagine).
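A possible workaround (untested here) would be to write the GeoTIFF first and then convert it with gdal_translate, which goes through CreateCopy and so can write PNG, size permitting:
gdal_translate -of PNG circe-ingest_tx_2019-10_merge_byte.tif circe-ingest_tx_2019-10_merge_byte.png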
Maybe we can lean on that old mainstay ImageMagick convert:
[root@cozumel imars_objects]# time convert circe-ingest_tx_2019-10_merge_byte.tif /cozumel/circe-ingest_tx_2019-10_merge_byte.png
convert: Unknown field with tag 33550 (0x830e) encountered. `TIFFReadDirectory' @ warning/tiff.c/TIFFWarnings/820.
convert: Unknown field with tag 33922 (0x8482) encountered. `TIFFReadDirectory' @ warning/tiff.c/TIFFWarnings/820.
convert: Unknown field with tag 34735 (0x87af) encountered. `TIFFReadDirectory' @ warning/tiff.c/TIFFWarnings/820.
convert: Unknown field with tag 34736 (0x87b0) encountered. `TIFFReadDirectory' @ warning/tiff.c/TIFFWarnings/820.
convert: Unknown field with tag 34737 (0x87b1) encountered. `TIFFReadDirectory' @ warning/tiff.c/TIFFWarnings/820.
convert: unable to open pixel cache `/cozumel/circe-ingest_tx_2019-10_merge_byte.png': No space left on device @ error/cache.c/OpenPixelCache/4093.
convert: memory allocation failed `/cozumel/circe-ingest_tx_2019-10_merge_byte.png' @ error/png.c/WriteOnePNGImage/8440.
real 18m12.162s
user 8m4.343s
sys 3m57.412s
...bah, not easily. That command ate up the full ~80GB of HDD space (but not RAM). I tried putting the output file on cozumel's RAID instead, but it still filled the root system disk. It might be possible to mount more space where the cache is being stored, but to do that I need to pin down where ImageMagick is putting its cache.
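If it's ImageMagick's temporary pixel cache landing on the root disk, the MAGICK_TMPDIR environment variable (and MAGICK_DISK_LIMIT) should control that, so pointing it at the RAID might be enough; e.g. (the /cozumel/tmp path is just a placeholder):
MAGICK_TMPDIR=/cozumel/tmp convert circe-ingest_tx_2019-10_merge_byte.tif /cozumel/circe-ingest_tx_2019-10_merge_byte.png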
More detail about the image produced by gdal_merge:
[root@cozumel imars_objects]# identify -verbose circe-ingest_tx_2019-10_merge_byte.tif
Image: circe-ingest_tx_2019-10_merge_byte.tif
Format: TIFF64 (Tagged Image File Format (64-bit))
Class: DirectClass
Geometry: 78097x241161+0+0
Resolution: 72x72
Print size: 1084.68x3349.46
Units: PixelsPerInch
Type: Grayscale
Base type: Grayscale
Endianess: MSB
Colorspace: Gray
Depth: 8-bit
Channel depth:
gray: 8-bit
Channel statistics:
Gray:
min: 0 (0)
max: 66 (0.258824)
mean: 13.8799 (0.0544311)
standard deviation: 24.498 (0.0960704)
kurtosis: 0.0232111
skewness: 1.36115
Colors: 12
Histogram:
13502391534: ( 0, 0, 0) #000000 gray(0,0,0)
134277303: ( 1, 1, 1) #010101 gray(1,1,1)
179070096: ( 10, 10, 10) #0A0A0A gray(10,10,10)
1052597202: ( 20, 20, 20) #141414 gray(20,20,20)
1033770155: ( 50, 50, 50) #323232 gray(50,50,50)
677571603: ( 60, 60, 60) #3C3C3C gray(60,60,60)
1623399: ( 61, 61, 61) #3D3D3D gray(61,61,61)
35267457: ( 62, 62, 62) #3E3E3E gray(62,62,62)
64867990: ( 63, 63, 63) #3F3F3F gray(63,63,63)
241259511: ( 64, 64, 64) #404040 gray(64,64,64)
1861478031: ( 65, 65, 65) #414141 gray(65,65,65)
49776336: ( 66, 66, 66) #424242 gray(66,66,66)
Rendering intent: Undefined
Gamma: 1
Interlace: None
Background color: gray(255,255,255)
Border color: gray(223,223,223)
Matte color: gray(189,189,189)
Transparent color: gray(0,0,0)
Compose: Over
Page geometry: 78097x241161+0+0
Dispose: Undefined
Iterations: 0
Compression: None
Orientation: TopLeft
Properties:
date:create: 2019-10-28T18:31:50+00:00
date:modify: 2019-10-28T18:31:50+00:00
signature: 9f3827f1c6efaceac6ae368bcc03956731c47b31e29ea2c7911a6314d6aab8bd
tiff:endian: lsb
tiff:photometric: min-is-black
tiff:rows-per-strip: 1
Artifacts:
filename: circe-ingest_tx_2019-10_merge_byte.tif
verbose: true
Tainted: False
Filesize: 18.838GB
Number pixels: 18.834G
Pixels per second: 99.63MB
User time: 98.630u
Elapsed time: 3:10.029
Version: ImageMagick 6.7.8-9 2016-06-16 Q16 http://www.imagemagick.org
identify: Unknown field with tag 33550 (0x830e) encountered. `TIFFReadDirectory' @ warning/tiff.c/TIFFWarnings/820.
identify: Unknown field with tag 33922 (0x8482) encountered. `TIFFReadDirectory' @ warning/tiff.c/TIFFWarnings/820.
identify: Unknown field with tag 34735 (0x87af) encountered. `TIFFReadDirectory' @ warning/tiff.c/TIFFWarnings/820.
identify: Unknown field with tag 34736 (0x87b0) encountered. `TIFFReadDirectory' @ warning/tiff.c/TIFFWarnings/820.
identify: Unknown field with tag 34737 (0x87b1) encountered. `TIFFReadDirectory' @ warning/tiff.c/TIFFWarnings/820.
So although this has worked well enough so far, we need to find a way to combine overlaps using the "max" function. I can't see a way to do this using gdal_merge, but I did find this discussion on how to do exactly that. Of the suggested solutions I think SAGA GIS mosaicking is going to be the best choice. The gdal-only solution there is tempting for its simplicity of deployment, but it is kludgey. Besides, we might find other uses for SAGA once it is available on our systems.
I did some more work on this and found a few things. Overlapping with "max" was accomplished using gdaladdo by manually adding the following to the .vrt file:
<VRTRasterBand dataType="Byte" band="1" subClass="VRTDerivedRasterBand">
  <PixelFunctionType>maximu</PixelFunctionType>
  <PixelFunctionLanguage>Python</PixelFunctionLanguage>
  <PixelFunctionCode><![CDATA[
import numpy as np
def maximu(in_ar, out_ar, xoff, yoff, xsize, ysize, raster_xsize,
           raster_ysize, buf_radius, gt, **kwargs):
    np.round_(np.clip(np.max(in_ar, axis=0), 0, 255),
              out=out_ar)
]]>
  </PixelFunctionCode>
This is described here. It took <30s on 5 files.
[tylar@tylar-pc gdal_mosaic_reduce]$ export GDAL_VRT_ENABLE_PYTHON=YES
[tylar@tylar-pc gdal_mosaic_reduce]$ time gdaladdo -ro circe_ingest-tx-2019_12-edited.vrt 1
Warning: Overview with subsampling factor of 1 requested. This will copy the full resolution dataset in the overview!
0...10...20...30...40...50...60...70...80...90...100 - done.
real 0m31.613s
user 0m26.524s
sys 0m4.998s
solution documented at https://gist.github.com/7yl4r/d03f9617212db5efded1f8a0d34550d3
I just tried running gdaladdo for the SE TX region and found that gdaladdo is RAM-limited. My RAM and swap both filled quickly and then the process was killed. We are going to need to try this on a supercomputer. I don't know how to estimate the amount of RAM needed; I am worried it could be in the terabytes.
Also worth noting that gdaladdo seems to be single-threaded; not ideal.
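Before reaching for a supercomputer, one knob that might be worth checking is GDAL's raster block cache, which can be capped per run; whether the cache is actually the culprit here is unverified (value is in MB, and se_tx.vrt is a placeholder name for the SE TX mosaic VRT):
gdaladdo -ro --config GDAL_CACHEMAX 4096 se_tx.vrt 1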
I'm now attempting to use gdalwarp. I ran a test using 70 images and am now attempting it with all of the images in SE_TX.
[tylar@tylar-pc CETX]$ gdalwarp images/WV*tif -r max /mnt/texas_central/gdalwarp_output.tif
Creating output file that is 327131P x 457567L.
Processing images/WV02_20100101171245_1030010003392800_10JAN01171245-M1BS-500058889180_01_P001_CETX_SOALCHI_filt_3.tif [1/1016] : 0...10...20..
...
A few wv03 images threw an "operation not permitted" error and I moved them from /images/ to /mved/.
I just ran gdalwarp for the texas_central region:
[tylar@tylar-pc CETX]$ gdalwarp images/WV*tif -r max /mnt/texas_central/gdalwarp_output.tif
That covered 1016 images and took a few days.
The final file is 140G. Here's a screenshot:
I suspect the tile differences would be less pronounced if the overlaps were combined using the "max" function.
I thought that was what gdalwarp was doing, but I think the -r argument actually refers to resampling (scaling and stretching of images), not how overlapping images are combined.
This is not so important now that we have GEE working to mosaic on the fly.
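Still, if we ever revisit the gdalwarp approach, setting a nodata value might get closer to the intended behavior, since gdalwarp skips source nodata pixels when painting later images over earlier ones. A sketch (untested on this dataset, and assuming 0 is the fill value):
gdalwarp -srcnodata 0 -dstnodata 0 images/WV*tif /mnt/texas_central/gdalwarp_output_nodata.tif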
I think it is about time we start talking about how to mosaic all these geotiffs. For now let's focus on smushing all the images into one, and we can discuss more complex mosaics later.
user needs/wants
Firstly I want to make sure I understand what users would want given no technical limitations. Here are my starting assumptions; I will update as we discuss further:
...is that really it?
technical considerations
solution comparison/discussion
Now I will list resources/options I have found, sorted by my preference, with pro/con (:green_heart:, :broken_heart:) lists.