Closed kendonB closed 3 years ago
Could you make the tif files available somehow?
Not sure whether c
is the right tool for this: c
has an along
argument that lets you combine along any dimension, but in the sense of gluing, like your example, not like compositing, as you describe. Maybe sth like this?
library(stars)
# Loading required package: abind
# Loading required package: sf
# Linking to GEOS 3.7.0, GDAL 2.4.0, PROJ 5.2.0
st_vrt = function(..., options = character(0)) {
dst = tempfile(fileext = ".tif")
lst_write = function(obj) {
fname = tempfile(fileext = ".tif")
write_stars(obj, fname)
fname
}
src = sapply(list(...), lst_write)
gdal_utils("buildvrt", src, dst, options)
#sapply(src, unlink)
ret = read_stars(dst)
#unlink(dst)
ret
}
tif = system.file("tif/L7_ETMs.tif", package = "stars")
x = read_stars(tif)
x1 = x[,5:50, 7:80,]
x2 = x[,10:150, 5:20,]
x3 = x[,80:90, 100:115,]
x4 = x[,3:253, 22:33,]
x5 = x[,33:44, 55:66,]
x6 = x[,9:90, 8:80,]
out = st_vrt(x1,x2,x3,x4,x5,x6)
warnings()
out[[1]][out[[1]]==0] = NA
out
# stars object with 3 dimensions and 1 attribute
# attribute(s):
# file652e2bae051d.tif
# Min. : 11.00
# 1st Qu.: 46.00
# Median : 61.00
# Mean : 62.22
# 3rd Qu.: 75.00
# Max. :255.00
# NA's :109314
# dimension(s):
# from to offset delta refsys point values
# x 1 251 288833 28.5 +proj=utm +zone=25 +south... NA NULL [x]
# y 1 111 9120647 -28.5 +proj=utm +zone=25 +south... NA NULL [y]
# band 1 6 NA NA NA NA NULL
plot(x[,,,1], reset = FALSE)
plot(out[,,,1], col = sf.colors(9), add = TRUE, breaks = seq(10,255,length.out=10))
Whoops I intended to add the files. Added above now.
That function works quite well!
Only thing is it looks like it adds 0s in some bits where they should be NAs:
library(stars)
#> Loading required package: abind
#> Loading required package: sf
#> Linking to GEOS 3.6.1, GDAL 2.2.3, PROJ 4.9.3
library(raster)
#> Loading required package: sp
library(tidyverse)
#> Registered S3 methods overwritten by 'ggplot2':
#> method from
#> [.quosures rlang
#> c.quosures rlang
#> print.quosures rlang
library(cowplot)
#>
#> Attaching package: 'cowplot'
#> The following object is masked from 'package:ggplot2':
#>
#> ggsave
st_vrt = function(..., options = character(0)) {
dst = tempfile(fileext = ".tif")
lst_write = function(obj) {
fname = tempfile(fileext = ".tif")
write_stars(obj, fname)
fname
}
src = sapply(list(...), lst_write)
gdal_utils("buildvrt", src, dst, options)
#sapply(src, unlink)
ret = read_stars(dst)
#unlink(dst)
ret
}
stars_objects = list.files("C:/Users/bellk/Downloads/elevation_nz", full.names = TRUE) %>%
str_subset(".tif") %>%
map(read_stars)
merged = st_vrt(stars_objects[[1]], stars_objects[[2]])
merged = merged %>%
as("Raster") %>%
raster::aggregate(fact = 30) %>%
st_as_stars()
stars_objects = stars_objects %>%
map(as, "Raster") %>%
map(raster::aggregate, fact = 30) %>%
map(st_as_stars)
plot1 = ggplot() +
geom_stars(data = stars_objects[[1]]) +
geom_stars(data = stars_objects[[2]]) +
coord_equal() +
theme(legend.position = "none",
axis.title = element_blank(),
axis.text = element_blank(),
axis.ticks = element_blank(),
axis.line = element_blank()) +
ggtitle("objects 1 and 2")
plot2 = ggplot() +
geom_stars(data = merged) +
coord_equal() +
theme(legend.position = "none",
axis.title = element_blank(),
axis.text = element_blank(),
axis.ticks = element_blank(),
axis.line = element_blank()) +
ggtitle("st_vrted")
plot_grid(plot1, plot2, nrow = 1)
Created on 2019-08-16 by the reprex package (v0.3.0)
Thanks! Specifying options = c("-vrtnodata", "-9999")
in the call to st_vrt
seems to solve the zero problem.
Would st_mosaic
be the better name for this function?
Would
st_mosaic
be the better name for this function?
That'd be nice.
The file name comes in as the layer name when reading tifs. I think it would be good to create a temporary directory then write the new tif inside that. You could expose name
as an argument with layer
as the default which would write layer.tif
. What do you think?
Should the unlink
call be in an on.exit?
Hi @edzer, I've tried to use st_moisac()
lately and stumbled across two things:
st_moisac()
function does not work on the outputs of read_ncdf()
. It is important in the example files, because read_stars()
does not preserves NA values.
You can find the files at mod.zip.library(stars)
files = dir(full.names = TRUE, pattern = "*.hdf$")
# approach 1 --------------------------------------------------------------
ncdf_list_out = lapply(files, read_ncdf, var = "Npp_1km")
out1 = st_mosaic(ncdf_list_out)
# approach 2 --------------------------------------------------------------
ncdf1 = read_ncdf(files[1])
ncdf2 = read_ncdf(files[2])
ncdf3 = read_ncdf(files[3])
out2 = st_mosaic(ncdf1, ncdf2, ncdf3)
# approach 3 --------------------------------------------------------------
stars_list_out = lapply(files, read_stars, sub = "Npp_1km")
out3 = st_mosaic(stars_list_out)
# approach 4 --------------------------------------------------------------
stars1 = read_stars(files[1])
stars2 = read_stars(files[2])
stars3 = read_stars(files[3])
out2 = st_mosaic(stars1, stars2, stars3)
# works!
plot(out2)
do.call
? do.call(st_mosaic, stars_list_out)
Maybe we should add an example in the help of `st_mosaic of its use.
read_ncdf
:
> ncdf1 = read_ncdf(files[1])
Error in nc_meta.character(.x) :
failed to open 'x', value given was: "./MOD17A3.A2014001.h31v09.055.2016008121113.hdf"
and
gdal_utils("info", "MOD17A3.A2014001.h31v09.055.2016008121113.hdf")
also does not recognise these as NetCDF files (nor does the ncdump
utility)
(because they are HDF4 not HDF5 , still used in lots of streams - NetCDF-4 is derived from HDF5, but HDF4 is the old legacy form)
gdalinfo MOD17A3.A2014001.h31v09.055.2016008121113.hdf
Driver: HDF4/Hierarchical Data Format Release 4
Files: MOD17A3.A2014001.h31v09.055.2016008121113.hdf
Size is 512, 512
Coordinate System is `'
Hi @edzer,
do.call()
. However, I think users would prefer that st_mosaic()
just working on the list of file names (if that's possible). read_ncdf()
. > library(stars)
Loading required package: abind
Loading required package: sf
Linking to GEOS 3.7.1, GDAL 2.3.2, PROJ 5.2.0
>
> files = dir(full.names = TRUE, pattern = "*.hdf$")
>
> ncdf1 = read_ncdf(files[1], var = "Npp_1km")
Warning messages:
1: In .get_nc_projection(meta$attribute, rep_var, all_coord_var) :
No projection information found in nc file.
2: ignoring unrecognized unit: kg_C/m^2
> ncdf1
stars object with 2 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
Npp_1km
Min. :0.00
1st Qu.:0.81
Median :0.88
Mean :0.81
3rd Qu.:0.93
Max. :1.25
NA's :92473
dimension(s):
from to offset delta refsys point values
XDim:MOD_Grid_MOD17A3 1 1200 0.5 1 NA NA NULL [x]
YDim:MOD_Grid_MOD17A3 1 1200 0.5 1 NA NA NULL [y]
sf::st_sf()
(and data.frame
, for that sake). What if you'd give two lists to st_mosaic
? Or a vector of data source names? And so on.> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Fedora 30 (Workstation Edition)
Matrix products: default
BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] stars_0.4-0 sf_0.8-0 abind_1.4-5
loaded via a namespace (and not attached):
[1] Rcpp_1.0.2 rstudioapi_0.10 magrittr_1.5 units_0.6-4 tidyselect_0.2.5
[6] R6_2.4.0 rlang_0.4.1.9000 dplyr_0.8.3 tools_3.6.1 parallel_3.6.1
[11] grid_3.6.1 packrat_0.5.0 KernSmooth_2.23-15 ncmeta_0.2.0 e1071_1.7-2
[16] DBI_1.0.0 class_7.3-15 assertthat_0.2.1 lifecycle_0.1.0 tibble_2.1.3
[21] lwgeom_0.1-7 RNetCDF_2.1-1 crayon_1.3.4 tidyr_1.0.0 purrr_0.3.3
[26] vctrs_0.2.0.9007 zeallot_0.1.0 glue_1.3.1 compiler_3.6.1 pillar_1.4.2
[31] backports_1.1.5 classInt_0.4-1 pkgconfig_2.0.3
And st_drivers()
:
name | long_name | write | copy | is_raster | is_vector | vsi | |
---|---|---|---|---|---|---|---|
PCIDSK | PCIDSK | PCIDSK Database File | TRUE | FALSE | TRUE | TRUE | TRUE |
netCDF | netCDF | Network Common Data Format | TRUE | TRUE | TRUE | TRUE | FALSE |
JP2OpenJPEG | JP2OpenJPEG | JPEG-2000 driver based on OpenJPEG library | FALSE | TRUE | TRUE | TRUE | TRUE |
JPEG2000 | JPEG2000 | JPEG-2000 part 1 (ISO/IEC 15444-1), based on Jasper library | FALSE | TRUE | TRUE | TRUE | TRUE |
Geospatial PDF | TRUE | TRUE | TRUE | TRUE | TRUE | ||
MBTiles | MBTiles | MBTiles | TRUE | TRUE | TRUE | TRUE | TRUE |
ESRI Shapefile | ESRI Shapefile | ESRI Shapefile | TRUE | FALSE | FALSE | TRUE | TRUE |
MapInfo File | MapInfo File | MapInfo File | TRUE | FALSE | FALSE | TRUE | TRUE |
UK .NTF | UK .NTF | UK .NTF | FALSE | FALSE | FALSE | TRUE | TRUE |
OGR_SDTS | OGR_SDTS | SDTS | FALSE | FALSE | FALSE | TRUE | TRUE |
S57 | S57 | IHO S-57 (ENC) | TRUE | FALSE | FALSE | TRUE | TRUE |
DGN | DGN | Microstation DGN | TRUE | FALSE | FALSE | TRUE | TRUE |
OGR_VRT | OGR_VRT | VRT - Virtual Datasource | FALSE | FALSE | FALSE | TRUE | TRUE |
REC | REC | EPIInfo .REC | FALSE | FALSE | FALSE | TRUE | FALSE |
Memory | Memory | Memory | TRUE | FALSE | FALSE | TRUE | FALSE |
BNA | BNA | Atlas BNA | TRUE | FALSE | FALSE | TRUE | TRUE |
CSV | CSV | Comma Separated Value (.csv) | TRUE | FALSE | FALSE | TRUE | TRUE |
NAS | NAS | NAS - ALKIS | FALSE | FALSE | FALSE | TRUE | TRUE |
GML | GML | Geography Markup Language (GML) | TRUE | FALSE | FALSE | TRUE | TRUE |
GPX | GPX | GPX | TRUE | FALSE | FALSE | TRUE | TRUE |
LIBKML | LIBKML | Keyhole Markup Language (LIBKML) | TRUE | FALSE | FALSE | TRUE | TRUE |
KML | KML | Keyhole Markup Language (KML) | TRUE | FALSE | FALSE | TRUE | TRUE |
GeoJSON | GeoJSON | GeoJSON | TRUE | FALSE | FALSE | TRUE | TRUE |
ESRIJSON | ESRIJSON | ESRIJSON | FALSE | FALSE | FALSE | TRUE | TRUE |
TopoJSON | TopoJSON | TopoJSON | FALSE | FALSE | FALSE | TRUE | TRUE |
Interlis 1 | Interlis 1 | Interlis 1 | TRUE | FALSE | FALSE | TRUE | TRUE |
Interlis 2 | Interlis 2 | Interlis 2 | TRUE | FALSE | FALSE | TRUE | TRUE |
OGR_GMT | OGR_GMT | GMT ASCII Vectors (.gmt) | TRUE | FALSE | FALSE | TRUE | TRUE |
GPKG | GPKG | GeoPackage | TRUE | TRUE | TRUE | TRUE | TRUE |
SQLite | SQLite | SQLite / Spatialite | TRUE | FALSE | FALSE | TRUE | TRUE |
OGR_DODS | OGR_DODS | OGR_DODS | FALSE | FALSE | FALSE | TRUE | FALSE |
ODBC | ODBC | ODBC | TRUE | FALSE | FALSE | TRUE | FALSE |
WAsP | WAsP | WAsP .map format | TRUE | FALSE | FALSE | TRUE | TRUE |
PGeo | PGeo | ESRI Personal GeoDatabase | FALSE | FALSE | FALSE | TRUE | FALSE |
MSSQLSpatial | MSSQLSpatial | Microsoft SQL Server Spatial Database | TRUE | FALSE | FALSE | TRUE | FALSE |
OGR_OGDI | OGR_OGDI | OGDI Vectors (VPF, VMAP, DCW) | FALSE | FALSE | FALSE | TRUE | FALSE |
PostgreSQL | PostgreSQL | PostgreSQL/PostGIS | TRUE | FALSE | FALSE | TRUE | FALSE |
MySQL | MySQL | MySQL | TRUE | FALSE | FALSE | TRUE | FALSE |
OpenFileGDB | OpenFileGDB | ESRI FileGDB | FALSE | FALSE | FALSE | TRUE | TRUE |
XPlane | XPlane | X-Plane/Flightgear aeronautical data | FALSE | FALSE | FALSE | TRUE | TRUE |
DXF | DXF | AutoCAD DXF | TRUE | FALSE | FALSE | TRUE | TRUE |
CAD | CAD | AutoCAD Driver | FALSE | FALSE | TRUE | TRUE | TRUE |
Geoconcept | Geoconcept | Geoconcept | TRUE | FALSE | FALSE | TRUE | TRUE |
GeoRSS | GeoRSS | GeoRSS | TRUE | FALSE | FALSE | TRUE | TRUE |
GPSTrackMaker | GPSTrackMaker | GPSTrackMaker | TRUE | FALSE | FALSE | TRUE | TRUE |
VFK | VFK | Czech Cadastral Exchange Data Format | FALSE | FALSE | FALSE | TRUE | FALSE |
PGDUMP | PGDUMP | PostgreSQL SQL dump | TRUE | FALSE | FALSE | TRUE | TRUE |
OSM | OSM | OpenStreetMap XML and PBF | FALSE | FALSE | FALSE | TRUE | TRUE |
GPSBabel | GPSBabel | GPSBabel | TRUE | FALSE | FALSE | TRUE | FALSE |
SUA | SUA | Tim Newport-Peace's Special Use Airspace Format | FALSE | FALSE | FALSE | TRUE | TRUE |
OpenAir | OpenAir | OpenAir | FALSE | FALSE | FALSE | TRUE | TRUE |
OGR_PDS | OGR_PDS | Planetary Data Systems TABLE | FALSE | FALSE | FALSE | TRUE | TRUE |
WFS | WFS | OGC WFS (Web Feature Service) | FALSE | FALSE | FALSE | TRUE | TRUE |
WFS3 | WFS3 | OGC WFS 3 client (Web Feature Service) | FALSE | FALSE | FALSE | TRUE | FALSE |
HTF | HTF | Hydrographic Transfer Vector | FALSE | FALSE | FALSE | TRUE | TRUE |
AeronavFAA | AeronavFAA | Aeronav FAA | FALSE | FALSE | FALSE | TRUE | TRUE |
Geomedia | Geomedia | Geomedia .mdb | FALSE | FALSE | FALSE | TRUE | FALSE |
EDIGEO | EDIGEO | French EDIGEO exchange format | FALSE | FALSE | FALSE | TRUE | TRUE |
GFT | GFT | Google Fusion Tables | TRUE | FALSE | FALSE | TRUE | FALSE |
SVG | SVG | Scalable Vector Graphics | FALSE | FALSE | FALSE | TRUE | TRUE |
CouchDB | CouchDB | CouchDB / GeoCouch | TRUE | FALSE | FALSE | TRUE | FALSE |
Cloudant | Cloudant | Cloudant / CouchDB | TRUE | FALSE | FALSE | TRUE | FALSE |
Idrisi | Idrisi | Idrisi Vector (.vct) | FALSE | FALSE | FALSE | TRUE | TRUE |
ARCGEN | ARCGEN | Arc/Info Generate | FALSE | FALSE | FALSE | TRUE | TRUE |
SEGUKOOA | SEGUKOOA | SEG-P1 / UKOOA P1/90 | FALSE | FALSE | FALSE | TRUE | TRUE |
SEGY | SEGY | SEG-Y | FALSE | FALSE | FALSE | TRUE | TRUE |
XLS | XLS | MS Excel format | FALSE | FALSE | FALSE | TRUE | FALSE |
ODS | ODS | Open Document/ LibreOffice / OpenOffice Spreadsheet | TRUE | FALSE | FALSE | TRUE | TRUE |
XLSX | XLSX | MS Office Open XML spreadsheet | TRUE | FALSE | FALSE | TRUE | TRUE |
ElasticSearch | ElasticSearch | Elastic Search | TRUE | FALSE | FALSE | TRUE | FALSE |
Walk | Walk | Walk | FALSE | FALSE | FALSE | TRUE | FALSE |
Carto | Carto | Carto | TRUE | FALSE | FALSE | TRUE | FALSE |
AmigoCloud | AmigoCloud | AmigoCloud | TRUE | FALSE | FALSE | TRUE | FALSE |
SXF | SXF | Storage and eXchange Format | FALSE | FALSE | FALSE | TRUE | TRUE |
Selafin | Selafin | Selafin | TRUE | FALSE | FALSE | TRUE | TRUE |
JML | JML | OpenJUMP JML | TRUE | FALSE | FALSE | TRUE | TRUE |
PLSCENES | PLSCENES | Planet Labs Scenes API | FALSE | FALSE | TRUE | TRUE | FALSE |
CSW | CSW | OGC CSW (Catalog Service for the Web) | FALSE | FALSE | FALSE | TRUE | FALSE |
VDV | VDV | VDV-451/VDV-452/INTREST Data Format | TRUE | FALSE | FALSE | TRUE | TRUE |
GMLAS | GMLAS | Geography Markup Language (GML) driven by application schemas | FALSE | TRUE | FALSE | TRUE | TRUE |
MVT | MVT | Mapbox Vector Tiles | TRUE | FALSE | FALSE | TRUE | TRUE |
TIGER | TIGER | U.S. Census TIGER/Line | TRUE | FALSE | FALSE | TRUE | TRUE |
AVCBin | AVCBin | Arc/Info Binary Coverage | FALSE | FALSE | FALSE | TRUE | TRUE |
AVCE00 | AVCE00 | Arc/Info E00 (ASCII) Coverage | FALSE | FALSE | FALSE | TRUE | TRUE |
HTTP | HTTP | HTTP Fetching Wrapper | FALSE | FALSE | TRUE | TRUE | FALSE |
Would it be an option to have an use_mosaic
as a logical argument in read_stars
, or alternatively make along
in read_stars
(and c.stars
) accept "use_mosaic"
?
Right now I am doing
x <- lapply(hgtfiles, read_stars)
dem <- do.call("st_mosaic", x)
but would prefer doing
dem <- read_stars(hgtfiles, along = "use_mosaic")
or
dem <- read_stars(hgtfiles, use_mosaic = TRUE)
What do you think?
BTW: Could you add st_mosaic
to the see also list of c.stars
?
Looking back at how things are implemented, and imagining that users will want to do this on large files, I think I'm going to go back to what @Nowosad proposed: having multiple ways to invoke mosaic. I suggest to make st_mosaic
a generic, and have a method that works on file names only; in your workflow that would become
dem = read_stars(st_mosaic(hgtfiles))
This would avoid reading all hgtfiles
first, then writing them to disk before creating the mosaic, and might save a lot of memory.
... and that should now work.
works nicely with .vrt too, so no copy of the data required with read_stars(vrt, proxy = TRUE)
Very nice!
However, the st_mosaic
method for character should return the name of the dst
file (Otherwise sf::gdal_utils
return a TRUE
invisibly). Something like this:
st_mosaic.character = function(.x, ..., dst = tempfile(fileext = file_ext),
options = c("-vrtnodata", "-9999"), file_ext = ".tif") {
sf::gdal_utils("buildvrt", .x, dst, options)
dst
}
There are two cases to consider. The input stars objects either have:
One approach could be to:
A real life example that currently fails (case 2.):
elevation_nz.zip
Created on 2019-08-14 by the reprex package (v0.3.0)