Open qw845602 opened 2 years ago
Hey @qw845602, could you minimize the example? I.e.:
GDALRasterSource("path/to/tiff").rasterExtent
The other thing is GDAL Error Code: 4: it can mean many things, so could you post a stack trace here as well? It usually shows further down which function caused the problem.
Also, for context from Gitter: there is a good chance that GDAL is improperly installed or java.library.path is improperly set, and it may all be connected.
What is a stack trace? Should I just indicate which function causes the problem or where the error occurs?
The stack trace is the actual error output, including the function call stack; you already sent it in Gitter.
Ok, here it is:
Caused by: geotrellis.raster.gdal.MalformedDataException: Unable to construct dataset dimensions. GDAL Error Code: 4
at geotrellis.raster.gdal.GDALDataset$.$anonfun$dimensions$1(GDALDataset.scala:160)
at geotrellis.raster.gdal.GDALDataset$.$anonfun$dimensions$1$adapted(GDALDataset.scala:157)
at geotrellis.raster.gdal.GDALDataset$.errorHandler$extension(GDALDataset.scala:406)
at geotrellis.raster.gdal.GDALDataset$.dimensions$extension1(GDALDataset.scala:157)
at geotrellis.raster.gdal.GDALDataset$.rasterExtent$extension1(GDALDataset.scala:197)
at geotrellis.raster.gdal.GDALRasterSource.gridExtent$lzycompute(GDALRasterSource.scala:93)
at geotrellis.raster.gdal.GDALRasterSource.gridExtent(GDALRasterSource.scala:93)
at geotrellis.raster.RasterMetadata.extent(RasterMetadata.scala:52)
at geotrellis.raster.RasterMetadata.extent$(RasterMetadata.scala:52)
at geotrellis.raster.RasterSource.extent(RasterSource.scala:43)
at geotrellis.spark.RasterSummary$.$anonfun$collect$1(RasterSummary.scala:108)
at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:194)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:62)
at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
at org.apache.spark.scheduler.Task.run(Task.scala:131)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.lang.Thread.run(Thread.java:834)
The error was uploaded in bugreport2. I couldn't find the GDALRasterSource("path/to/tiff").rasterExtent function, so I used GDALRasterSource("path/to/tiff").dimensions instead. The error is "[1 of 1000] FAILURE(3) CPLE_OpenFailed(4) "Open failed." /geo/file/raster/RS/Landsat/L71149033_03320030531_B10.TIF no such file or directory". It seems that GDAL could not find the TIFF. Note that the TIFF file is stored in HDFS.
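For readers trying to decode the number, the 4 in the log line above matches CPLE_OpenFailed in GDAL's CPLErrorNum enumeration. Below is a minimal lookup sketch of the first few codes from GDAL's cpl_error.h. The object name CplError and the describe helper are hypothetical, and it is an assumption that the number in the GeoTrellis exception message is always one of these codes.

```scala
// Subset of GDAL's CPLErrorNum values (cpl_error.h).
// CplError and describe are illustrative names, not part of GDAL or GeoTrellis.
object CplError {
  val names: Map[Int, String] = Map(
    0  -> "CPLE_None",
    1  -> "CPLE_AppDefined",
    2  -> "CPLE_OutOfMemory",
    3  -> "CPLE_FileIO",
    4  -> "CPLE_OpenFailed",
    5  -> "CPLE_IllegalArg",
    6  -> "CPLE_NotSupported",
    7  -> "CPLE_AssertionFailed",
    8  -> "CPLE_NoWriteAccess",
    9  -> "CPLE_UserInterrupt",
    10 -> "CPLE_ObjectNull"
  )

  // Turn a numeric code into a readable label; unknown codes are reported as such.
  def describe(code: Int): String =
    names.getOrElse(code, s"unknown GDAL error code $code")
}
```

Calling CplError.describe(4) yields "CPLE_OpenFailed", which points at a path or driver problem rather than at the raster contents.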
@qw845602 well, that's a different issue: you'd need a GDAL build with HDFS support; I don't believe it's enabled by default.
Also, even if this approach works, it may add overhead from the extra JVM that GDAL creates to establish the HDFS connection.
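To illustrate why a bare path fails here: GDAL opens a plain path on the local file system and reaches remote storage only through its virtual file system prefixes such as /vsis3/ and /vsihdfs/. The sketch below shows that mapping under stated assumptions. GdalPathSketch and toGdalPath are hypothetical names, and this is not the parsing code GeoTrellis itself uses.

```scala
// Illustrative only: maps a URI-style path to the form GDAL's virtual
// file systems expect. GdalPathSketch and toGdalPath are hypothetical names.
object GdalPathSketch {
  def toGdalPath(path: String): String =
    if (path.startsWith("s3://"))
      "/vsis3/" + path.stripPrefix("s3://")   // becomes /vsis3/bucket/key
    else if (path.startsWith("hdfs://"))
      "/vsihdfs/" + path                      // only works with a GDAL build that has HDFS support
    else if (path.startsWith("file://"))
      path.stripPrefix("file://")             // plain local path
    else
      path                                    // bare paths are opened on the local file system
}
```

With this mapping, a bare path like /geo/file/raster/... is passed through unchanged, so GDAL looks for it on the local disk of whichever node runs the task, which is consistent with the "no such file or directory" message above.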
Can GDALRasterSource read TIFFs directly from disk in a Spark cluster? Does the TIFF need to be placed on each node at the same location? I remember that HadoopGeoTiffRDD could not read a TIFF from local disk in Spark cluster mode.
@qw845602 yes, both HadoopGeoTiffRDD and GDALRasterSource can read files directly from cluster-local disks; in that case you'd need copies of the files on every node.
We've never encountered these issues since we were relying mostly on S3 storage, and GDAL supports S3 reads by default.
Yeah, I have tried to read from the cluster-local disks; the code and error are shown in bugreport3. The error is "java.lang.IllegalArgumentException: requirement failed: x-aligned: offset by CellSize". The same error also occurs when the code is run in local mode, as I mentioned before in the Gitter thread.
@qw845602 could you post minimized code that reproduces requirement failed: x-aligned: offset by CellSize? I believe this is related to tiling to layout though, not to reading.
@qw845602 let me summarize:
1. There are problems with the GDAL installation on the cluster. The TIFFs are located on HDFS, so they are problematic to read, and that definitely explains the GDAL Error 4 that you had.
2. The solution to that is to have GDAL with HDFS support installed on all nodes, configured so that GDAL has access to HDFS.
3. When trying local reads, GDALRasterSource works; however, you experience issues when tiling to layout.
- There are indeed some issues related to GDAL reads and tiling, captured here: LayoutTileSource.requireGridAligned is failing with GDALRasterSource #3292
4. The initial reason why GDALRasterSource is used is related to:
- Too large file sizes, which may trigger Read single-band TIFF files, large than 2G #3065
- Too large segments (all TIFFs you work with are striped): Potential Issue With GeoTiff Reading in the Future due to too large segments dimensions #1691
Is it a correct summary?
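For context on the "x-aligned" message: it appears to come from a check that the raster's pixel grid lines up with the layout's grid. The following is a rough, library-independent sketch of that idea, not the code of LayoutTileSource.requireGridAligned. Grid, fractionalOffset, isAligned, and the 1% tolerance are all hypothetical choices made for illustration.

```scala
// Illustrative sketch of a grid-alignment test; not GeoTrellis code.
object GridAlignmentSketch {
  // Hypothetical minimal grid description: origin (xmin, ymin) plus cell size.
  final case class Grid(xmin: Double, ymin: Double, cellWidth: Double, cellHeight: Double)

  // Fractional part of the distance between two origins, measured in cells.
  def fractionalOffset(a: Double, b: Double, cellSize: Double): Double = {
    val cells = (a - b) / cellSize
    cells - math.floor(cells)
  }

  // True when the source origin sits on a cell boundary of the layout grid,
  // within the given tolerance (expressed as a fraction of one cell).
  def isAligned(source: Grid, layout: Grid, tolerance: Double = 0.01): Boolean = {
    def near(offset: Double): Boolean = offset < tolerance || offset > 1.0 - tolerance
    near(fractionalOffset(source.xmin, layout.xmin, layout.cellWidth)) &&
      near(fractionalOffset(source.ymin, layout.ymin, layout.cellHeight))
  }
}
```

For example, with a layout grid starting at (0, 0) and 10 x 10 cells, a source whose origin is (30, 20) is aligned, while one whose origin is (35, 20) is offset by half a cell in x and is not.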
1 and 2 are correct. For summary 3, I am not quite sure whether it is related to tiling to layout. I found that a layout scheme needs to be specified in https://github.com/pomadchin/vlm-performance/blob/feature/gt-3.x/src/main/scala/geotrellis/contrib/performance/IngestRasterSource.scala#L52:L59. I only know two types of layout scheme, ZoomedLayoutScheme and FloatingLayoutScheme. Since the TIFF needs to be processed as a pyramid, I chose FloatingLayoutScheme. Are there any other ways to create a "TileLayerRDD[SpatialKey]" using GDALRasterSource? I have read the link in summary 3, but I have not found a solution there. For summary 4, yes, the TIFF file is very large, several hundred GB, but I am not quite sure about the reason. It hits an ArrayIndexOutOfBoundsException when using HadoopGeoTiffRDD.
@qw845602 yea, 3. is exactly about it :+1:
I'm afraid there are no quick or easy solutions to your problem: either figure out the GDAL issues and get really deep into them, or use GDAL to convert the TIFFs into tiled and compressed TIFFs: gdal_translate in.tif out.tif -co TILED=YES -co COMPRESS=LZW
The latter would not hurt to try, at least to check that it works as expected with your data.
I have translated the TIFF using the command gdal_translate in.tif out.tif -co TILED=YES -co COMPRESS=LZW; however, the error "java.lang.IllegalArgumentException: requirement failed: x-aligned: offset by CellSize" still occurs. It is so strange.
@qw845602 is it by using non GDAL reads? Try it without GDAL
I have uploaded the translated TIFF as well as the code and error in bugreport4. I have also tried ZoomedLayoutScheme, but it causes the same error, so I don't know how to deal with the layout scheme.
How do I try it without GDAL?
An error occurred while uploading bugreport4; it is now uploaded successfully. Does this mean that I need to translate the TIFF that caused the ArrayIndexOutOfBoundsException and see whether it can be read by HadoopGeoTiffRDD?
@qw845602 yes, you may try HadoopGeoTiffRDD, but you can also replace GDALRasterSource with RasterSource; it will use a non-GDAL underlying reader.
Yeah, it works with the command "gdal_translate in.tif out.tif -co BIGTIFF=YES -co TILED=YES -co COMPRESS=LZW". After translating the TIFF, I can read it as an RDD using the function hadoopGeoTiffRDD.
@pomadchin Hello, I'm currently working on using the geotrellis-server project to publish a WMTS service. I'm providing a data link as the source: "file:///E:/Geotrellis/Tiles/attributes?layers=tiles&zoom=10&band_count=1". Under this path, I have pre-cut tile data using Geotrellis.
I'm using Scala 2.12.8, Geotrellis 3.6.1, and GDAL 3.0.4 on Windows. My stack trace is as follows:
17:36:31.296 [raster-io-0] DEBUG geotrellis.server.ogc.Main - GetCapabilities: /?SERVICE=WMS&REQUEST=GetCapabilities
17:36:31.369 [raster-io-0] ERROR org.http4s.server.service-errors - Error servicing request: GET / from 127.0.0.1
geotrellis.raster.gdal.MalformedDataException: Unable to construct dataset dimensions. GDAL Error Code: 4
at geotrellis.raster.gdal.GDALDataset$.$anonfun$dimensions$1(GDALDataset.scala:160)
at geotrellis.raster.gdal.GDALDataset$.$anonfun$dimensions$1$adapted(GDALDataset.scala:157)
at geotrellis.raster.gdal.GDALDataset$.errorHandler$extension(GDALDataset.scala:422)
at geotrellis.raster.gdal.GDALDataset$.dimensions$extension1(GDALDataset.scala:157)
at geotrellis.raster.gdal.GDALDataset$.rasterExtent$extension1(GDALDataset.scala:197)
at geotrellis.raster.gdal.GDALRasterSource.gridExtent$lzycompute(GDALRasterSource.scala:93)
at geotrellis.raster.gdal.GDALRasterSource.gridExtent(GDALRasterSource.scala:93)
at geotrellis.server.ogc.wms.CapabilitiesView$.$anonfun$modelAsLayer$2(CapabilitiesView.scala:277)
at scala.collection.immutable.List.map(List.scala:293)
at geotrellis.server.ogc.wms.CapabilitiesView$.$anonfun$modelAsLayer$1(CapabilitiesView.scala:265)
at map @ geotrellis.server.ogc.wms.CapabilitiesView$.modelAsLayer(CapabilitiesView.scala:264)
at mapN @ geotrellis.server.ogc.wms.CapabilitiesView$.modelAsLayer(CapabilitiesView.scala:291)
at mapN @ geotrellis.server.ogc.wms.CapabilitiesView$.modelAsLayer(CapabilitiesView.scala:291)
at map @ geotrellis.server.ogc.wms.CapabilitiesView.toXML(CapabilitiesView.scala:111)
at flatMap @ geotrellis.server.ogc.wms.WmsView.$anonfun$responseFor$5(WmsView.scala:142)
at delay @ io.chrisdavenport.log4cats.slf4j.internal.Slf4jLoggerInternal$Slf4jLogger.$anonfun$debug$4(Slf4jLoggerInternal.scala:68)
at delay @ io.chrisdavenport.log4cats.slf4j.internal.Slf4jLoggerInternal$Slf4jLogger.isDebugEnabled(Slf4jLoggerInternal.scala:50)
at ifM$extension @ io.chrisdavenport.log4cats.slf4j.internal.Slf4jLoggerInternal$Slf4jLogger.info(Slf4jLoggerInternal.scala:76)
at >>$extension @ geotrellis.server.ogc.wms.WmsView.responseFor(WmsView.scala:141)
at sequence @ org.http4s.HttpRoutes$.$anonfun$of$2(HttpRoutes.scala:79)
at defer @ org.http4s.HttpRoutes$.$anonfun$of$1(HttpRoutes.scala:79)
at $anonfun$combineK$1 @ org.http4s.syntax.KleisliResponseOps.$anonfun$orNotFound$1(KleisliSyntax.scala:49)
at getOrElse @ org.http4s.syntax.KleisliResponseOps.$anonfun$orNotFound$1(KleisliSyntax.scala:49)
at defer @ org.http4s.server.blaze.Http1ServerStage$$anon$2.run(Http1ServerStage.scala:200)
at flatMap @ org.http4s.server.blaze.Http1ServerStage$$anon$2.run(Http1ServerStage.scala:202)
[1 of 1000] FAILURE(3) CPLE_OpenFailed(4) "Open failed." `/E:/Geotrellis/Tiles/attributes?layers=tiles&zoom=10&band_count=1' does not exist in the file system, and is not recognized as a supported dataset name.
How can I solve this problem? Thank you very much!
Describe the bug
Cannot read the Tiff file by GDALRasterSource. Unable to construct dataset dimensions. GDAL Error Code: 4
To Reproduce
Provide as able:
Expected behavior
A clear and concise description of what you expected to happen.
Screenshots
If applicable, add screenshots to help explain your problem.
Environment
CentOS Linux release 7.9.2009 (Core)
Java version:
java version "11.0.12" 2021-07-20 LTS
Java(TM) SE Runtime Environment 18.9 (build 11.0.12+8-LTS-237)
Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.12+8-LTS-237, mixed mode)
Scala version:
2.12.8
GeoTrellis version:
3.5.2
Additional context
Add any other context about the problem here. bugreport.zip bugreport2.zip bugreport3.zip bugreport4.zip