locationtech / geotrellis

GeoTrellis is a geographic data processing engine for high performance applications.
http://geotrellis.io

Handling very large TIF Files and converting them to an internal layer id #2774

Open tetmonis opened 6 years ago

tetmonis commented 6 years ago

Several issues here address the problem of loading very large TIF Files, on the order of 10-20 GB. I would like to know whether the following approach is viable in this context: I am able to load a sizeable TIF File of 14 GB into a TileLayerRDD[SpatialKey]. Why would it not be possible to load another TileLayerRDD[SpatialKey] into another TIF File and then merge the two? Is there any way to combine these two TileLayerRDDs into a single larger one, either by expanding the first TileLayerRDD with the second one or by another mechanism? I would assume the same resolution and a uniform tile size for both files. Is there any way to combine them into a single bigger TileLayerRDD, which could then be saved to an internal LayerID with HadoopLayerWriter? I searched for a solution for some time, but it seems that either nobody has thought of this approach or it is simply too trivial to mention. Regards, Martin

pomadchin commented 5 years ago

Hey @tetmonis, can you clarify what you mean by "load another TileLayerRDD[SpatialKey] into another TIF File"? It would also be useful if you could provide some pseudocode of how you want it to work.

tetmonis commented 5 years ago

Hi Grigory,

I solved my problem very easily, but the solution wasn’t obvious. I wanted to read several different TIF files into a single TileLayerRDD. The solution is to place all TIF files in a single directory and then call HadoopGeoTiffRDD on that directory.

Surprisingly, it simply merges all the TIFF files into a single RDD. Great. You could document this feature somewhere, even in #2774.

Regards

-Martin

val inputpath1 = new Path("hdfs://muchdpprdsh01/org/prp-p670-mind/Geotrellis/KatRisk/RAW/")
implicit val sparkContext = sc
val rr = implicitly[geotrellis.spark.io.RasterReader[geotrellis.spark.io.hadoop.HadoopGeoTiffRDD.Options, (geotrellis.vector.ProjectedExtent, geotrellis.raster.MultibandTile)]]
implicit val rasterreader = rr
val rdd1 = geotrellis.spark.io.hadoop.HadoopGeoTiffRDD.apply(inputpath1, geotrellis.spark.io.hadoop.HadoopGeoTiffRDD.Options(maxTileSize = Some(1000), chunkSize = Some(1024)))
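For anyone landing on this issue with the original question (several large TIF files into one layer, saved under a LayerId with HadoopLayerWriter), the full path might look roughly like the sketch below. It is only a sketch, assuming the GeoTrellis 2.x package layout (geotrellis.spark.io.hadoop) and a SparkContext already in scope. The object name, the function name, the parameters, the 512 tile size, and zoom level 0 are placeholders I made up for illustration, not anything prescribed by the library.

```scala
import geotrellis.raster.MultibandTile
import geotrellis.spark._
import geotrellis.spark.io._
import geotrellis.spark.io.hadoop._
import geotrellis.spark.io.index.ZCurveKeyIndexMethod
import geotrellis.spark.tiling._
import geotrellis.vector.ProjectedExtent
import org.apache.hadoop.fs.Path
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Hypothetical helper, not part of GeoTrellis.
object IngestTiffDirectory {

  // inputDir:  directory holding all the TIF files (assumed same CRS and resolution)
  // catalog:   root path of the layer catalog to write to
  // layerName: name for the resulting layer
  def ingest(inputDir: Path, catalog: Path, layerName: String)
            (implicit sc: SparkContext): Unit = {

    // Reading a directory yields one RDD with the windows of every file in it.
    val source: RDD[(ProjectedExtent, MultibandTile)] =
      HadoopGeoTiffRDD.spatialMultiband(
        inputDir,
        HadoopGeoTiffRDD.Options(maxTileSize = Some(1000), chunkSize = Some(1024)))

    // Derive one layout that covers the combined extent of all files.
    // The tile size of 512 is an arbitrary choice for this sketch.
    val (_, metadata) =
      source.collectMetadata[SpatialKey](FloatingLayoutScheme(512))

    // Cut the windows to that layout; pieces that share a SpatialKey are merged.
    val tiled: RDD[(SpatialKey, MultibandTile)] =
      source.tileToLayout(metadata.cellType, metadata.layout)

    val layer = MultibandTileLayerRDD(tiled, metadata)

    // Zoom 0 is a placeholder; a floating layout has no pyramid level of its own.
    HadoopLayerWriter(catalog)
      .write(LayerId(layerName, 0), layer, ZCurveKeyIndexMethod)
  }
}
```

With an implicit SparkContext in scope, this would be called as IngestTiffDirectory.ingest(inputpath1, new Path(...), "some-layer"), where the catalog path and the layer name are up to you. Whether maxTileSize and chunkSize need tuning for 10-20 GB inputs depends on the cluster, so treat those values as a starting point only.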

pomadchin commented 5 years ago

@tetmonis I agree with you, thanks for your feedback. Will keep this issue open and mark it with a docs label