Excessive RAM usage with GDAL warp and translate in versions 3.6.0 and greater

Ijaswanth82 commented 2 months ago

What is the bug?

When GDAL warp and translate are put under load (approximately 25 simultaneous executions), the memory usage of the process rises very steeply and becomes unpredictable. I am combining multiple TIFF files to get a single raster of the area of interest by specifying a BBOX, and using gdal_translate to convert it to JPEG/PNG. The final images are on the order of 4-5 MB, and even with 25 simultaneous requests the memory usage of the process that runs the warp exceeds 4 GB. My application interacts with GDAL via the Java bindings.

Steps to reproduce the issue

Running gdal warp and gdal_translate from approximately 25 concurrent threads via the Java bindings reproduces the issue.

Versions and provenance

The application runs in an Alpine Linux Docker image, and the container runs on Amazon Linux 2. I faced the same issue with all GDAL versions >= 3.6.0.

Additional context

My application is a REST-based service that serves rasters and uses GDAL to manipulate TIFF files for geospatial operations.

jratike80 commented 2 months ago

Do you mean that with GDAL versions <3.6.0 the RAM usage is low?

I believe that next someone will ask you to give more details about what you are doing. Think of it this way: can someone reproduce my use case with the information that I have given?

rouault commented 2 months ago

As is, this report lacks the details needed to be actionable:

Is this a regression compared to previous GDAL versions? If so, it would help immensely if you could identify the precise version where the behavior changed, and even better, if you could "git bisect" down to the offending commit
Do you set the GDAL_NUM_THREADS configuration option? It can increase RAM consumption
Please share the output of gdalinfo on a typical TIFF file
If it is indeed a regression and you can't identify the offending commit yourself, then you'll likely have to produce a ready-made minimal reproducer for us to be able to investigate. This topic is too complicated for us to guess

Ijaswanth82 commented 2 months ago

I have not tested with older versions (<3.6.0), as I assumed that any memory issues would have been fixed in later versions.

Do you mean that with GDAL versions <3.6.0 the RAM usage is low?

I believe that next someone will ask you to give more details about what you are doing. Think of it this way: can someone reproduce my use case with the information that I have given?

Ijaswanth82 commented 2 months ago

As is, this report lacks the details needed to be actionable:

Is this a regression compared to previous GDAL versions? If so, it would help immensely if you could identify the precise version where the behavior changed, and even better, if you could "git bisect" down to the offending commit

Do you set the GDAL_NUM_THREADS configuration option? It can increase RAM consumption

Please share the output of gdalinfo on a typical TIFF file

If it is indeed a regression and you can't identify the offending commit yourself, then you'll likely have to produce a ready-made minimal reproducer for us to be able to investigate. This topic is too complicated for us to guess

1. The behaviour remained the same in all the versions we used (v3.6.0 to v3.9.2). There is no precise version where a change in behaviour was observed.

2. GDAL_NUM_THREADS is not set explicitly. Only GDAL_CACHEMAX is set, to 10%.

3. Sample gdalinfo output:

Driver: GTiff/GeoTIFF
Files:
Size is 4000, 4000
Coordinate System is:
PROJCRS["WGS 84 / UTM zone 32N",
    BASEGEOGCRS["WGS 84",
        DATUM["World Geodetic System 1984",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4326]],
    CONVERSION["UTM zone 32N",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",0,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",9,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",0.9996,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",500000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",0,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["(E)",east,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["(N)",north,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Navigation and medium accuracy spatial referencing."],
        AREA["Between 6°E and 12°E, northern hemisphere between equator and 84°N, onshore and offshore. Algeria. Austria. Cameroon. Denmark. Equatorial Guinea. France. Gabon. Germany. Italy. Libya. Liechtenstein. Monaco. Netherlands. Niger. Nigeria. Norway. Sao Tome and Principe. Svalbard. Sweden. Switzerland. Tunisia. Vatican City State."],
        BBOX[0,6,84,12]],
    ID["EPSG",32632]]
Data axis to CRS axis mapping: 1,2
Origin = (373844.038312499993481,5679831.800914060324430)
Pixel Size = (0.142000000000000,-0.142000000000000)
Metadata:
  TIFFTAG_SOFTWARE=UltraMap Aerial v5.5.1 (Build 38.2.2204.1201)
  AREA_OR_POINT=Area
Image Structure Metadata:
  LAYOUT=COG
  COMPRESSION=DEFLATE
  INTERLEAVE=PIXEL
  PREDICTOR=2
Corner Coordinates:
Upper Left  (  373844.038, 5679831.801) (  7d11'31.57"E, 51d15'21.13"N)
Lower Left  (  373844.038, 5679263.801) (  7d11'32.29"E, 51d15' 2.75"N)
Upper Right (  374412.038, 5679831.801) (  7d12' 0.85"E, 51d15'21.58"N)
Lower Right (  374412.038, 5679263.801) (  7d12' 1.57"E, 51d15' 3.20"N)
Center      (  374128.038, 5679547.801) (  7d11'46.57"E, 51d15'12.17"N)
Band 1 Block=128x128 Type=UInt16, ColorInterp=Red
  NoData Value=0
  Overviews: 2000x2000, 1000x1000, 500x500, 250x250
Band 2 Block=128x128 Type=UInt16, ColorInterp=Green
  NoData Value=0
  Overviews: 2000x2000, 1000x1000, 500x500, 250x250
Band 3 Block=128x128 Type=UInt16, ColorInterp=Blue
  NoData Value=0
  Overviews: 2000x2000, 1000x1000, 500x500, 250x250
Band 4 Block=128x128 Type=UInt16, ColorInterp=Undefined
  NoData Value=0
  Overviews: 2000x2000, 1000x1000, 500x500, 250x250

4. As said above, there is no regression.

jratike80 commented 2 months ago

What do you mean by "approximately 25 simultaneous executions"? Do you start that many gdalwarp processes with your code?

rouault commented 2 months ago

As your use case involves multi-threading, you might want to rule out a potential issue with RAM fragmentation mentioned at https://gdal.org/en/latest/user/multithreading.html#ram-fragmentation-and-multi-threading

Ijaswanth82 commented 2 months ago

What do you mean by "approximately 25 simultaneous executions"? Do you start that many gdalwarp processes with your code?

My application is a REST-based application that uses GDAL in the backend to serve every request. By running 25 simultaneous executions I am trying to simulate load. In short, my application's code starts that many gdalwarp processes to serve that many requests.

jratike80 commented 2 months ago

So if I understand right, it is not multithreading but running 25 gdalwarp or gdal_translate programs at the same time. I guess that in that case the operating system allocates memory for each GDAL program, and because they are started from Java, Java also adds some memory requirements into the mix. I would expect the memory consumption to grow linearly in the beginning. Have you tested with 1, 2, 3, 4.... processes?

I am not a programmer and I do not know whether your approach is a good one, but I feel that maybe it is not. Perhaps you should warp or translate with Java code instead of starting GDAL programs.

I remember doing something similar by starting several simultaneous gdalwarp programs with a script on Windows. At that time I noticed that it did not help to start more gdalwarps than I had physical processor cores, and even that might be too many if the bottleneck was not processing but the speed of the file system.

Ijaswanth82 commented 2 months ago

So if I understand right, it is not multithreading but running 25 gdalwarp or gdal_translate programs at the same time. I guess that in that case the operating system allocates memory for each GDAL program, and because they are started from Java, Java also adds some memory requirements into the mix. I would expect the memory consumption to grow linearly in the beginning. Have you tested with 1, 2, 3, 4.... processes?

I am not a programmer and I do not know whether your approach is a good one, but I feel that maybe it is not. Perhaps you should warp or translate with Java code instead of starting GDAL programs.

I remember doing something similar by starting several simultaneous gdalwarp programs with a script on Windows. At that time I noticed that it did not help to start more gdalwarps than I had physical processor cores, and even that might be too many if the bottleneck was not processing but the speed of the file system.

I am actually running warp and translate from Java code as part of my application, not starting GDAL programs. I only described it as executing multiple gdalwarps for reference. In reality my application launches multiple threads, and in each of them gdal warp is executed via GDAL JNI calls from Java.

To give more context: for every request to my application there are 1-5 gdal warp executions and 1 gdal_translate execution in the backend. Since my application is in Java, some heap memory is allocated for it (2 GB in our case). The problem is that memory used by the warp and translate functions is not part of the JVM heap, since we are calling GDAL's native C/C++ code. During our load test the application's memory usage exceeded 4 GB, even though the JVM heap limit is set to 2 GB. This additional memory is allocated by the native warp and translate code, which at the moment we have no control over. We have also observed that only about 400 MB of the 4 GB is occupied JVM heap. I am looking for any suggestion to bring down the non-JVM (non-heap) memory usage so that my application does not run into out-of-memory errors.
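The gap between JVM heap and total process memory described above can be checked from inside the application. The following is a minimal sketch, assuming a Linux host where /proc/self/status is readable (as in an Alpine container); the class and method names are hypothetical and not part of GDAL or its Java bindings. It compares the used heap with the resident set size (RSS) of the process. The difference is only a rough indicator of native memory, since RSS also covers JVM metaspace, thread stacks and other non-GDAL allocations.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class NativeMemoryProbe {
    /** Parses a "VmRSS:   123456 kB" line; returns the value in kB, or -1 if absent. */
    public static long parseVmRssKb(List<String> statusLines) {
        for (String line : statusLines) {
            if (line.startsWith("VmRSS:")) {
                // Fields are separated by tabs/spaces: "VmRSS:", value, "kB"
                String[] parts = line.trim().split("\\s+");
                return Long.parseLong(parts[1]);
            }
        }
        return -1;
    }

    /** Resident set size of this process in kB, or -1 when /proc is unavailable. */
    public static long currentRssKb() {
        try {
            return parseVmRssKb(Files.readAllLines(Paths.get("/proc/self/status")));
        } catch (IOException e) {
            return -1; // not Linux, or /proc not mounted
        }
    }

    /** JVM heap currently in use, in kB. */
    public static long heapUsedKb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / 1024;
    }

    public static void main(String[] args) {
        long rss = currentRssKb();
        long heap = heapUsedKb();
        System.out.println("heap used (kB): " + heap);
        System.out.println("process RSS (kB): " + rss);
        if (rss >= 0) {
            // Rough estimate only: everything resident that is not used heap.
            System.out.println("approx. non-heap (kB): " + (rss - heap));
        }
    }
}
```

Logging these two numbers before and after a batch of requests would show whether the growth is in the heap or in native allocations.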

Ijaswanth82 commented 2 months ago

As your use case involves multi-threading, you might want to rule out a potential issue with RAM fragmentation mentionned at https://gdal.org/en/latest/user/multithreading.html#ram-fragmentation-and-multi-threading

I have looked into this section and tried building tcmalloc, but my OS is Alpine and I ran into build issues. I will try once more.

pjonsson commented 2 months ago

Do you have any indication that it should be possible to run 25 concurrent computations in the amount of RAM you have available? I run 8 concurrent GDAL commands in the shell and my RAM usage for them is 0.5-1.2 GB per command.

I don't know how the JNI bindings work, but you can try tuning the GDAL_CACHEMAX value.

As @jratike80 is saying, you are most likely to just get churn when you go beyond the number of cores available for processing. If you make a thread/worker pool in Java with a fixed number of threads, and then have your Java threads submit jobs to that pool, you should be able to get bounded resource consumption for your application. There are a number of variations of worker pools available depending on which Java version you use, so with a modern Java you should be able to find something that suits your application.
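A minimal sketch of the fixed-size worker pool idea, using java.util.concurrent from the standard library. The class name, the method name and the Thread.sleep placeholder are hypothetical; in a real application the placeholder would be replaced by the warp and translate calls for one request. The sketch submits 25 jobs but never runs more than poolSize of them at once, and reports the peak concurrency it observed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedWarpPool {
    /**
     * Runs `requests` placeholder jobs on a pool of `poolSize` threads and
     * returns the highest number of jobs that were running at the same time.
     */
    public static int peakConcurrency(int requests, int poolSize) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int i = 0; i < requests; i++) {
                futures.add(pool.submit(() -> {
                    int now = running.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    try {
                        // Hypothetical stand-in for the native work of one
                        // request (the warps plus the translate).
                        Thread.sleep(20);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        running.decrementAndGet();
                    }
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // wait for completion and surface job failures
            }
        } finally {
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.SECONDS);
        }
        return peak.get();
    }

    public static void main(String[] args) throws Exception {
        // Size the pool by available cores, as suggested in the thread.
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("peak concurrent jobs: " + peakConcurrency(25, cores));
    }
}
```

With this shape, extra requests wait in the pool's queue instead of each starting native work immediately, so peak native memory should scale with the pool size rather than with the number of incoming requests.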

OSGeo / gdal