mmomtchev / node-gdal-async

Node.js bindings for GDAL (Geospatial Data Abstraction Library) with full async support
https://mmomtchev.github.io/node-gdal-async/
Apache License 2.0

`rasterizeAsync()` leaves incomplete output file #132

Closed tastott closed 6 months ago

tastott commented 6 months ago

Version: 3.8.3
OS: Windows and Linux

When using gdal.rasterizeAsync() to convert GeoJSON to GeoTIFF, the output file is incomplete until the program ends. After the program ends, the output file appears to be correct. Perhaps there is an unflushed buffer somewhere?

const geojson = await gdal.openAsync('./input.geojson');
await gdal.rasterizeAsync('./output.tif', geojson, [ ... ]);

const tifFileStats = await fs.stat('./output.tif');
tifFileStats.size // file size indicates that tif file is incomplete at this point

I haven't encountered the same problem with other operations that create a raster image, e.g. translate() / translateAsync().

Here's a minimal node project which demonstrates the issue: https://github.com/tastott/gdal-async-rasterize-issue

Really useful project by the way, thanks!

tastott commented 6 months ago

I found a workaround, which is to flush the output dataset yourself. I haven't had to do this with translate() / translateAsync(), so I'm guessing it's not intended that the output dataset from rasterize() / rasterizeAsync() is "unflushed"?

const geojson = await gdal.openAsync('./input.geojson');
const output = await gdal.rasterizeAsync('./output.tif', geojson, [ ... ]); // Now assigning return value to `output`

let tifFileStats = await fs.stat('./output.tif');
tifFileStats.size // file size indicates that tif file is incomplete at this point

// Flush output dataset yourself
await output.flushAsync();
output.close();

tifFileStats = await fs.stat('./output.tif');
tifFileStats.size // file size now indicates that tif file is complete

mmomtchev commented 6 months ago

This is a GDAL issue. I don't think it is explicitly specified whether the library-call versions of the utilities return with their output flushed. Generally, no GDAL method flushes by default, so translate must be the exception. It is quite possible that even with translate the behavior depends on the output format. As a rule, you should not expect the output to be flushed, because GDAL caches extensively.
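
Given that advice, one way to avoid depending on implicit flushing is to wrap the workaround from earlier in the thread in a small helper. The following is a minimal sketch, not part of the library: `flushAndClose` and `rasterizeToFile` are hypothetical names, and `gdal` is passed in as a parameter so the sketch does not assume how the module is imported. It only uses calls that already appear in this thread (`rasterizeAsync()`, `flushAsync()`, `close()`).

```javascript
// Sketch only: the helper names below are hypothetical, not gdal-async API.

/**
 * Flush a dataset's cached data to disk, then close it.
 * The dataset is closed even if the flush rejects.
 */
async function flushAndClose(dataset) {
  try {
    await dataset.flushAsync();
  } finally {
    dataset.close();
  }
}

/**
 * Rasterize `source` into `destination` and resolve only after the
 * output dataset has been flushed and closed.
 * `gdal` is the gdal-async module object, supplied by the caller.
 */
async function rasterizeToFile(gdal, destination, source, args) {
  const output = await gdal.rasterizeAsync(destination, source, args);
  await flushAndClose(output);
  return destination;
}
```

With this in place, a call such as `await rasterizeToFile(gdal, './output.tif', geojson, args)` followed by `fs.stat('./output.tif')` should observe the complete file, assuming the explicit flush behaves as reported in the workaround above. The same `flushAndClose` helper could presumably be applied to the dataset returned by other utilities if they turn out not to flush for a given output format.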