Input a TIFF from local buffer without uploading file

Ottozz commented 5 months ago

Hello, first of all thanks for the great tool! I'm having success using it as shown in the demo, by uploading a file in an <input type='file'> tag which basically generates an instance of a File object. The Tiff opens up instantly even if it is quite heavy (~220MB).

My use case now is to read the same files from a local folder (in an Electron app) without uploading them, reading them through Node's fs or Sharp's .toBuffer() method instead. I'm struggling to make it work with decent performance, since the library only accepts a File object or a URL.

After several attempts and conversions (Buffers, ArrayBuffers, Files, Blobs, ...) I was able to get it mostly working by doing something like this:

In Main Process:

const buffer = await sharp(sharpInput, {limitInputPixels:false}).tiff({tile:true, pyramid:true}).toBuffer();
return buffer.toString('base64');

In Renderer Process:

/* call to Sharp to generate tiled TIFF and get the base64 string */
const base64 = await window.electron.send("build-tiled-tif", filePath); //this takes ~1.5 seconds

/* convert base64 to blob */
const blob = b64toBlob(base64, "image/tiff"); //this takes ~5 seconds

/* create File instance to be passed to GeoTIFFTileSource */
const file = new File([blob], "TEST.tif"); //this takes ~0.3 seconds

/* call to GeoTIFFTileSource and open viewer */
const tiffTileSources = await OpenSeadragon.GeoTIFFTileSource.getAllTileSources(file);
viewer.open(tiffTileSources);

The same applies by replacing Sharp call with something like:

const buffer = await fs.readFile(filePath);
return buffer.toString('base64');

As you can see, performance is much worse than selecting the same file through the <input type='file'> tag (almost instant vs. ~7 seconds). Am I overcomplicating things here? Is there a better way to do it?

pearcetm commented 5 months ago

I'm not surprised that doing it this way in Electron takes a long time - you basically have to load the entire giant file into memory before doing anything else. The beauty of GeoTIFF.js is that it only requests and reads specific locations within the file; the GeoTIFFTileSource makes use of that to request specifically the image data needed for the current view. So, a whole lot less data is read at once.

A couple of thoughts:

1) Nothing is actually uploaded in the demo when you use <input type='file'> - it just reads the file object locally. So, you could load files using the same input type on the renderer side of your electron app and it should work just the same as in the demo. Of course, this doesn't let you do anything to pick files programmatically...

2) Because of the requirement for interprocess communication in Electron, it won't be straightforward to pass an object created in the main process over to your renderer.

3) However, you could set up your main process to essentially serve the files equivalently to a GeoTIFF server. One way could be to use GeoTIFF.js (https://geotiffjs.github.io/) on the Node side to read the specific portions of the file and only send those back to the renderer. You'd have to adapt the client side code (i.e. this library) to work with that however. Or, you could somehow set your main process up to support HTTP range requests and forward those somehow...

I'm just brainstorming here. You'd definitely have quite a bit of work to do to make it happen.

I guess in the short term, I'd recommend having users pick files to view using an <input type='file'> element directly in your renderer, unless there's a compelling reason this would break your app. Again, nothing is actually uploaded.
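To make the range-request idea from point 3 a bit more concrete, here is a minimal sketch of the main-process side, assuming plain Node.js. The helpers parseRangeHeader and readRange are hypothetical names invented for illustration; they are not part of this library, GeoTIFF.js, or Electron, and the sketch only handles a single "bytes=start-end" range (suffix ranges like "bytes=-500" are rejected).

```javascript
// Sketch only: parseRangeHeader and readRange are hypothetical helpers,
// not part of GeoTIFFTileSource, geotiff.js, or Electron.
const { open } = require("node:fs/promises");

/**
 * Parse a single-range header such as "bytes=0-1023" or "bytes=500-".
 * Returns { start, end } (both inclusive), or null if the header is unusable.
 */
function parseRangeHeader(header, fileSize) {
  const match = /^bytes=(\d+)-(\d*)$/.exec(header || "");
  if (!match) return null;
  const start = Number(match[1]);
  // An open-ended range runs to the last byte; clamp oversized ends to the file size.
  const end = match[2] === ""
    ? fileSize - 1
    : Math.min(Number(match[2]), fileSize - 1);
  if (start > end) return null;
  return { start, end };
}

/** Read only bytes [start, end] of the file instead of loading the whole file. */
async function readRange(filePath, start, end) {
  const handle = await open(filePath, "r");
  try {
    const length = end - start + 1;
    const { bytesRead, buffer } = await handle.read(
      Buffer.alloc(length), 0, length, start
    );
    return buffer.subarray(0, bytesRead);
  } finally {
    await handle.close();
  }
}
```

How the renderer's requests would reach these helpers (an IPC channel, a custom protocol, or a small local HTTP server) is left open here, as in the brainstorm above.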

Ottozz commented 5 months ago

Hello @pearcetm, thanks for the reply! I see, it actually does not make much sense to load all of the file into memory. My use case is actually very specific and I need to show several Tif files stored in a local folder one after the other as the user goes up and down with the arrow keys (or by manually clicking). Letting the user select each file every time is not viable unfortunately.

I had better luck storing the File objects in a keyval IndexedDB store and using it as a sort of cache, which allowed them to open almost instantly. But I ended up dropping this approach, as it is not really a good solution with many (50+) large files (~80-200 MB).

In the end I was able to overcome the slowness by modifying the library: when the input is not an instance of File, it now calls geotiff's fromArrayBuffer() instead of fromUrl(). Files of ~200 MB now take 1.5 to 2.5 seconds to open.
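The kind of input dispatch described here could look roughly like the sketch below. pickGeoTiffFactory is a hypothetical helper written for illustration and is not the library's actual code; it only returns the name of the geotiff.js factory (fromBlob, fromArrayBuffer or fromUrl) that would suit each input type.

```javascript
// Hypothetical helper: returns the name of the geotiff.js factory to call
// for a given input. This is an illustration, not the library's real logic.
function pickGeoTiffFactory(input) {
  // File extends Blob, so this branch also covers <input type='file'> results.
  if (typeof Blob !== "undefined" && input instanceof Blob) return "fromBlob";
  // Raw bytes already in memory, e.g. received over Electron IPC.
  if (input instanceof ArrayBuffer) return "fromArrayBuffer";
  // Anything given as a string is treated as a URL.
  if (typeof input === "string") return "fromUrl";
  throw new TypeError("Unsupported input for a TIFF source");
}
```

Assuming GeoTIFF is the imported geotiff.js namespace, a caller could then write something like `const tiff = await GeoTIFF[pickGeoTiffFactory(input)](input);`.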

Main Process:

return await sharp(sharpInput, {limitInputPixels:false}).tiff({tile:true, pyramid:true}).toBuffer();

Renderer Process:

const updateScanImage = async (filePath) => {
  /* close the viewer */
  viewer.close();

  /* Build tiled tiff */
  const typedArray = await window.electron.sharp("build-tiled-tif", filePath);
  const arrayBuffer = typedArray.buffer.slice(typedArray.byteOffset, typedArray.byteLength + typedArray.byteOffset);

  /* generate tiffTileSource and open viewer */
  const tiffTileSources = await OpenSeadragon.GeoTIFFTileSource.getAllTileSources(arrayBuffer);
  viewer.open(tiffTileSources);
}
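The slice on typedArray.buffer in the snippet above matters because a typed array (or Node Buffer) can be a view onto a larger backing ArrayBuffer, so passing .buffer directly could hand over extra bytes. A small self-contained illustration, with toExactArrayBuffer as a hypothetical helper name:

```javascript
// Hypothetical helper, equivalent to the slice used in the snippet above:
// copies exactly the bytes covered by the view into a new ArrayBuffer.
function toExactArrayBuffer(view) {
  return view.buffer.slice(view.byteOffset, view.byteOffset + view.byteLength);
}

// A 4-byte view placed 8 bytes into a 16-byte backing store.
const backing = new ArrayBuffer(16);
const view = new Uint8Array(backing, 8, 4);
view.set([73, 73, 42, 0]); // "II" + 42: little-endian TIFF magic bytes

const exact = toExactArrayBuffer(view);
console.log(view.buffer.byteLength); // 16 (the whole backing store)
console.log(exact.byteLength);       // 4  (only the bytes in the view)
```

Note that slice copies the bytes, so for a ~200 MB file this briefly doubles the memory held for that image.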

However, I now hit Electron's white screen of death (a renderer crash) after opening several different files (5, 6, 7... it varies a bit), even though I close the viewer every time a new image is selected. I investigated possible memory leaks and the app's resource usage (I used this as a base for debugging) but didn't find anything relevant.

I guess I should just give up and investigate your proposed solution #3 further.

Ottozz commented 4 months ago

As a follow-up, I kept investigating the issue and believe the crash is caused by resources not being deallocated when the imageTileSource is replaced (after viewer.close()). This causes a massive spike in memory consumption as new images are loaded and crashes the renderer process once a certain threshold is reached (see the image below for reference).

I'm not totally sure whether this is strictly related to this library or to OpenSeadragon itself, but maybe it's worth investigating a bit further.

pearcetm commented 4 months ago

Interesting. I also am not sure whether this is due to something here in this library, or in OpenSeadragon (I see you've opened an issue over there: https://github.com/openseadragon/openseadragon/issues/2531), or with application-specific code which is holding onto a reference somehow.

One thing you might try is pointing the viewer at some TIFFs at remote URLs, like in the demo page. Does memory keep growing when you load images that way? In fact, does the demo page show memory growth if you keep loading and viewing files? That might help narrow things down.

abiswas97 commented 4 months ago

@Ottozz I've faced this spike as well, and I found the geotiff Pool to be the culprit. Currently, each call of new GeoTIFFTileSource creates its own Pool. These do not get reused when new images are opened - each newly opened image generates a new set of Pools, and they keep accumulating.

@pearcetm this is also addressed in the PR by making the geotiff pool a static property of the class, instead of being an instance property. Instead of de-allocating and recreating the Pool, I just reuse the existing set.
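As a rough illustration of the difference between an instance-level and a class-level (static) pool, here is a self-contained sketch. FakePool is a stand-in that only counts constructions; it is not geotiff.js's Pool, and PerInstanceSource / SharedPoolSource are hypothetical classes, not the code in the PR.

```javascript
// FakePool stands in for geotiff.js's Pool so the sketch runs anywhere;
// it only counts how many pools have been constructed.
class FakePool {
  static created = 0;
  constructor() {
    FakePool.created += 1;
  }
}

// One pool per tile source: every new image allocates another pool.
class PerInstanceSource {
  constructor() {
    this.pool = new FakePool();
  }
}

// One pool for the whole class: created lazily, then reused by every source.
class SharedPoolSource {
  static sharedPool = null;
  constructor() {
    if (SharedPoolSource.sharedPool === null) {
      SharedPoolSource.sharedPool = new FakePool();
    }
    this.pool = SharedPoolSource.sharedPool;
  }
}

for (let i = 0; i < 5; i++) new PerInstanceSource();
const afterPerInstance = FakePool.created;

for (let i = 0; i < 5; i++) new SharedPoolSource();
const afterShared = FakePool.created - afterPerInstance;

console.log(afterPerInstance, afterShared); // 5 1
```

With the static variant, the number of pools stays constant no matter how many images are opened, which is the behaviour described above.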

pearcetm commented 4 months ago

> this is also addressed in the PR by making the geotiff pool a static property of the class, instead of being an instance property. Instead of de-allocating and recreating the Pool, I just reuse the existing set.

Thanks for helping identify the problem here. Re-using the existing Pool seems like a fine idea, but I'm a little confused as to why the resources used by a Pool wouldn't get garbage collected as usual if all references to a particular tile source are released.

pearcetm / GeoTIFFTileSource

Input a TIFF from local buffer without uploading file #9