Closed by ASHISRAVINDRAN 6 years ago
Hi Adrian, the review comments are incorporated and pushed.
Hi @imagejan, since we're on the topic, could you shed some light on an issue I discovered while testing @ASHISRAVINDRAN's PR?
I tested the HDF5 export command with big TIFF files. I loaded a 20 GB TIFF stack of shape (548, 3913, 4635) into Fiji with Bio-Formats and tried to export it to HDF5. Unfortunately, Fiji crashed silently, so I passed JVM options to produce a heap dump on OOM errors. I got the heap dump, analyzed it, and it looks like Fiji really does try to load the whole 20 GB into memory. I would expect the ImgPlus to stream the stack slice by slice internally, but instead it creates a list of 10003840 instances of net.imglib2.img.cell.Cell, each containing 1000 shorts (uint16), which is roughly 20 GB.
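As a sanity check on those numbers (the cell count and cell size are taken from the heap dump; the arithmetic below just confirms they are consistent with the stack dimensions):

```java
public class MemoryFootprint {
    public static void main(String[] args) {
        // Stack dimensions from above: (548, 3913, 4635), uint16 = 2 bytes per voxel.
        long voxels = 548L * 3913L * 4635L;        // 9,938,941,740 voxels
        long rawBytes = voxels * 2L;               // ~19.9 GB of raw pixel data

        // Heap dump: 10,003,840 Cell instances holding up to 1000 shorts each.
        long cellBytes = 10_003_840L * 1000L * 2L; // ~20.0 GB upper bound (edge cells are smaller)

        System.out.printf("raw pixels: %.1f GB, cells (upper bound): %.1f GB%n",
                rawBytes / 1e9, cellBytes / 1e9);
    }
}
```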
Since I wasn't able to hit https://github.com/ilastik/ilastik4ij/blob/master/src/main/java/org/ilastik/ilastik4ij/IlastikExport.java#L60 while remote debugging, it seems this happens internally in one of Fiji's pre-processing steps, before the command gets executed. I'm pretty sure there is a simple way of streaming the file slice by slice; for example, it can easily be done with the Bio-Formats API:
import loci.formats.IFormatReader;
import loci.formats.ImageReader;

IFormatReader reader = new ImageReader();
reader.setId("/media/Math-Clinic/brainForSegmentationOneSide.tif");
for (int series = 0; series < reader.getSeriesCount(); series++) {
    reader.setSeries(series);
    for (int image = 0; image < reader.getImageCount(); image++) {
        // only the current plane is read into memory
        byte[] bytes = reader.openBytes(image);
        ...
    }
}
reader.close();
Do you know what needs to be done in order to stream the file slice by slice using the ImgPlus interface? Thanks
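For reference, one possible direction (a sketch only, not tested against this PR; the SCIFIOConfig/ImgMode API is from SCIFIO and may differ across versions, and the path is a placeholder) would be to ask SCIFIO's ImgOpener for a cell-backed image, so planes are paged in on access instead of being read eagerly:

```java
import io.scif.config.SCIFIOConfig;
import io.scif.img.ImgOpener;
import io.scif.img.SCIFIOImgPlus;

public class LazyOpenSketch {
    public static void main(String[] args) {
        // Request a CellImg-backed image: pixel data is loaded cell-by-cell
        // on demand rather than into one big in-memory array.
        SCIFIOConfig config = new SCIFIOConfig()
                .imgOpenerSetImgModes(SCIFIOConfig.ImgMode.CELL);
        SCIFIOImgPlus<?> img = new ImgOpener()
                .openImgs("/path/to/stack.tif", config).get(0); // hypothetical path
        System.out.println("dims: " + img.numDimensions());
    }
}
```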
@wolny did you have the image open as a virtual stack, or as a normal image window? It might be an issue with converting between the legacy ImageJ1 structures and your Dataset input.
I'd assume that the work done by @maarzt in https://github.com/imagej/imagej-legacy/pull/182 could help to improve this.
@imagejan thanks for the quick response. Yes, the TIFF was opened as a virtual stack with Bio-Formats.
So if I understand you correctly: the current Fiji version internally converts from the legacy ImagePlus to the new ImgPlus, and that conversion is inefficient... I had a brief look at @maarzt's PRs, and it looks like the VirtualStackAdapter would do the job. So once those changes are in, the problem will most likely disappear, right?
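For what it's worth, once those PRs land, the wrapping could look something like this (a sketch assuming the VirtualStackAdapter API proposed in imglib2-ij; names may change before merge):

```java
import ij.IJ;
import ij.ImagePlus;
import net.imagej.ImgPlus;
import net.imglib2.img.VirtualStackAdapter;

public class WrapSketch {
    public static void main(String[] args) {
        // Wrap the open ImageJ1 virtual stack lazily, without copying pixels
        // into memory, instead of going through the legacy translation layer.
        ImagePlus imp = IJ.getImage();
        ImgPlus<?> img = VirtualStackAdapter.wrap(imp);
        System.out.println("wrapped dims: " + img.numDimensions());
    }
}
```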
@imagejan is there currently a way to work around this issue? I'm asking because even if the user runs the export on a machine with enough RAM, the internal conversion is really slow, and there is no indication to the user that it is happening under the hood (neither in the UI nor in the console logs). I took a thread dump while the conversion was running, and it seems the conversion also runs in a single thread. Here's the thread dump:
"SciJava-2e720162-Thread-1" #32 prio=5 os_prio=0 tid=0x0000000003629000 nid=0x6098 runnable [0x00007fbb6016b000]
java.lang.Thread.State: RUNNABLE
at net.imagej.legacy.translate.GrayPixelHarmonizer.updateDataset(GrayPixelHarmonizer.java:124)
at net.imagej.legacy.translate.GrayDisplayCreator.makeDataset(GrayDisplayCreator.java:120)
at net.imagej.legacy.translate.AbstractDisplayCreator.getDataset(AbstractDisplayCreator.java:73)
at net.imagej.legacy.translate.GrayDisplayCreator.grayCase(GrayDisplayCreator.java:160)
at net.imagej.legacy.translate.GrayDisplayCreator.makeDisplay(GrayDisplayCreator.java:130)
at net.imagej.legacy.translate.AbstractDisplayCreator.createDisplay(AbstractDisplayCreator.java:64)
at net.imagej.legacy.translate.DefaultImageTranslator.createDisplay(DefaultImageTranslator.java:101)
at net.imagej.legacy.translate.DefaultImageTranslator.createDisplay(DefaultImageTranslator.java:85)
at net.imagej.legacy.LegacyImageMap.registerLegacyImage(LegacyImageMap.java:273)
at net.imagej.legacy.display.LegacyImageDisplayService.getActiveImageDisplay(LegacyImageDisplayService.java:119)
at net.imagej.legacy.display.LegacyImageDisplayService.getActiveDataset(LegacyImageDisplayService.java:130)
at net.imagej.display.process.ActiveDatasetPreprocessor.getValue(ActiveDatasetPreprocessor.java:72)
at net.imagej.display.process.ActiveDatasetPreprocessor.getValue(ActiveDatasetPreprocessor.java:56)
at net.imagej.display.process.SingleInputPreprocessor.process(SingleInputPreprocessor.java:85)
at org.scijava.module.ModuleRunner.preProcess(ModuleRunner.java:105)
at org.scijava.module.ModuleRunner.run(ModuleRunner.java:157)
at org.scijava.module.ModuleRunner.call(ModuleRunner.java:127)
at org.scijava.module.ModuleRunner.call(ModuleRunner.java:66)
at org.scijava.thread.DefaultThreadService$3.call(DefaultThreadService.java:238)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Sorry, I fear I won't be of much help here. Is it required to have the image open as a virtual stack? Or could you simply get the file location (either as a @Parameter File, or by asking the VirtualStack for its backing file(s)) and then use DatasetService or IOService to get the ImgPlus from disk, to avoid the legacy environment? It might require explicitly using ImgLib2's Volatile* structures, however. I really don't know. Others might be able to advise; maybe you can open a discussion on the forum?
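A minimal sketch of that idea (untested; it assumes SCIFIO's DatasetIOService, and the menu path and class name are hypothetical): the command takes the file directly, so the legacy ImagePlus-to-Dataset conversion is never triggered.

```java
import java.io.File;
import java.io.IOException;

import io.scif.services.DatasetIOService;
import net.imagej.Dataset;
import org.scijava.command.Command;
import org.scijava.plugin.Parameter;
import org.scijava.plugin.Plugin;

@Plugin(type = Command.class, menuPath = "Plugins>ilastik>Export HDF5 (streaming)")
public class StreamingExportSketch implements Command {

    // Ask for the file path instead of the active (legacy) image,
    // so no ImagePlus -> Dataset conversion happens during pre-processing.
    @Parameter
    private File input;

    @Parameter
    private DatasetIOService datasetIOService;

    @Override
    public void run() {
        try {
            Dataset dataset = datasetIOService.open(input.getAbsolutePath());
            // ... export 'dataset' to HDF5 slice by slice ...
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```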
Please also see this related forum discussion: Wrap ImagePlus VirtualStack into imglib2
We are actively working on such issues at the currently ongoing SciView hackathon. I am reviewing the relevant ImageJ Legacy PRs today and tomorrow, and hope to get them merged and released before next Thursday. @wolny It would be awesome if you could test your workflow again with the new versions. Could you post here the exact steps you took that result in the issue? We can try to reproduce here as well.
As a workaround for the moment, you could try opening the data using File > Import > Image... which uses SCIFIO, and hence opens as ImageJ2 data structures directly. I am not sure whether this will avoid the problem, but worth a try.
@imagejan @ctrueden thanks for the useful tips. It turns out that @ctrueden's suggestion, i.e. loading the TIFF via File > Import > Image..., works as expected (no in-memory conversion happens).
Just for completeness: previously I was loading the 20 GB TIFF file (tzyxc: 1x548x4635x3913x1) with Plugins > Bio-Formats > Bio-Formats Importer, with "View stack with: Data Browser" and "Memory Management: Use virtual stack". Then Plugins > ilastik > Export HDF5 triggered the in-memory conversion of the whole stack via net.imagej.legacy.display.LegacyImageDisplayService, crashing my Fiji app.
Very glad to hear the SCIFIO approach works for you, @wolny.
Sorry that we didn't get the ImageJ Legacy improvements merged before the end of the hackathon. @tpietzsch and @maarzt are working on it—see imglib/imglib2-ij#12, which I anticipate will be merged soon. Then we just need imagej/imagej-legacy#182, which I will do my best to get reviewed + merged ASAP after June 11.
In the meantime, brave souls could try building these PRs from source (it's as easy as running mvn from the command line while on the correct branch) and giving them a spin.
Thanks for letting us know, @ctrueden. We'll keep monitoring those PRs and ideally build them ourselves and give them a go, in order to check interoperability with Bio-Formats and our plugin. On another note, @ASHISRAVINDRAN discovered some issues with the SCIFIO approach while exporting 'big' TIFF files via our plugin. This is related to our discussion but outside the scope of this particular PR, so we'll double-check his findings and discuss them in a separate PR/issue.
Hi Adrian, as discussed, here is the pull request for your reference. I am still checking whether the contents of the image are correctly converted to ARGB and written as HDF5.