Closed snehashis-roy closed 1 year ago
bioformats2raw/raw2ometiff are routinely used with floating point data, so I'd be a little surprised if there is an fundamental issue with that data type.
./raw2ometiff --compression uncompressed --debug --max-workers=12 -p $TMPDIR $OUTPUT
I would not expect this command to run at all. --max-workers=12
is not a valid option, but --max_workers=12
(the same option name as bioformats2raw) should work instead. Is this the exact command that produced the debug log?
I am using 0.6.0 and 0.4.0
bioformats2raw 0.6.1 and raw2ometiff 0.4.1 are the latest versions; I don't know of a specific change that might solve this problem, but it may be worth upgrading to see if that helps.
You are right, it was indeed a typo in my script, --max-workers should have been --max_workers. However, --max-workers still works for raw2ometiff 0.4.0 and 0.4.1.
Here is my test 4D 32bit image that is not being converted by raw2ometiff (0.4.1 & 0.4.0), bioformats2raw can successfully convert it though. The image opens fine on Fiji.
https://hpc.nih.gov/~roys5/Fos_heatmap_51623_B.tif
I used the following commands,
./bioformats2raw-0.6.1/bin/bioformats2raw -c null -h 1024 -w 1024 --max_workers 12 --no-nested -r 2 --target-min-size=512 -p "Fos_heatmap_51623_B.tif" ./tmp.suFH2ojHgm
./raw2ometiff-0.4.1/bin/raw2ometiff --compression uncompressed --max_workers=12 -p ./tmp.suFH2ojHgm Fos_heatmap_51623_B.ome.tiff
The last step of raw2ometiff hangs, but a potentially corrupt OMETIFF file is generated. tiffinfo on the corrupt image shows this,
TIFFFetchDirectory: Sanity check on directory count failed, zero tag directories not supported.
TIFFReadDirectory: Failed to read directory at offset 16.
Thanks for the additional details, @snehashis-roy.
Unfortunately, I so far haven't been able to reproduce this issue using Fos_heatmap_51623_B.tif
; both bioformats2raw and raw2ometiff as described in https://github.com/glencoesoftware/raw2ometiff/issues/104#issuecomment-1561434066 complete without error. The runtime for each is under a minute, and the resulting OME-TIFF is valid.
If you can capture a jstack when you see raw2ometiff get stuck, that may help us to understand what's going wrong. To do that, start the raw2ometiff command as before. In a separate terminal, run:
$ jps
...
8734 PyramidFromDirectoryWriter
That prints a list of running Java processes; look for the line with PyramidFromDirectoryWriter
as above and note the number (the PID). When you see raw2ometiff
get stuck, run in a separate terminal:
$ jstack 8734 > raw2ometiff-jstack.txt
replacing 8734
with the PID you noted previously. Please attach the raw2ometiff-jstack.txt
file to this issue.
A few other things to consider:
Fos_heatmap_51623_B.ome.tiff
) being written to a local disk? If not, does writing to a local disk work?raw2ometiff
writes all of the pixel data first, then has a final step to go back and write all of the metadata. When I do the conversion, this last step takes 30 seconds or so (admittedly not on a fast system). Approximately how much longer than that are you waiting after the last line of debug output is written, before raw2ometiff
is killed?This is the jstack $PID output of stuck raw2ometiff
2023-05-24 18:21:54
Full thread dump OpenJDK 64-Bit Server VM (25.152-b12 mixed mode):
"Attach Listener" #27 daemon prio=9 os_prio=0 tid=0x00002aab74001000 nid=0x11984 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Java2D Disposer" #22 daemon prio=10 os_prio=0 tid=0x00002aaab1182800 nid=0x11843 in Object.wait() [0x00002aab733b3000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000005d268d0b8> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
- locked <0x00000005d268d0b8> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
at sun.java2d.Disposer.run(Disposer.java:148)
at java.lang.Thread.run(Thread.java:745)
"ProgressBar" #21 daemon prio=5 os_prio=0 tid=0x00002aaab108c000 nid=0x11842 waiting on condition [0x00002aab5f81e000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000080287dd0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1081)
at java.util.concurrent.ScheduledThreadPoolExecautor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
"Service Thread" #17 daemon prio=9 os_prio=0 tid=0x00002aaab0101000 nid=0x1183d runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread11" #16 daemon prio=9 os_prio=0 tid=0x00002aaab00fd800 nid=0x1183c waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread10" #15 daemon prio=9 os_prio=0 tid=0x00002aaab00fb800 nid=0x1183b waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread9" #14 daemon prio=9 os_prio=0 tid=0x00002aaab00f9800 nid=0x1183a waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread8" #13 daemon prio=9 os_prio=0 tid=0x00002aaab00f7000 nid=0x11839 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread7" #12 daemon prio=9 os_prio=0 tid=0x00002aaab00f5000 nid=0x11838 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread6" #11 daemon prio=9 os_prio=0 tid=0x00002aaab00f2800 nid=0x11837 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread5" #10 daemon prio=9 os_prio=0 tid=0x00002aaab00f0800 nid=0x11836 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread4" #9 daemon prio=9 os_prio=0 tid=0x00002aaab00e6000 nid=0x11835 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread3" #8 daemon prio=9 os_prio=0 tid=0x00002aaab00e4000 nid=0x11834 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread2" #7 daemon prio=9 os_prio=0 tid=0x00002aaab00e2000 nid=0x11833 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00002aaab00e0000 nid=0x11832 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00002aaab00dd000 nid=0x11831 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00002aaab00db800 nid=0x11830 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00002aaab00a8000 nid=0x1182f in Object.wait() [0x00002aab5cd0f000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x0000000080214420> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
- locked <0x0000000080214420> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00002aaab00a3800 nid=0x1182e in Object.wait() [0x00002aab5cc0e000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x000000008021c3d8> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:502)
at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
- locked <0x000000008021c3d8> (a java.lang.ref.Reference$Lock)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
"main" #1 prio=5 os_prio=0 tid=0x00002aaab000c800 nid=0x11818 runnable [0x00002aaaaabd3000]
java.lang.Thread.State: RUNNABLE
at java.io.RandomAccessFile.setLength(Native Method)
at loci.common.NIOFileHandle.setLength(NIOFileHandle.java:229)
at loci.common.AbstractNIOHandle.validateLength(AbstractNIOHandle.java:87)
at loci.common.NIOFileHandle.writeSetup(NIOFileHandle.java:659)
at loci.common.NIOFileHandle.writeShort(NIOFileHandle.java:605)
at loci.common.RandomAccessOutputStream.writeShort(RandomAccessOutputStream.java:333)
at loci.formats.tiff.TiffSaver.writeIFDValue(TiffSaver.java:580)
at loci.formats.tiff.TiffSaver.writeIFD(TiffSaver.java:535)
at com.glencoesoftware.pyramid.PyramidFromDirectoryWriter.writeIFD(PyramidFromDirectoryWriter.java:1226)
at com.glencoesoftware.pyramid.PyramidFromDirectoryWriter.writeIFDs(PyramidFromDirectoryWriter.java:988)
at com.glencoesoftware.pyramid.PyramidFromDirectoryWriter.convertToPyramid(PyramidFromDirectoryWriter.java:822)
at com.glencoesoftware.pyramid.PyramidFromDirectoryWriter.call(PyramidFromDirectoryWriter.java:238)
at com.glencoesoftware.pyramid.PyramidFromDirectoryWriter.call(PyramidFromDirectoryWriter.java:99)
at picocli.CommandLine.executeUserObject(CommandLine.java:1783)
at picocli.CommandLine.access$900(CommandLine.java:145)
at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2150)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2144)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2108)
at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:1968)
at picocli.CommandLine.parseWithHandlers(CommandLine.java:2349)
at picocli.CommandLine.parseWithHandler(CommandLine.java:2284)
at picocli.CommandLine.call(CommandLine.java:2560)
at com.glencoesoftware.pyramid.PyramidFromDirectoryWriter.main(PyramidFromDirectoryWriter.java:213)
"VM Thread" os_prio=0 tid=0x00002aaab009b800 nid=0x1182d runnable
"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00002aaab0021800 nid=0x11819 runnable
"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00002aaab0023000 nid=0x1181a runnable
"GC task thread#2 (ParallelGC)" os_prio=0 tid=0x00002aaab0025000 nid=0x1181b runnable
"GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00002aaab0027000 nid=0x1181c runnable
"GC task thread#4 (ParallelGC)" os_prio=0 tid=0x00002aaab0028800 nid=0x1181d runnable
"GC task thread#5 (ParallelGC)" os_prio=0 tid=0x00002aaab002a800 nid=0x1181e runnable
"GC task thread#6 (ParallelGC)" os_prio=0 tid=0x00002aaab002c800 nid=0x1181f runnable
"GC task thread#7 (ParallelGC)" os_prio=0 tid=0x00002aaab002e000 nid=0x11820 runnable
"GC task thread#8 (ParallelGC)" os_prio=0 tid=0x00002aaab0030000 nid=0x11821 runnable
"GC task thread#9 (ParallelGC)" os_prio=0 tid=0x00002aaab0032000 nid=0x11822 runnable
"GC task thread#10 (ParallelGC)" os_prio=0 tid=0x00002aaab0033800 nid=0x11823 runnable
"GC task thread#11 (ParallelGC)" os_prio=0 tid=0x00002aaab0035800 nid=0x11824 runnable
"GC task thread#12 (ParallelGC)" os_prio=0 tid=0x00002aaab0037000 nid=0x11825 runnable
"GC task thread#13 (ParallelGC)" os_prio=0 tid=0x00002aaab0039000 nid=0x11826 runnable
"GC task thread#14 (ParallelGC)" os_prio=0 tid=0x00002aaab003b000 nid=0x11827 runnable
"GC task thread#15 (ParallelGC)" os_prio=0 tid=0x00002aaab003c800 nid=0x11828 runnable
"GC task thread#16 (ParallelGC)" os_prio=0 tid=0x00002aaab003e800 nid=0x11829 runnable
"GC task thread#17 (ParallelGC)" os_prio=0 tid=0x00002aaab0040800 nid=0x1182a runnable
"GC task thread#18 (ParallelGC)" os_prio=0 tid=0x00002aaab0042000 nid=0x1182b runnable
"GC task thread#19 (ParallelGC)" os_prio=0 tid=0x00002aaab0044000 nid=0x1182c runnable
"VM Periodic Task Thread" os_prio=0 tid=0x00002aaab0103800 nid=0x1183e waiting on condition
JNI global references: 463
My java -version gives this,
java -version
openjdk version "1.8.0_152-release"
OpenJDK Runtime Environment (build 1.8.0_152-release-1056-b12)
OpenJDK 64-Bit Server VM (build 25.152-b12, mixed mode)
Hello. This slowdown was indeed caused by network storage. On other large 5D images (e.g. 100x100x5x5x10000 in XYCZT), I also observed that the last step is exponentially faster on a local spinning SATA HDDs (~30sec) compared to a network storage (>2hours). The network SAS storage in on an HPC, connected by 10Gbps Mellanox card, where I usually get 9.2Gbps reading/writing files. So the bandwidth is probably not the issue. I am wondering why the last step has such a wide speed speed discrepancy based on storage location. Thank you.
Thanks for the update, @snehashis-roy. Ultimately I think this is down to repeatedly updating the file length as new data is written; we know this can be expensive on network storage, and the final metadata writing step in raw2ometiff doesn't do a great job of minimizing the number of writes to disk. https://github.com/ome/bioformats/issues/3983 documents similar issues.
https://github.com/ome/ome-common-java/issues/78 is likely the long-term solution we'll pursue here.
Hello. Is it possible to convert 32-bit floating point 3D tif images to pyramid OMETIFF for the purpose of using it in OMERO? It seems that bioformats2raw can convert to zarr in a folder, but raw2ometiff gets stuck.
This is my script for a 3D float32 image (size 528x456x960 (HxWxD))
This is the end of debug file where raw2ometiff gets stuck,
I am happy to share the image and/or the debug output if it helps. I am using 0.6.0 and 0.4.0