glencoesoftware / bioformats2raw

Bio-Formats image file format to raw format converter
GNU General Public License v2.0
76 stars 34 forks source link

conversion fails midway through big h5 file #187

Closed will-moore closed 1 year ago

will-moore commented 1 year ago

I'm trying to convert https://idr.openmicroscopy.org/webclient/?show=image-9846152 (requested at https://forum.image.sc/t/converting-other-idr-images-into-public-zarr/44025/13) to OME-NGFF.

Working on pilot-zarr1-dev I have downloaded the files to /data/idr0048 with

rsync -av  -e "ssh -A idr-pilot.openmicroscopy.org ssh" "idrtesting-omeroreadwrite:/uod/idr/filesets/idr0048-abdeladim-chroms/20191119-ftp/" -P .
(base) [wmoore@pilot-zarr1-dev idr0048]$ ls -alh
total 48G
...
-rw-rw----. 1 wmoore idrnfs    48G Nov 19  2019 WB.h5
-rw-rw----. 1 wmoore idrnfs   2.3K Nov 19  2019 WB.xml
-rw-rw-r--. 1 wmoore wmoore   8.2K Jan 12 16:45 .WB.xml.bfmemo

Checked that the downloaded files match what's on IDR:

$ cksum
3758472695 51049420472 WB.h5
742400829 2334 WB.xml

Running in a terminal screen, I see:

$ bioformats2raw --version
Version = 0.5.0
Bio-Formats version = 6.10.1
NGFF specification version = 0.4

$ bioformats2raw WB.xml 9846152_logging.zarr --debug > /home/wmoore/bf2raw.log
OpenJDK 64-Bit Server VM warning: You have loaded library /tmp/opencv_openpnp2269157708896801230/nu/pattern/opencv/linux/x86_64/libopencv_java342.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
Killed

The 3 times I have tried this, I get about the same number of files generated:

$ find 9846152.zarr/ -type f | wc -l
18204
$ find 9846152_a.zarr/ -type f | wc -l
18207
$ find 9846152_logging.zarr/ -type f | wc -l
18199
$ cat 9846152.zarr/.zattrs ``` { "bioformats2raw.layout" : 3 }(base) [wmoore@pilot-zarr1-dev idr0048]$ cat 9846152.zarr/0/.zattrs { "multiscales" : [ { "metadata" : { "method" : "loci.common.image.SimpleImageScaler", "version" : "Bio-Formats 6.10.1" }, "axes" : [ { "name" : "t", "type" : "time" }, { "name" : "c", "type" : "channel" }, { "unit" : "pixel", "name" : "z", "type" : "space" }, { "unit" : "pixel", "name" : "y", "type" : "space" }, { "unit" : "pixel", "name" : "x", "type" : "space" } ], "name" : "P_t00000, W_s00_0", "datasets" : [ { "path" : "0", "coordinateTransformations" : [ { "scale" : [ 1.0, 1.0, 1.0, 1.0, 1.0 ], "type" : "scale" } ] }, { "path" : "1", "coordinateTransformations" : [ { "scale" : [ 1.0, 1.0, 1.0, 2.0, 2.0 ], "type" : "scale" } ] }, { "path" : "2", "coordinateTransformations" : [ { "scale" : [ 1.0, 1.0, 1.0, 4.0, 4.0 ], "type" : "scale" } ] }, { "path" : "3", "coordinateTransformations" : [ { "scale" : [ 1.0, 1.0, 1.0, 8.0, 8.0 ], "type" : "scale" } ] }, { "path" : "4", "coordinateTransformations" : [ { "scale" : [ 1.0, 1.0, 1.0, 16.0, 16.0 ], "type" : "scale" } ] }, { "path" : "5", "coordinateTransformations" : [ { "scale" : [ 1.0, 1.0, 1.0, 32.0, 32.0 ], "type" : "scale" } ] }, { "path" : "6", "coordinateTransformations" : [ { "scale" : [ 1.0, 1.0, 1.0, 64.0, 64.0 ], "type" : "scale" } ] }, { "path" : "7", "coordinateTransformations" : [ { "scale" : [ 1.0, 1.0, 1.0, 128.0, 128.0 ], "type" : "scale" } ] } ], "version" : "0.4" } ], "omero" : { "channels" : [ { "color" : "FF0000", "coefficient" : 1, "active" : true, "label" : "Channel 0", "window" : { "min" : 0.0, "max" : 65535.0, "start" : 0.0, "end" : 65535.0 }, "family" : "linear", "inverted" : false }, { "color" : "00FF00", "coefficient" : 1, "active" : true, "label" : "Channel 1", "window" : { "min" : 0.0, "max" : 65535.0, "start" : 0.0, "end" : 65535.0 }, "family" : "linear", "inverted" : false }, { "color" : "0000FF", "coefficient" : 1, "active" : true, "label" : "Channel 2", "window" : { "min" : 0.0, "max" : 65535.0, "start" : 0.0, "end" : 65535.0 }, "family" : "linear", "inverted" : false } ], "rdefs" : { "defaultT" : 0, "model" : "color", "defaultZ" : 45 } } } ```

but only the highest resolution directory exists (and looks incomplete):

$ ls 9846152.zarr/0/
0

The log file looks like this (middle repeated section not included):

bf2raw.log ``` 2023-01-17 13:16:21,753 [main] DEBUG loci.formats.ClassList - Could not find loci.formats.in.URLReader java.lang.ClassNotFoundException: loci.formats.in.URLReader at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:264) at loci.formats.ClassList.parseLine(ClassList.java:196) at loci.formats.ClassList.parseFile(ClassList.java:258) at loci.formats.ClassList.(ClassList.java:138) at loci.formats.ClassList.(ClassList.java:122) at loci.formats.ImageReader.getDefaultReaderClasses(ImageReader.java:80) at com.glencoesoftware.bioformats2raw.Converter.getBaseReaderClass(Converter.java:2105) at com.glencoesoftware.bioformats2raw.Converter.convert(Converter.java:538) at com.glencoesoftware.bioformats2raw.Converter.call(Converter.java:514) at com.glencoesoftware.bioformats2raw.Converter.call(Converter.java:109) at picocli.CommandLine.executeUserObject(CommandLine.java:1953) at picocli.CommandLine.access$1300(CommandLine.java:145) at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352) at picocli.CommandLine$RunLast.handle(CommandLine.java:2346) at picocli.CommandLine$RunLast.handle(CommandLine.java:2311) at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:2172) at picocli.CommandLine.parseWithHandlers(CommandLine.java:2550) at picocli.CommandLine.parseWithHandler(CommandLine.java:2485) at picocli.CommandLine.call(CommandLine.java:2761) at com.glencoesoftware.bioformats2raw.Converter.main(Converter.java:2274) 2023-01-17 13:16:21,760 [main] DEBUG loci.formats.ClassList - Could not find loci.formats.in.SlideBook6Reader java.lang.ClassNotFoundException: loci.formats.in.SlideBook6Reader at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:264) at loci.formats.ClassList.parseLine(ClassList.java:196) at loci.formats.ClassList.parseFile(ClassList.java:258) at loci.formats.ClassList.(ClassList.java:138) at loci.formats.ClassList.(ClassList.java:122) at loci.formats.ImageReader.getDefaultReaderClasses(ImageReader.java:80) at com.glencoesoftware.bioformats2raw.Converter.getBaseReaderClass(Converter.java:2105) at com.glencoesoftware.bioformats2raw.Converter.convert(Converter.java:538) at com.glencoesoftware.bioformats2raw.Converter.call(Converter.java:514) at com.glencoesoftware.bioformats2raw.Converter.call(Converter.java:109) at picocli.CommandLine.executeUserObject(CommandLine.java:1953) at picocli.CommandLine.access$1300(CommandLine.java:145) at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352) at picocli.CommandLine$RunLast.handle(CommandLine.java:2346) at picocli.CommandLine$RunLast.handle(CommandLine.java:2311) at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:2172) at picocli.CommandLine.parseWithHandlers(CommandLine.java:2550) at picocli.CommandLine.parseWithHandler(CommandLine.java:2485) at picocli.CommandLine.call(CommandLine.java:2761) at com.glencoesoftware.bioformats2raw.Converter.main(Converter.java:2274) 2023-01-17 13:16:21,761 [main] DEBUG loci.formats.ClassList - Could not find loci.formats.in.ScreenReader java.lang.ClassNotFoundException: loci.formats.in.ScreenReader at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:264) at loci.formats.ClassList.parseLine(ClassList.java:196) at loci.formats.ClassList.parseFile(ClassList.java:258) at loci.formats.ClassList.(ClassList.java:138) at loci.formats.ClassList.(ClassList.java:122) at loci.formats.ImageReader.getDefaultReaderClasses(ImageReader.java:80) at com.glencoesoftware.bioformats2raw.Converter.getBaseReaderClass(Converter.java:2105) at com.glencoesoftware.bioformats2raw.Converter.convert(Converter.java:538) at com.glencoesoftware.bioformats2raw.Converter.call(Converter.java:514) at com.glencoesoftware.bioformats2raw.Converter.call(Converter.java:109) at picocli.CommandLine.executeUserObject(CommandLine.java:1953) at picocli.CommandLine.access$1300(CommandLine.java:145) at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352) at picocli.CommandLine$RunLast.handle(CommandLine.java:2346) at picocli.CommandLine$RunLast.handle(CommandLine.java:2311) at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:2172) at picocli.CommandLine.parseWithHandlers(CommandLine.java:2550) at picocli.CommandLine.parseWithHandler(CommandLine.java:2485) at picocli.CommandLine.call(CommandLine.java:2761) at com.glencoesoftware.bioformats2raw.Converter.main(Converter.java:2274) 2023-01-17 13:16:21,762 [main] DEBUG loci.formats.ClassList - Could not find loci.formats.in.ZarrReader java.lang.ClassNotFoundException: loci.formats.in.ZarrReader at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:264) at loci.formats.ClassList.parseLine(ClassList.java:196) at loci.formats.ClassList.parseFile(ClassList.java:258) at loci.formats.ClassList.(ClassList.java:138) at loci.formats.ClassList.(ClassList.java:122) at loci.formats.ImageReader.getDefaultReaderClasses(ImageReader.java:80) at com.glencoesoftware.bioformats2raw.Converter.getBaseReaderClass(Converter.java:2105) at com.glencoesoftware.bioformats2raw.Converter.convert(Converter.java:538) at com.glencoesoftware.bioformats2raw.Converter.call(Converter.java:514) at com.glencoesoftware.bioformats2raw.Converter.call(Converter.java:109) at picocli.CommandLine.executeUserObject(CommandLine.java:1953) at picocli.CommandLine.access$1300(CommandLine.java:145) at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352) at picocli.CommandLine$RunLast.handle(CommandLine.java:2346) at picocli.CommandLine$RunLast.handle(CommandLine.java:2311) at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:2172) at picocli.CommandLine.parseWithHandlers(CommandLine.java:2550) at picocli.CommandLine.parseWithHandler(CommandLine.java:2485) at picocli.CommandLine.call(CommandLine.java:2761) at com.glencoesoftware.bioformats2raw.Converter.main(Converter.java:2274) 2023-01-17 13:16:21,979 [main] DEBUG c.g.bioformats2raw.Converter - Added extra reader: class com.glencoesoftware.bioformats2raw.PyramidTiffReader 2023-01-17 13:16:21,979 [main] DEBUG c.g.bioformats2raw.Converter - Added extra reader: class com.glencoesoftware.bioformats2raw.MiraxReader 2023-01-17 13:16:21,979 [main] DEBUG c.g.bioformats2raw.Converter - Added extra reader: class com.glencoesoftware.bioformats2raw.BioTekReader 2023-01-17 13:16:21,979 [main] DEBUG c.g.bioformats2raw.Converter - Added extra reader: class com.glencoesoftware.bioformats2raw.ND2PlateReader 2023-01-17 13:16:22,023 [main] DEBUG o.s.nativelib.NativeLibraryUtil - processor is INTEL_64 os.arch is amd64 2023-01-17 13:16:22,023 [main] DEBUG o.s.nativelib.NativeLibraryUtil - architecture is LINUX_64 os.name is linux 2023-01-17 13:16:22,025 [main] DEBUG o.s.nativelib.NativeLibraryUtil - architecture is LINUX_64 os.name is linux 2023-01-17 13:16:22,025 [main] DEBUG o.s.nativelib.NativeLibraryUtil - platform specific path is META-INF/lib/linux_64/ 2023-01-17 13:16:22,025 [main] DEBUG o.scijava.nativelib.BaseJniExtractor - mappedLib is libturbojpeg.so 2023-01-17 13:16:22,025 [main] DEBUG o.scijava.nativelib.BaseJniExtractor - URL is jar:file:/home/wmoore/miniconda3/share/bioformats2raw-0.5.0-0/lib/turbojpeg-6.10.1.jar!/META-INF/lib/linux_64/libturbojpeg.so 2023-01-17 13:16:22,025 [main] DEBUG o.scijava.nativelib.BaseJniExtractor - URL path is file:/home/wmoore/miniconda3/share/bioformats2raw-0.5.0-0/lib/turbojpeg-6.10.1.jar!/META-INF/lib/linux_64/libturbojpeg.so 2023-01-17 13:16:22,026 [main] DEBUG o.scijava.nativelib.BaseJniExtractor - Extracting 'jar:file:/home/wmoore/miniconda3/share/bioformats2raw-0.5.0-0/lib/turbojpeg-6.10.1.jar!/META-INF/lib/linux_64/libturbojpeg.so' to '/tmp/libturbojpeg5255053282570603788.so' 2023-01-17 13:16:22,038 [main] DEBUG loci.common.NIOByteBufferProvider - Using mapped byte buffer? false 2023-01-17 13:16:22,052 [main] INFO loci.formats.ImageReader - BDVReader initializing WB.xml 2023-01-17 13:16:22,053 [main] DEBUG loci.formats.FormatHandler - BDVReader initializing WB.xml 2023-01-17 13:16:22,053 [main] DEBUG loci.formats.FormatHandler - loci.formats.in.BDVReader.initFile(WB.xml) 2023-01-17 13:16:22,075 [main] DEBUG loci.common.services.ServiceFactory - Loaded properties from: services.properties 2023-01-17 13:16:22,076 [main] DEBUG loci.common.services.ServiceFactory - Added interface interface loci.formats.services.POIService and implementation class loci.formats.services.POIServiceImpl 2023-01-17 13:16:22,077 [main] DEBUG loci.common.services.ServiceFactory - Added interface interface loci.formats.services.MDBService and implementation class loci.formats.services.MDBServiceImpl 2023-01-17 13:16:22,077 [main] DEBUG loci.common.services.ServiceFactory - Added interface interface loci.formats.services.JPEGTurboService and implementation class loci.formats.services.JPEGTurboServiceImpl 2023-01-17 13:16:22,077 [main] DEBUG loci.common.services.ServiceFactory - Added interface interface ome.codecs.services.LuraWaveService and implementation class ome.codecs.services.LuraWaveServiceImpl 2023-01-17 13:16:22,078 [main] DEBUG loci.common.services.ServiceFactory - Added interface interface loci.common.services.S3ClientService and implementation class loci.common.services.S3ClientServiceImpl 2023-01-17 13:16:22,078 [main] DEBUG loci.common.services.ServiceFactory - Added interface interface loci.formats.services.JAIIIOService and implementation class loci.formats.services.JAIIIOServiceImpl 2023-01-17 13:16:22,079 [main] DEBUG loci.common.services.ServiceFactory - Added interface interface loci.formats.services.WlzService and implementation class loci.formats.services.WlzServiceImpl 2023-01-17 13:16:22,080 [main] DEBUG loci.common.services.ServiceFactory - Added interface interface loci.formats.services.JHDFService and implementation class loci.formats.services.JHDFServiceImpl 2023-01-17 13:16:22,080 [main] DEBUG loci.common.services.ServiceFactory - Added interface interface loci.formats.services.NetCDFService and implementation class loci.formats.services.NetCDFServiceImpl 2023-01-17 13:16:22,080 [main] DEBUG loci.common.services.ServiceFactory - Added interface interface loci.formats.services.EXIFService and implementation class loci.formats.services.EXIFServiceImpl 2023-01-17 13:16:22,081 [main] DEBUG loci.common.services.ServiceFactory - Added interface interface loci.formats.services.MetakitService and implementation class loci.formats.services.MetakitServiceImpl 2023-01-17 13:16:22,081 [main] DEBUG loci.common.services.ServiceFactory - Added interface interface loci.formats.services.LuraWaveService and implementation class loci.formats.services.LuraWaveServiceImpl 2023-01-17 13:16:22,082 [main] DEBUG loci.common.services.ServiceFactory - Added interface interface loci.formats.services.OMEXMLService and implementation class loci.formats.services.OMEXMLServiceImpl 2023-01-17 13:16:22,083 [main] DEBUG loci.common.services.ServiceFactory - Added interface interface ome.codecs.services.JAIIIOService and implementation class ome.codecs.services.JAIIIOServiceImpl 2023-01-17 13:16:22,083 [main] DEBUG loci.common.services.ServiceFactory - Added interface interface loci.formats.services.JPEGXRService and implementation class loci.formats.services.JPEGXRServiceImpl 2023-01-17 13:16:22,244 [main] INFO loci.formats.FormatHandler - Timepoint :t00000 2023-01-17 13:16:22,244 [main] INFO loci.formats.FormatHandler - Setup :t00000/s00 2023-01-17 13:16:22,244 [main] INFO loci.formats.FormatHandler - Site :t00000/s00/0 2023-01-17 13:16:22,245 [main] INFO loci.formats.FormatHandler - Site :t00000/s00/1 2023-01-17 13:16:22,245 [main] INFO loci.formats.FormatHandler - Site :t00000/s00/2 2023-01-17 13:16:22,245 [main] INFO loci.formats.FormatHandler - Site :t00000/s00/3 2023-01-17 13:16:22,245 [main] INFO loci.formats.FormatHandler - Site :t00000/s00/4 2023-01-17 13:16:22,245 [main] INFO loci.formats.FormatHandler - Site :t00000/s00/5 2023-01-17 13:16:22,245 [main] INFO loci.formats.FormatHandler - Site :t00000/s00/6 2023-01-17 13:16:22,245 [main] INFO loci.formats.FormatHandler - Site :t00000/s00/7 2023-01-17 13:16:22,245 [main] INFO loci.formats.FormatHandler - Setup :t00000/s01 2023-01-17 13:16:22,246 [main] INFO loci.formats.FormatHandler - Site :t00000/s01/0 2023-01-17 13:16:22,246 [main] INFO loci.formats.FormatHandler - Site :t00000/s01/1 2023-01-17 13:16:22,246 [main] INFO loci.formats.FormatHandler - Site :t00000/s01/2 2023-01-17 13:16:22,246 [main] INFO loci.formats.FormatHandler - Site :t00000/s01/2 2023-01-17 13:16:22,246 [main] INFO loci.formats.FormatHandler - Site :t00000/s01/3 2023-01-17 13:16:22,246 [main] INFO loci.formats.FormatHandler - Site :t00000/s01/4 2023-01-17 13:16:22,246 [main] INFO loci.formats.FormatHandler - Site :t00000/s01/5 2023-01-17 13:16:22,246 [main] INFO loci.formats.FormatHandler - Site :t00000/s01/6 2023-01-17 13:16:22,246 [main] INFO loci.formats.FormatHandler - Site :t00000/s01/7 2023-01-17 13:16:22,246 [main] INFO loci.formats.FormatHandler - Setup :t00000/s02 2023-01-17 13:16:22,247 [main] INFO loci.formats.FormatHandler - Site :t00000/s02/0 2023-01-17 13:16:22,247 [main] INFO loci.formats.FormatHandler - Site :t00000/s02/1 2023-01-17 13:16:22,247 [main] INFO loci.formats.FormatHandler - Site :t00000/s02/2 2023-01-17 13:16:22,247 [main] INFO loci.formats.FormatHandler - Site :t00000/s02/3 2023-01-17 13:16:22,247 [main] INFO loci.formats.FormatHandler - Site :t00000/s02/4 2023-01-17 13:16:22,247 [main] INFO loci.formats.FormatHandler - Site :t00000/s02/5 2023-01-17 13:16:22,247 [main] INFO loci.formats.FormatHandler - Site :t00000/s02/6 2023-01-17 13:16:22,247 [main] INFO loci.formats.FormatHandler - Site :t00000/s02/7 2023-01-17 13:16:23,221 [main] DEBUG loci.formats.FormatHandler - t00000/s00/0/cells/ 2023-01-17 13:16:23,495 [main] DEBUG loci.formats.FormatHandler - t00000/s00/1/cells/ 2023-01-17 13:16:23,551 [main] DEBUG loci.formats.FormatHandler - t00000/s00/2/cells/ 2023-01-17 13:16:23,567 [main] DEBUG loci.formats.FormatHandler - t00000/s00/3/cells/ 2023-01-17 13:16:23,572 [main] DEBUG loci.formats.FormatHandler - t00000/s00/4/cells/ 2023-01-17 13:16:23,574 [main] DEBUG loci.formats.FormatHandler - t00000/s00/5/cells/ 2023-01-17 13:16:23,574 [main] DEBUG loci.formats.FormatHandler - t00000/s00/6/cells/ 2023-01-17 13:16:23,575 [main] DEBUG loci.formats.FormatHandler - t00000/s00/7/cells/ 2023-01-17 13:34:01,840 [main] DEBUG loci.common.Location - Location.mapFile: embedded-stream.raw -> null 2023-01-17 13:34:01,842 [main] DEBUG loci.common.Location - Location.mapFile: embedded-stream.raw -> null 2023-01-17 13:34:02,047 [main] DEBUG loci.formats.Memoizer - loaded memo file: /data/idr0048/.WB.xml.bfmemo (8316 bytes) 2023-01-17 13:34:02,048 [main] DEBUG loci.formats.Memoizer - start[1673962441924] time[123] tag[loci.formats.Memoizer.loadMemo] 2023-01-17 13:34:02,048 [main] DEBUG loci.formats.Memoizer - start[1673962441897] time[151] tag[loci.formats.Memoizer.setId] 2023-01-17 13:34:02,057 [main] DEBUG loci.formats.Memoizer - loaded memo file: /data/idr0048/.WB.xml.bfmemo (8316 bytes) 2023-01-17 13:34:02,057 [main] DEBUG loci.formats.Memoizer - start[1673962442050] time[7] tag[loci.formats.Memoizer.loadMemo] 2023-01-17 13:34:02,058 [main] DEBUG loci.formats.Memoizer - start[1673962442049] time[8] tag[loci.formats.Memoizer.setId] 2023-01-17 13:34:02,067 [main] DEBUG loci.formats.Memoizer - loaded memo file: /data/idr0048/.WB.xml.bfmemo (8316 bytes) 2023-01-17 13:34:02,067 [main] DEBUG loci.formats.Memoizer - start[1673962442058] time[8] tag[loci.formats.Memoizer.loadMemo] 2023-01-17 13:34:02,068 [main] DEBUG loci.formats.Memoizer - start[1673962442058] time[9] tag[loci.formats.Memoizer.setId] 2023-01-17 13:34:02,075 [main] DEBUG loci.formats.Memoizer - loaded memo file: /data/idr0048/.WB.xml.bfmemo (8316 bytes) 2023-01-17 13:34:02,075 [main] DEBUG loci.formats.Memoizer - start[1673962442068] time[7] tag[loci.formats.Memoizer.loadMemo] 2023-01-17 13:34:02,076 [main] DEBUG loci.formats.Memoizer - start[1673962442068] time[7] tag[loci.formats.Memoizer.setId] 2023-01-17 13:34:02,510 [main] INFO c.g.bioformats2raw.Converter - Using 8 pyramid resolutions 2023-01-17 13:34:02,510 [main] INFO c.g.bioformats2raw.Converter - Preparing to write pyramid sizeX 19120 (tileWidth: 1024) sizeY 13350 (tileWidth: 1024) sizeZ 91 (tileDepth: 1) imageCount 273 2023-01-17 13:34:02,510 [main] DEBUG c.g.bioformats2raw.Converter - setSeriesLevelMetadata(0, 8) 2023-01-17 13:34:02,511 [main] DEBUG c.g.bioformats2raw.Converter - seriesString = 0 2023-01-17 13:34:02,511 [main] DEBUG c.g.bioformats2raw.Converter - resolutionString = 0/0 2023-01-17 13:34:02,511 [main] DEBUG c.g.bioformats2raw.Converter - creating subgroup 9846152_logging.zarr/0 2023-01-17 13:34:02,532 [main] DEBUG c.g.bioformats2raw.Converter - finished writing subgroup attributes 2023-01-17 13:34:02,620 [pool-1-thread-2] INFO c.g.bioformats2raw.Converter - requesting tile to write at [0, 1, 0, 0, 0] to 0/0 2023-01-17 13:34:02,620 [pool-1-thread-3] INFO c.g.bioformats2raw.Converter - requesting tile to write at [0, 2, 0, 0, 0] to 0/0 2023-01-17 13:34:02,620 [pool-1-thread-1] INFO c.g.bioformats2raw.Converter - requesting tile to write at [0, 0, 0, 0, 0] to 0/0 2023-01-17 13:34:02,620 [pool-1-thread-4] INFO c.g.bioformats2raw.Converter - requesting tile to write at [0, 0, 1, 0, 0] to 0/0 2023-01-17 13:34:03,364 [pool-1-thread-4] INFO c.g.bioformats2raw.Converter - chunk read complete 1/72618 2023-01-17 13:34:03,364 [pool-1-thread-4] INFO org.perf4j.TimingLogger - start[1673962442621] time[743] tag[getChunk] 2023-01-17 13:34:03,619 [pool-1-thread-3] INFO c.g.bioformats2raw.Converter - chunk read complete 2/72618 2023-01-17 13:34:03,619 [pool-1-thread-3] INFO org.perf4j.TimingLogger - start[1673962442621] time[998] tag[getChunk] 2023-01-17 13:34:03,625 [pool-1-thread-4] DEBUG c.g.bioformats2raw.Converter - start[1673962443367] time[257] tag[writeBytes] 2023-01-17 13:34:03,625 [pool-1-thread-4] INFO c.g.bioformats2raw.Converter - Successfully processed chunk; resolution=0 plane=1 xx=0 yy=0 zz=1 width=1024 height=1024 depth=1 2023-01-17 13:34:03,627 [pool-1-thread-4] INFO c.g.bioformats2raw.Converter - requesting tile to write at [0, 1, 1, 0, 0] to 0/0 2023-01-17 13:34:03,643 [pool-1-thread-3] DEBUG c.g.bioformats2raw.Converter - start[1673962443620] time[22] tag[writeBytes] 2023-01-17 13:34:03,643 [pool-1-thread-3] INFO c.g.bioformats2raw.Converter - Successfully processed chunk; resolution=0 plane=182 xx=0 yy=0 zz=0 width=1024 height=1024 depth=1 2023-01-17 13:34:03,644 [pool-1-thread-3] INFO c.g.bioformats2raw.Converter - requesting tile to write at [0, 2, 1, 0, 0] to 0/0 2023-01-17 13:34:03,964 [pool-1-thread-1] INFO c.g.bioformats2raw.Converter - chunk read complete 3/72618 2023-01-17 13:34:03,964 [pool-1-thread-1] INFO org.perf4j.TimingLogger - start[1673962442621] time[1343] tag[getChunk] 2023-01-17 13:34:03,971 [pool-1-thread-2] INFO c.g.bioformats2raw.Converter - chunk read complete 4/72618 2023-01-17 13:34:03,971 [pool-1-thread-2] INFO org.perf4j.TimingLogger - start[1673962442620] time[1350] tag[getChunk] 2023-01-17 13:34:03,981 [pool-1-thread-1] DEBUG c.g.bioformats2raw.Converter - start[1673962443965] time[16] tag[writeBytes] 2023-01-17 13:34:03,981 [pool-1-thread-1] INFO c.g.bioformats2raw.Converter - Successfully processed chunk; resolution=0 plane=0 xx=0 yy=0 zz=0 width=1024 height=1024 depth=1 2023-01-17 13:34:03,983 [pool-1-thread-1] INFO c.g.bioformats2raw.Converter - requesting tile to write at [0, 0, 2, 0, 0] to 0/0 2023-01-17 13:34:03,994 [pool-1-thread-2] DEBUG c.g.bioformats2raw.Converter - start[1673962443972] time[21] tag[writeBytes] 2023-01-17 13:34:03,994 [pool-1-thread-2] INFO c.g.bioformats2raw.Converter - Successfully processed chunk; resolution=0 plane=91 xx=0 yy=0 zz=0 width=1024 height=1024 depth=1 2023-01-17 13:34:03,995 [pool-1-thread-2] INFO c.g.bioformats2raw.Converter - requesting tile to write at [0, 1, 2, 0, 0] to 0/0 2023-01-17 13:34:04,265 [pool-1-thread-3] INFO c.g.bioformats2raw.Converter - chunk read complete 5/72618 ... 2023-01-17 15:20:53,620 [pool-1-thread-2] INFO org.perf4j.TimingLogger - start[1673968850253] time[3366] tag[getChunk] 2023-01-17 15:20:53,641 [pool-1-thread-2] DEBUG c.g.bioformats2raw.Converter - start[1673968853620] time[20] tag[writeBytes] 2023-01-17 15:20:53,641 [pool-1-thread-2] INFO c.g.bioformats2raw.Converter - Successfully processed chunk; resolution=0 plane=147 xx=9216 yy=3072 zz=56 width=1024 height=1024 depth=1 2023-01-17 15:20:53,641 [pool-1-thread-2] INFO c.g.bioformats2raw.Converter - requesting tile to write at [0, 0, 58, 3072, 9216] to 0/0 2023-01-17 15:20:54,994 [pool-1-thread-4] INFO c.g.bioformats2raw.Converter - chunk read complete 18190/72618 2023-01-17 15:20:54,994 [pool-1-thread-4] INFO org.perf4j.TimingLogger - start[1673968850798] time[4196] tag[getChunk] 2023-01-17 15:20:55,017 [pool-1-thread-4] DEBUG c.g.bioformats2raw.Converter - start[1673968854995] time[21] tag[writeBytes] 2023-01-17 15:20:55,017 [pool-1-thread-4] INFO c.g.bioformats2raw.Converter - Successfully processed chunk; resolution=0 plane=238 xx=9216 yy=3072 zz=56 width=1024 height=1024 depth=1 2023-01-17 15:20:55,017 [pool-1-thread-4] INFO c.g.bioformats2raw.Converter - requesting tile to write at [0, 1, 58, 3072, 9216] to 0/0 2023-01-17 15:20:55,899 [pool-1-thread-1] INFO c.g.bioformats2raw.Converter - chunk read complete 18191/72618 2023-01-17 15:20:55,899 [pool-1-thread-1] INFO org.perf4j.TimingLogger - start[1673968852741] time[3157] tag[getChunk] 2023-01-17 15:20:55,935 [pool-1-thread-1] DEBUG c.g.bioformats2raw.Converter - start[1673968855900] time[35] tag[writeBytes] 2023-01-17 15:20:55,935 [pool-1-thread-1] INFO c.g.bioformats2raw.Converter - Successfully processed chunk; resolution=0 plane=239 xx=9216 yy=3072 zz=57 width=1024 height=1024 depth=1 2023-01-17 15:20:55,935 [pool-1-thread-1] INFO c.g.bioformats2raw.Converter - requesting tile to write at [0, 2, 58, 3072, 9216] to 0/0 2023-01-17 15:23:44,928 [pool-1-thread-2] INFO c.g.bioformats2raw.Converter - chunk read complete 18192/72618 2023-01-17 15:23:44,981 [pool-1-thread-2] INFO org.perf4j.TimingLogger - start[1673968853641] time[171339] tag[getChunk] ```
will-moore commented 1 year ago

Using --max_workers=1 doesn't seem to make a difference:

$ bioformats2raw WB.xml 9846152_mw1.zarr --max_workers=1 --debug > bf2raw_mw1.log
OpenJDK 64-Bit Server VM warning: You have loaded library /tmp/opencv_openpnp1435327528824061833/nu/pattern/opencv/linux/x86_64/libopencv_java342.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
Killed

$ find 9846152_mw1.zarr/ -type f | wc -l
18211
bf2raw_mw1.log ``` 2023-01-18 11:06:04,036 [main] DEBUG loci.formats.ClassList - Could not find loci.formats.in.URLReader java.lang.ClassNotFoundException: loci.formats.in.URLReader at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:264) at loci.formats.ClassList.parseLine(ClassList.java:196) at loci.formats.ClassList.parseFile(ClassList.java:258) at loci.formats.ClassList.(ClassList.java:138) at loci.formats.ClassList.(ClassList.java:122) at loci.formats.ImageReader.getDefaultReaderClasses(ImageReader.java:80) at com.glencoesoftware.bioformats2raw.Converter.getBaseReaderClass(Converter.java:2105) at com.glencoesoftware.bioformats2raw.Converter.convert(Converter.java:538) at com.glencoesoftware.bioformats2raw.Converter.call(Converter.java:514) at com.glencoesoftware.bioformats2raw.Converter.call(Converter.java:109) at picocli.CommandLine.executeUserObject(CommandLine.java:1953) at picocli.CommandLine.access$1300(CommandLine.java:145) at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352) at picocli.CommandLine$RunLast.handle(CommandLine.java:2346) at picocli.CommandLine$RunLast.handle(CommandLine.java:2311) at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:2172) at picocli.CommandLine.parseWithHandlers(CommandLine.java:2550) at picocli.CommandLine.parseWithHandler(CommandLine.java:2485) at picocli.CommandLine.call(CommandLine.java:2761) at com.glencoesoftware.bioformats2raw.Converter.main(Converter.java:2274) 2023-01-18 11:06:04,056 [main] DEBUG loci.formats.ClassList - Could not find loci.formats.in.SlideBook6Reader java.lang.ClassNotFoundException: loci.formats.in.SlideBook6Reader at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:264) at loci.formats.ClassList.parseLine(ClassList.java:196) at loci.formats.ClassList.parseFile(ClassList.java:258) at loci.formats.ClassList.(ClassList.java:138) at loci.formats.ClassList.(ClassList.java:122) at loci.formats.ImageReader.getDefaultReaderClasses(ImageReader.java:80) at com.glencoesoftware.bioformats2raw.Converter.getBaseReaderClass(Converter.java:2105) at com.glencoesoftware.bioformats2raw.Converter.convert(Converter.java:538) at com.glencoesoftware.bioformats2raw.Converter.call(Converter.java:514) at com.glencoesoftware.bioformats2raw.Converter.call(Converter.java:109) at picocli.CommandLine.executeUserObject(CommandLine.java:1953) at picocli.CommandLine.access$1300(CommandLine.java:145) at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352) at picocli.CommandLine$RunLast.handle(CommandLine.java:2346) at picocli.CommandLine$RunLast.handle(CommandLine.java:2311) at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:2172) at picocli.CommandLine.parseWithHandlers(CommandLine.java:2550) at picocli.CommandLine.parseWithHandler(CommandLine.java:2485) at picocli.CommandLine.call(CommandLine.java:2761) at com.glencoesoftware.bioformats2raw.Converter.main(Converter.java:2274) 2023-01-18 11:06:04,058 [main] DEBUG loci.formats.ClassList - Could not find loci.formats.in.ScreenReader java.lang.ClassNotFoundException: loci.formats.in.ScreenReader at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:264) at loci.formats.ClassList.parseLine(ClassList.java:196) at loci.formats.ClassList.parseFile(ClassList.java:258) at loci.formats.ClassList.(ClassList.java:138) at loci.formats.ClassList.(ClassList.java:122) at loci.formats.ImageReader.getDefaultReaderClasses(ImageReader.java:80) at com.glencoesoftware.bioformats2raw.Converter.getBaseReaderClass(Converter.java:2105) at com.glencoesoftware.bioformats2raw.Converter.convert(Converter.java:538) at com.glencoesoftware.bioformats2raw.Converter.call(Converter.java:514) at com.glencoesoftware.bioformats2raw.Converter.call(Converter.java:109) at picocli.CommandLine.executeUserObject(CommandLine.java:1953) at picocli.CommandLine.access$1300(CommandLine.java:145) at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352) at picocli.CommandLine$RunLast.handle(CommandLine.java:2346) at picocli.CommandLine$RunLast.handle(CommandLine.java:2311) at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:2172) at picocli.CommandLine.parseWithHandlers(CommandLine.java:2550) at picocli.CommandLine.parseWithHandler(CommandLine.java:2485) at picocli.CommandLine.call(CommandLine.java:2761) at com.glencoesoftware.bioformats2raw.Converter.main(Converter.java:2274) 2023-01-18 11:06:04,058 [main] DEBUG loci.formats.ClassList - Could not find loci.formats.in.ZarrReader ... 2023-01-18 12:56:47,519 [pool-1-thread-1] INFO org.perf4j.TimingLogger - start[1674046606753] time[766] tag[getChunk] 2023-01-18 12:56:47,640 [pool-1-thread-1] DEBUG c.g.bioformats2raw.Converter - start[1674046607520] time[120] tag[writeBytes] 2023-01-18 12:56:47,641 [pool-1-thread-1] INFO c.g.bioformats2raw.Converter - Successfully processed chunk; resolution=0 plane=242 xx=9216 yy=3072 zz=60 width=1024 height=1024 depth=1 2023-01-18 12:56:47,688 [pool-1-thread-1] INFO c.g.bioformats2raw.Converter - requesting tile to write at [0, 0, 61, 3072, 9216] to 0/0 2023-01-18 12:56:48,840 [pool-1-thread-1] INFO c.g.bioformats2raw.Converter - chunk read complete 18202/72618 2023-01-18 12:56:48,840 [pool-1-thread-1] INFO org.perf4j.TimingLogger - start[1674046607688] time[1151] tag[getChunk] 2023-01-18 12:56:48,967 [pool-1-thread-1] DEBUG c.g.bioformats2raw.Converter - start[1674046608840] time[126] tag[writeBytes] 2023-01-18 12:56:48,967 [pool-1-thread-1] INFO c.g.bioformats2raw.Converter - Successfully processed chunk; resolution=0 plane=61 xx=9216 yy=3072 zz=61 width=1024 height=1024 depth=1 2023-01-18 12:56:49,022 [pool-1-thread-1] INFO c.g.bioformats2raw.Converter - requesting tile to write at [0, 1, 61, 3072, 9216] to 0/0 2023-01-18 12:56:50,360 [pool-1-thread-1] INFO c.g.bioformats2raw.Converter - chunk read complete 18203/72618 2023-01-18 12:56:50,360 [pool-1-thread-1] INFO org.perf4j.TimingLogger - start[1674046609022] time[1337] tag[getChunk] 2023-01-18 12:56:50,424 [pool-1-thread-1] DEBUG c.g.bioformats2raw.Converter - start[1674046610361] time[63] tag[writeBytes] 2023-01-18 12:56:50,424 [pool-1-thread-1] INFO c.g.bioformats2raw.Converter - Successfully processed chunk; resolution=0 plane=152 xx=9216 yy=3072 zz=61 width=1024 height=1024 depth=1 2023-01-18 12:56:50,458 [pool-1-thread-1] INFO c.g.bioformats2raw.Converter - requesting tile to write at [0, 2, 61, 3072, 9216] to 0/0 ```
will-moore commented 1 year ago

Seb: "from var/log/messages"

(base) [sbesson@pilot-zarr1-dev ~]$ sudo grep Out /var/log/messages
Jan 17 15:24:41 pilot-zarr1-dev kernel: Out of memory: Kill process 28742 (java) score 976 or sacrifice child
Jan 18 12:57:11 pilot-zarr1-dev kernel: Out of memory: Kill process 29765 (java) score 975 or sacrifice child

Dom: see https://docs.openmicroscopy.org/bio-formats/5.9.1/users/comlinetools/index.html#command-line-environment Trying...

(base) [wmoore@pilot-zarr1-dev idr0048]$ export BF_MAX_MEM=10g
(base) [wmoore@pilot-zarr1-dev idr0048]$ bioformats2raw WB.xml 9846152_mem10g.zarr --debug > bf2raw_mem10g.log
melissalinkert commented 1 year ago

I wouldn't expect setting BF_MAX_MEM to do anything here. That is specific to Bio-Formats command line tools like bfconvert and showinf; bioformats2raw does not recognize it. To change the heap size for bioformats2raw, use something like:

$ JAVA_OPTS="-Xmx10g" bioformats2raw ...

I was able to reproduce this problem after a while with 0.6.0, default memory settings, and 1 worker. For anyone else following along, note this is public data: https://downloads.openmicroscopy.org/images/BDV/idr0048/

I don't see anything obvious to suggest that this is fundamentally a bioformats2raw issue; the image dimensions are well within what we've successfully converted from other original data formats. I suspect that fixing this will require some updates to the BDV reader to reduce memory consumption during tile reading.

sbesson commented 1 year ago

I don't see anything obvious to suggest that this is fundamentally a bioformats2raw issue;

Did you mean a bioformats issue as per your comment about the BDV reader?

will-moore commented 1 year ago

Thanks @melissalinkert - trying JAVA_OPTS="-Xmx10g" bioformats2raw WB.xml 9846152_mem10g.zarr --debug > bf2raw_mem10g.log ...

melissalinkert commented 1 year ago

I don't see anything obvious to suggest that this is fundamentally a bioformats2raw issue;

Did you mean a bioformats issue as per your comment about the BDV reader?

Sorry, what I meant is I think BDV reader is the place to investigate first. That means there may not be anything we can do here in the bioformats2raw repo until any BDV fixes are released and the Bio-Formats version number is ready to be bumped.

will-moore commented 1 year ago

Hmmm - looks like that memory boost had no effect:

(base) [wmoore@pilot-zarr1-dev idr0048]$ JAVA_OPTS="-Xmx10g" bioformats2raw WB.xml 9846152_mem10g.zarr --debug > bf2raw_mem10g.log
OpenJDK 64-Bit Server VM warning: You have loaded library /tmp/opencv_openpnp600874666168285863/nu/pattern/opencv/linux/x86_64/libopencv_java342.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
Killed
(base) [wmoore@pilot-zarr1-dev idr0048]$ ls 9846152_mem10g.zarr/0/
0
(base) [wmoore@pilot-zarr1-dev idr0048]$ find 9846152_mem10g.zarr/ -type f | wc -l
18203
melissalinkert commented 1 year ago

Closing here, since memory usage issues will need to be addressed in BDV reader (and I can't transfer the issue across organizations).

jburel commented 1 year ago

@will-moore could you open an issue on ome/bioformats describing the problem? so it is captured