paulscherrerinstitute / ch.psi.imagej.hdf5

ImageJ HDF5 Plugin

hdf.hdf5lib.exceptions.HDF5Exception: Invalid int size #8

Closed Pescatore23 closed 1 year ago

Pescatore23 commented 1 year ago

Hi,

I convert the reconstructed TIFF stacks to a netCDF4 file containing the image data as a 4D uint16 array, using Python with xarray and numpy. Trying to import it into ImageJ with the HDF5 Vibez plugin, I get a "Length is too large" error (roughly 13 billion). It works for smaller datasets.

However, with the PSI HDF5 loader, both on ra in the "TOMCAT" Fiji and locally, I get the following "Invalid int size" error:

Dec 02, 2022 10:57:04 AM ch.psi.imagej.hdf5.HDF5Reader open
WARNING: Error while opening: /das/home/fische_r/DASCOELY/Data10/disk1/T_3III_scan_04_test.nc
java.lang.Exception: Failed to read scalar dataset: Invalid int size
    at hdf.object.h5.H5ScalarDS.read(H5ScalarDS.java:820)
    at ch.psi.imagej.hdf5.HDF5Reader.open(HDF5Reader.java:221)
    at ch.psi.imagej.hdf5.HDF5Reader.open(HDF5Reader.java:51)
    at ch.psi.imagej.hdf5.HDF5Reader.run(HDF5Reader.java:36)
    at ij.IJ.runUserPlugIn(IJ.java:241)
    at ij.IJ.runPlugIn(IJ.java:204)
    at ij.Executer.runCommand(Executer.java:151)
    at ij.Executer.run(Executer.java:69)
    at java.lang.Thread.run(Thread.java:748)
Caused by: hdf.hdf5lib.exceptions.HDF5Exception: Invalid int size
    at hdf.object.h5.H5Utils.getTotalSelectedSpacePoints(H5Utils.java:118)
    at hdf.object.h5.H5ScalarDS.scalarDatasetCommonIO(H5ScalarDS.java:902)
    at hdf.object.h5.H5ScalarDS.read(H5ScalarDS.java:816)
    ... 8 more

Is this something you have encountered before, and is there a workaround? I will play around with the data format and data size to find the cause and a solution.
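As a rough sanity check (my guess, not confirmed): the "roughly 13 billion" in the error looks like the total element count of the dataset, which would not fit in Java's signed 32-bit int. The shape below is the (867, 849, 2016, 9) sample mentioned further down in this thread.

```python
import math

# Total element count of the (867, 849, 2016, 9) dataset.
n = math.prod((867, 849, 2016, 9))
print(n)              # 13355489952 -> "roughly 13 billion"

# Java's signed 32-bit int tops out at 2**31 - 1 = 2147483647.
print(n > 2**31 - 1)  # True: the count overflows a Java int
```

If this hypothesis holds, the error would appear for any dataset whose total selected element count exceeds 2147483647, regardless of dtype.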

Best, Robert robert.fischer@psi.ch

rcatwood commented 1 year ago

I can import a file created by the following Python script (Python 3.7, Anaconda):

A (200, 200, 200, 200) 4D array, uint16 data type, containing only ones (script below).

It does not import with the default plugin, as it's too big, but it does import with the PSI plugin. Maybe you accidentally created some special data type — can you cast the data to np.uint16 before writing the HDF5 file? Is the PSI plugin expected to read netCDF4 files at all? I don't see any mention of such files in the documentation.

But …

It does not work when memory is restricted in ImageJ, even if 'virtual stack' is ticked. Should I expect this to work with 4D?

It fails with a 'null pointer exception' for a bigger 4D array (300, 300, 300, 300).

However, although our software team outputs reconstructed time-series tomography as 4D HDF5 datasets, I don't really like it :/ -- I'd rather have a series of 3D datasets. (What do you do at TOMCAT?)

import numpy as np
import h5py

aaa = np.ones((200, 200, 200, 200), dtype=np.uint16)
fff = h5py.File("fourdee_med.h5", "w")
fff.create_dataset("/entry/data", data=aaa)
fff.close()


Pescatore23 commented 1 year ago

Many thanks. I will try to create a "normal" .h5 file using h5py instead of xarray, just the way you did it. The data is already explicitly np.uint16. I just like the netCDF4 structure, which is based on HDF5 (I am not an expert), and its usage in xarray and dask. Often .h5 and .nc behave very similarly, but I will simply use what works.

Good to know that the PSI plugin can handle large files and is only limited by memory.

I am just a recent user and am not involved in TOMCAT itself. Having a 4D dataset is a nice asset for downstream processing and analysis.

rcatwood commented 1 year ago

In testing this morning, just because of seeing your bug, I find that it fails on a 'large' (only 60 gigabytes, 300^4 uint16) 4D file with a null pointer exception, though I've used it for very large 3D files (on our visualization workstation with 1.5 TB of RAM). Usually 'virtual stack', then selecting a target region followed by 'duplicate', works more smoothly in Fiji, even if the target region covers most of the data. How big is your 4D array?

I observe that the 200^4 4D file is unexpectedly slow to load; perhaps that has to do with Fiji trying to reorient the whole array depending on the (xyzt) selection. I have not tried permuting that orientation string so far.

My two problems with 4D datasets are: first, 90% of users can't handle them (easily available tools don't support them) — which this plugin could overcome if 4D support is there 😃??

Second, they are often very slow to open, I think depending on some rather low-level details of memory layout, chunking, chunk cache parameters, and the expected access pattern. When trying to view just a single frame or a series of slices from a tomography time series of about 50 tomograms of roughly 2k cubes (a terabyte-scale data file), just opening the file seems to take tens of minutes, I expect due to the large chunk-index tree structure and possibly access-time settings such as the chunk cache parameters. While I would have thought chunking would allow quicker access to a selected part of the data, it seems to be rather more complicated than that.
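To put a number on that chunk-index size, here is some back-of-the-envelope arithmetic with illustrative values (the 64^3 chunk shape is an assumption, not taken from the actual file):

```python
import math

# Hypothetical time series: 50 volumes of 2048^3 uint16,
# stored with (1, 64, 64, 64) chunks.
shape = (50, 2048, 2048, 2048)
chunk = (1, 64, 64, 64)

n_chunks = math.prod(s // c for s, c in zip(shape, chunk))
size_tb = math.prod(shape) * 2 / 1e12  # uint16 = 2 bytes/element

print(n_chunks, round(size_tb, 1))  # 1638400 0.9
```

So a ~0.9 TB file of this layout would hold ~1.6 million chunks, and the library has to walk that chunk index before any slice can be served — which is consistent with slow open times.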


Pescatore23 commented 1 year ago

Yes, I can confirm that it works well for a limited size (100, 100, 100, 10). A larger dataset prompts a similar error (now with an .h5 file from h5py):

Dec 02, 2022 12:48:00 PM ch.psi.imagej.hdf5.HDF5Reader open
INFO: Using manual selection
Dec 02, 2022 12:48:02 PM ch.psi.imagej.hdf5.HDF5Reader open
INFO: Reading dataset: data Dimensions: 4 Type: 16-bit unsigned integer
Dec 02, 2022 12:48:02 PM ch.psi.imagej.hdf5.HDF5Reader open
INFO: 4D Image (HyperVolume)
Dec 02, 2022 12:50:21 PM ch.psi.imagej.hdf5.HDF5Reader open
INFO: Using manual selection
Dec 02, 2022 12:50:24 PM ch.psi.imagej.hdf5.HDF5Reader open
INFO: Reading dataset: data Dimensions: 4 Type: 16-bit unsigned integer
Dec 02, 2022 12:50:24 PM ch.psi.imagej.hdf5.HDF5Reader open
INFO: 4D Image (HyperVolume)
Dec 02, 2022 12:50:24 PM ch.psi.imagej.hdf5.HDF5Reader open
WARNING: Error while opening: /mpc/homes/fische_r/NAS/test2.h5
java.lang.Exception: Failed to read scalar dataset: Invalid int size
    at hdf.object.h5.H5ScalarDS.read(H5ScalarDS.java:820)
    at ch.psi.imagej.hdf5.HDF5Reader.open(HDF5Reader.java:221)
    at ch.psi.imagej.hdf5.HDF5Reader.open(HDF5Reader.java:51)
    at ch.psi.imagej.hdf5.HDF5Reader.run(HDF5Reader.java:36)
    at ij.IJ.runUserPlugIn(IJ.java:237)
    at ij.IJ.runPlugIn(IJ.java:203)
    at ij.Executer.runCommand(Executer.java:152)
    at ij.Executer.run(Executer.java:70)
    at java.lang.Thread.run(Thread.java:750)
Caused by: hdf.hdf5lib.exceptions.HDF5Exception: Invalid int size
    at hdf.object.h5.H5Utils.getTotalSelectedSpacePoints(H5Utils.java:118)
    at hdf.object.h5.H5ScalarDS.scalarDatasetCommonIO(H5ScalarDS.java:902)
    at hdf.object.h5.H5ScalarDS.read(H5ScalarDS.java:816)
    ... 8 more

I am testing with one of my "smaller" samples: (867, 849, 2016, 9), which is about 27 GB; the maximum number of time steps is 50. This is a nice size to handle in Python, even better with appropriate chunking. My idea was to load a hyperstack for preliminary inspection in Fiji, letting me slide through time and one spatial direction — I think this would be very useful for users. But maybe Fiji with .h5/.nc is not the right tool for that, and using macros to create a hyperstack from the original reconstruction output is the better way. No idea how efficient that would be. Thanks for your help and time.
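As an interim workaround for inspection, one could read a single time step with h5py (partial reads only fetch the selected hyperslab, so the full array never has to fit in memory) and hand just that volume to Fiji. A minimal sketch with an illustrative small stand-in file and dataset name:

```python
import os
import tempfile

import h5py
import numpy as np

path = os.path.join(tempfile.gettempdir(), "demo4d.h5")

# Create a small stand-in 4D file (shape and name are illustrative).
with h5py.File(path, "w") as f:
    f.create_dataset("data", data=np.ones((8, 8, 8, 3), dtype=np.uint16))

# Read only the first time step; h5py slices the file on disk,
# so this scales to arrays far larger than RAM.
with h5py.File(path, "r") as f:
    frame = f["data"][..., 0]

print(frame.shape)  # (8, 8, 8)
```

The extracted volume could then be saved as a TIFF stack for Fiji, sidestepping the plugin's size limit entirely.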

Pescatore23 commented 1 year ago

Short comment: importing a small netCDF4 file produced with xarray works just fine. So we have narrowed down the issue :)

rcatwood commented 1 year ago

Another reproducer: a 3D file of (2000, 2000, 2000) float32.

So it's not actually to do with 4D (though 'virtual stack' appears to have no effect with a 4D dataset — perhaps never implemented?), and it's not actually to do with the uint16 data type.
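The shapes reported in this thread are consistent with the failure tracking total element count against Java's signed 32-bit int limit, independent of rank and dtype (my interpretation, not verified against the plugin source):

```python
import math

JAVA_INT_MAX = 2**31 - 1  # 2147483647

# Shapes from this thread: 200^4 opens; 300^4 and 2000^3 both fail.
for shape in [(200, 200, 200, 200), (300, 300, 300, 300), (2000, 2000, 2000)]:
    n = math.prod(shape)
    print(shape, n, "fits" if n <= JAVA_INT_MAX else "overflows")
```

200^4 is 1.6 billion elements (fits); 300^4 is 8.1 billion and 2000^3 is 8.0 billion (both overflow) — matching exactly which files open and which throw the exception.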

Since I've "virtually" always used a virtual stack on 3D datasets of this size [a useful and essential feature of this plugin], I didn't observe this problem before.


rcatwood commented 1 year ago

I can confirm that, after updating everything today to the latest versions (a fresh Fiji installation with ImageJ 2.9.0/1.53v and the PSI plugin 0.13.0 downloaded from GitHub), I get the 'invalid int size' exception (instead of the 'null pointer' I got before updating) when opening a (300, 300, 300, 300) uint16 dataset, but not at the 200^4 size. This is on Red Hat EL Linux64 with plenty of memory.
