seung-lab / igneous

Scalable Neuroglancer compatible Downsampling, Meshing, Skeletonizing, Contrast Normalization, Transfers and more.
GNU General Public License v3.0

Last voxel column gets shifted to the front #153

Closed (Lauenburg closed this issue 1 year ago)

Lauenburg commented 1 year ago

I am trying to recover the original NumPy dataset from a Neuroglancer (NG) precomputed dataset.

For this, I am running through the following steps:

  1. Retrieve an h5 dataset and convert it to NumPy
  2. Convert the NumPy array to NG precomputed using igneous
  3. Access the NG precomputed data using cloud-volume
  4. Assert that the data from steps 1 and 3 are equal

In code:

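import numpy as np
from cloudvolume import CloudVolume

# img_h5 is the already opened h5 file
img_np = np.array(img_h5['main'])
np.save('<source_path>/im_Yx2.npy', img_np)

# Notebook shell command: convert the .npy file to NG precomputed
!igneous image create <source_path>/im_Yx2.npy <target_path> 

img_cv_vol = CloudVolume('file:///<target_path>')
img_cv_subvol = np.squeeze(img_cv_vol[:,:,:])

assert np.array_equal(img_cv_subvol[:,:,:], img_np[:,:,:]) #Fail!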

However, the assert fails. Checking the volumes, it seems that the final column of voxels (128 x 128 x z) of the precomputed dataset gets shifted to the front. Plotting part of the volumes confirms this:

fig, ax = plt.subplots(2, 3, figsize=(10, 5))
for i in range(3):
    ax[0,i].imshow(img_cv_subvol[i,:500,:], cmap='gray')
    ax[1,i].imshow(img_np[i,:500,:], cmap='gray')

image

After applying img_cv_rolled = np.roll(img_cv_subvol, -128, axis=2), img_cv_rolled and img_np look the same, but the assert still fails.

img_cv_rolled = np.roll(img_cv_subvol, -128, axis=2)
fig, ax = plt.subplots(2, 3, figsize=(10, 5))
for i in range(3):
    ax[0,i].imshow(img_cv_rolled[i,:500,:], cmap='gray')
    ax[1,i].imshow(img_np[i,:500,:], cmap='gray')
assert np.array_equal(img_cv_rolled[:,:,:], img_np[:,:,:]) #Fail!

image

Creating a heatmap of the difference shows that the last column is not only shifted to the front but is also altered, as if some filter had been applied:

img_cv_rolled = np.roll(img_cv_subvol, -128, axis=2)
fig, ax = plt.subplots(1, 3, figsize=(10, 5))
for i in range(3):
    img_plt = ax[i].imshow(img_cv_rolled[i,:500,:]-img_np[i,:500,:], cmap='hot', interpolation='nearest')

image
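
One caveat when reading a difference heatmap like the one above: if both volumes are stored as unsigned integers (an assumption here, since the thread does not state the dtype), subtracting them directly wraps negative values around to large positive ones. The sketch below uses a hypothetical helper, signed_difference, and toy arrays to show the effect. It is illustrative only and is not part of igneous or cloud-volume.

```python
import numpy as np

def signed_difference(recovered, original):
    """Subtract in a wider signed dtype so negative differences do not wrap around."""
    return recovered.astype(np.int64) - original.astype(np.int64)

# Toy data standing in for two image slices (assumed uint8).
a = np.array([[10, 200]], dtype=np.uint8)
b = np.array([[12, 200]], dtype=np.uint8)

wrapped = a - b                   # stays uint8, so 10 - 12 wraps around
signed = signed_difference(a, b)  # true differences, including negative ones

print(wrapped)
print(signed)
```

Plotting the signed difference makes it easier to tell a genuine mismatch from a wraparound artifact.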

For the assert to succeed, you have to cut off the first column from the recovered and the last column from the original dataset:

assert np.array_equal(img_cv_subvol[:,:,128:], img_np[:,:,:-128]) #Success!

Could someone tell me what is going wrong here and how I can prevent it?

jakobtroidl commented 1 year ago

@william-silversmith I have the same issue. Do you have any ideas why this could happen? This is how we convert the NumPy array into a precomputed file.

igneous image create <source> <target> --resolution 8,8,30  --chunk-size 50,50,5 --compress none

william-silversmith commented 1 year ago

Ah, I think I know what's going on here. I was trying to be clever and read .npy files with np.memmap so that you can easily work with very large .npy files. However, there are two functions you can use. np.memmap treats the file as a raw array, so the header gets treated as data, which probably causes the weird shifts you are seeing. The other function, np.lib.format.open_memmap, handles .npy files with headers. I'll check whether the file has a header and then use the appropriate function.
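
The difference between the two functions can be reproduced with NumPy alone. The following is a minimal sketch, not the igneous implementation: the helper names (read_raw, read_npy, header_size), the test volume, and the temporary file path are all made up for illustration.

```python
import os
import tempfile

import numpy as np

def read_raw(path, dtype, shape):
    """Map the file as a flat buffer; the .npy header bytes are read as voxels."""
    return np.memmap(path, dtype=dtype, mode="r", shape=shape)

def read_npy(path):
    """Map the file through the .npy reader, which parses and skips the header."""
    return np.lib.format.open_memmap(path, mode="r")

def header_size(path, array):
    """Bytes in the file that are not array data, i.e. the .npy header."""
    return os.path.getsize(path) - array.nbytes

# Hypothetical uint8 test volume written with np.save.
rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(4, 6, 256), dtype=np.uint8)
path = os.path.join(tempfile.mkdtemp(), "volume.npy")
np.save(path, original)

raw = read_raw(path, original.dtype, original.shape)
proper = read_npy(path)
n = header_size(path, original)

print(np.array_equal(proper, original))  # header-aware read matches
print(np.array_equal(raw, original))     # raw read does not match
# The raw read is the original data pushed back by n header bytes,
# with the last n bytes of real data cut off.
print(np.array_equal(raw.ravel()[n:], original.ravel()[:-n]))
```

In this sketch the raw read starts with the header bytes rather than with a copy of the last voxels. That would be consistent with the report above that the leading column looks altered and that only the cropped comparison succeeds.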

william-silversmith commented 1 year ago

Hi, I updated the create function and it should work a bit better now in 4.19.0. There is also support for hdf5 and crackle files. You'll need to manually install h5py. Give it a try and let me know how it goes.

william-silversmith commented 1 year ago

Ack, I boofed it. I'll release a fixed version in a couple hours.

william-silversmith commented 1 year ago

Check out the latest version!

Lauenburg commented 1 year ago

Hi @william-silversmith, Now it works like a charm! Thank you so much for all your work!