seung-lab / cloud-volume

Read and write Neuroglancer datasets programmatically.
https://twitter.com/thundercloudvol
BSD 3-Clause "New" or "Revised" License

mesh.get: vertex positions garbled #450

Closed schlegelp closed 3 years ago

schlegelp commented 3 years ago

Hi Will!

I'm running into a strange error with the mesh download but I'm not sure it's an issue with cloudvolume directly.

I get different results when running the following code on my local machine (OSX) versus on DeepNote (that's a Jupyter notebook server running Linux):

from cloudvolume import CloudVolume

# Download a random mesh from the Janelia hemibrain dataset
vol = CloudVolume('precomputed://gs://neuroglancer-janelia-flyem-hemibrain/v1.1/segmentation', use_https=True)

# mesh.get returns a dict keyed by segment ID
mesh = vol.mesh.get(295478082, lod=3)[295478082]

On my local machine the vertex positions are correct:

>>> mesh.vertices
array([[ 67711.03 , 111167.695, 131136.   ],
       [ 67711.03 , 111039.695, 131136.   ],
       [ 67647.03 , 111231.695, 131136.   ],
       ...,
       [ 66111.01 , 131136.   , 139008.12 ],
       [ 65855.   , 131136.   , 138752.11 ],
       [ 65983.01 , 131136.   , 138752.11 ]], dtype=float32)

On DeepNote they come out as nonsense:

>>> mesh.vertices
array([[6.40000e+01, 6.40000e+01, 6.40000e+01],
       [6.40000e+01, 6.40000e+01, 6.40000e+01],
       [6.40000e+01, 6.40000e+01, 6.40000e+01],
       ...,
       [6.40000e+01, 1.31136e+05, 1.31136e+05],
       [6.40000e+01, 1.31136e+05, 1.31136e+05],
       [6.40000e+01, 1.31136e+05, 1.31136e+05]], dtype=float32)

On both my local machine and DeepNote, I run cloudvolume 3.8.0 and DracoPy 0.0.15. I'm aware that this will be rather tricky to debug, but maybe you have a good idea?

william-silversmith commented 3 years ago

This seems to pertain to the multi-lod logic. @perlman would you have any insights? I'm planning on working more with the multi-lod format soon, but I haven't quite gotten there yet. Thanks!

william-silversmith commented 3 years ago

Also, @schlegelp can you confirm whether or not the defect occurs when you run it outside of the notebook? Thanks!

nkemnitz commented 3 years ago

Works on Windows + Python 3.8.8. Fails on the same machine with WSL (Ubuntu 20 + Python 3.8.5). Also fails on all other Linux workstations I checked.

william-silversmith commented 3 years ago

I might have said that because DracoPy was compiled at many different times, it's possible one of the wheels is screwy, but Nico already tested that hypothesis. We'll need to identify at which line things start screwing up.

https://pypi.org/project/DracoPy/#files

schlegelp commented 3 years ago

Thanks for being so quick to respond!

I can confirm the issue also occurs from the DeepNote interactive console. I also tried all DracoPy versions from 0.0.15 down to 0.0.10 and with Python 3.7, 3.8 and 3.9 (side note: DeepNote is running Debian 10) - all produce the same garbled vertex positions.

Any chance this could be caused by another dependency?
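
When comparing a working and a failing environment like this, it can help to dump the interpreter, platform, and package versions in one place and diff them. The following is only a minimal sketch: the helper name `environment_report` is made up for illustration, and the package list assumes the PyPI distribution names `cloud-volume`, `DracoPy`, and `numpy`.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages=("cloud-volume", "DracoPy", "numpy")):
    """Collect interpreter, OS, and package versions for side-by-side comparison.

    `environment_report` is a hypothetical helper, not part of cloudvolume.
    """
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "machine": platform.machine(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Running this on both machines and diffing the output shows at a glance whether anything besides the operating system differs.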

perlman commented 3 years ago

I won't have time to dig into this until later this week.

While it's not likely, I could see a change in numpy breaking some aspect of my logic for reading the fragments.

Are the resulting objects the right length, just bogus coordinates?

schlegelp commented 3 years ago

> Are the resulting objects the right length, just bogus coordinates?

Yes: vertex array has the correct shape just bogus coordinates.

perlman commented 3 years ago

@schlegelp Would you have a bit of time to poke around cloud-volume? It's helpful that you have both a functional & dysfunctional environment.

First, I'd check to see if frag_binary matches: https://github.com/seung-lab/cloud-volume/blob/master/cloudvolume/datasource/precomputed/mesh/multilod.py#L128

Next, I'd check whether the returned mesh (mesh on the same line) matches. If it does, then it's probably an issue with the next bit of code, which moves the vertices into global space, or with the concatenation.
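
One way to run these two comparisons across machines is to print a short digest of each intermediate value and compare the strings. This is a rough sketch, not something from cloud-volume: `fingerprint` is a hypothetical helper, and it assumes the values of interest are either raw bytes (such as frag_binary) or NumPy arrays (such as mesh.vertices).

```python
import hashlib

import numpy as np


def fingerprint(data):
    """Return a short SHA-256 digest of raw bytes or of a NumPy array.

    Hypothetical helper for comparing intermediate values between machines.
    """
    digest = hashlib.sha256()
    if isinstance(data, np.ndarray):
        # Hash dtype and shape too, so a reinterpreted view of the same
        # bytes does not produce the same fingerprint.
        digest.update(str(data.dtype).encode())
        digest.update(str(data.shape).encode())
        data = np.ascontiguousarray(data).tobytes()
    digest.update(bytes(data))
    return digest.hexdigest()[:16]


# Example with stand-in data; in practice pass frag_binary and mesh.vertices.
print(fingerprint(b"abc"))
print(fingerprint(np.zeros((2, 3), dtype=np.float32)))
```

If the digest of the fragment bytes agrees on both machines but the digest of the decoded vertices does not, the difference was introduced by the decoder rather than by the download.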

schlegelp commented 3 years ago

Sure! The frag_binary values are identical. It looks like there might be an issue with the precision of the data returned by DracoPy?

Running this for the first fragment of seg ID 295478082:

import numpy as np
from cloudvolume.mesh import Mesh

# lod_binary, manifest, lod and frag are the variables of the same name in multilod.py
frag_binary = lod_binary[
    int(np.sum(manifest.fragment_offsets[lod][0:frag])) :
    int(np.sum(manifest.fragment_offsets[lod][0:frag+1]))
]

mesh = Mesh.from_draco(frag_binary)
print("After decoding:\n", mesh.vertices)

# Reinterpret the float32 bit patterns as int32 (no value conversion)
mesh.vertices = mesh.vertices.view(dtype=np.int32)
print("After int32 view:\n", mesh.vertices)

# Convert from "stored model" space to "model" space
mesh.vertices = manifest.grid_origin + manifest.vertex_offsets[lod] + \
    manifest.chunk_shape * (2 ** lod) * \
    (manifest.fragment_positions[lod][:,frag] + \
    (mesh.vertices / (2.0 ** vol.mesh.vertex_quantization_bits - 1)))
print("In model space:\n", mesh.vertices)

Output on my local machine:

After decoding:
 [[4.7396e-41 7.7844e-41 9.1834e-41]
 [4.7396e-41 7.7754e-41 9.1834e-41]
 [4.7351e-41 7.7888e-41 9.1834e-41]
 ...
 [3.0403e-41 8.2552e-41 8.5825e-41]
 [3.0403e-41 8.2328e-41 8.5601e-41]
 [3.0941e-41 8.2731e-41 8.5556e-41]]
After int32 view:
 [[33823 55551 65535]
 [33823 55487 65535]
 [33791 55583 65535]
 ...
 [21696 58911 61247]
 [21696 58751 61087]
 [22080 59039 61055]]
In model space:
 [[4231.93951324 6947.98095674 8196.        ]
 [4231.93951324 6939.98083467 8196.        ]
 [4227.9394522  6951.98101778 8196.        ]
 ...
 [2716.04138247 7367.98736553 7659.99182116]
 [2716.04138247 7347.98706035 7639.99151598]
 [2764.0421149  7383.98760967 7635.99145495]]

On DeepNote:

After decoding:
 [[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 ...
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
After int32 view:
 [[0 0 0]
 [0 0 0]
 [0 0 0]
 ...
 [0 0 0]
 [0 0 0]
 [0 0 0]]
In model space:
 [[4. 4. 4.]
 [4. 4. 4.]
 [4. 4. 4.]
 ...
 [4. 4. 4.]
 [4. 4. 4.]
 [4. 4. 4.]]
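
The "In model space" rows above can be reproduced by hand from the "After int32 view" rows, which also shows why all-zero input collapses to 4. The sketch below is only an illustration: the three constants are not read from the manifest but inferred from the printed numbers (the largest stored value is 65535, a stored 0 maps to 4, and a stored 65535 maps to 8196), and they are assumed to be the same on every axis.

```python
import numpy as np

# Assumed values, inferred from the printed output rather than from the manifest.
QUANTIZATION_BITS = 16    # largest stored value above is 65535 = 2**16 - 1
FRAGMENT_ORIGIN = 4.0     # grid origin + vertex offset + extent * fragment position
FRAGMENT_EXTENT = 8192.0  # chunk_shape * 2**lod along each axis


def to_model_space(stored):
    """Map quantized vertex coordinates to model space (hypothetical helper)."""
    stored = np.asarray(stored, dtype=np.float64)
    max_stored = 2.0 ** QUANTIZATION_BITS - 1
    return FRAGMENT_ORIGIN + FRAGMENT_EXTENT * (stored / max_stored)


# First row of the local "After int32 view" output
print(to_model_space([33823, 55551, 65535]))
# All-zero decoder output, as seen on DeepNote
print(to_model_space([0, 0, 0]))
```

Under these assumptions the first call gives roughly 4231.94, 6947.98 and 8196, matching the local output, and the second gives 4 on every axis, matching the DeepNote output. So the conversion itself behaves the same on both machines, and the zeros must already be present right after decoding.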

william-silversmith commented 3 years ago

@manuel-castro looks like there could be a bug in DracoPy?

schlegelp commented 3 years ago

Just narrowing it further down to DracoPy.

On my local setup:

>>> import DracoPy 
>>> mesh_object = DracoPy.decode_buffer_to_mesh(frag_binary)
>>> mesh_object.points[:10]
[4.739611795885829e-41,
 7.784353099170791e-41,
 9.183409485952689e-41,
 4.739611795885829e-41,
 7.775384788999112e-41,
 9.183409485952689e-41,
 4.735127640799989e-41,
 7.788837254256631e-41,
 9.183409485952689e-41,
 4.73064348571415e-41]

Same code on DeepNote returns

>>> mesh_object.points[:10]
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

manuel-castro commented 3 years ago

@schlegelp Could you please try installing DracoPy version 0.0.18 and trying again? I believe I've fixed the issue. I've uploaded the new source and wheels for linux. Wheels for OS X won't be available for a few hours probably (Travis is slow).

However, interestingly, testing on that fragment you were testing, I get different results from the results you got before.

Here's what I get:

After decoding:
array([[33823., 55487., 65535.],
       [33791., 55583., 65535.],
       [33823., 55551., 65535.],
       ...,
       [30880., 65023., 40511.],
       [30848., 65087., 40287.],
       [30624., 65023., 40895.]], dtype=float32)
After int32 view:
array([[1191452416, 1196998400, 1199570688],
       [1191444224, 1197022976, 1199570688],
       [1191452416, 1197014784, 1199570688],
       ...,
       [1190215680, 1199439616, 1193164544],
       [1190199296, 1199456000, 1193107200],
       [1190084608, 1199439616, 1193262848]], dtype=int32)

In model space:
array([[1.48933829e+08, 1.49627087e+08, 1.49948628e+08],
       [1.48932805e+08, 1.49630159e+08, 1.49948628e+08],
       [1.48933829e+08, 1.49629135e+08, 1.49948628e+08],
       ...,
       [1.48779234e+08, 1.49932244e+08, 1.49147848e+08],
       [1.48777186e+08, 1.49934292e+08, 1.49140680e+08],
       [1.48762850e+08, 1.49932244e+08, 1.49160136e+08]])

I've verified that this is correct by encoding frag_binary to a file and then running Google's draco decoder on it -- this matches the "after decoding" result from DracoPy 0.0.18. Therefore I believe that the Windows/OS X decoded vertices for 0.0.15 were also wrong. I'm not sure why exactly the 0.0.15 values, once viewed as np.int32, match the 0.0.18 values straight after decoding. The 0.0.18 vertices in model space, however, seem too large -- maybe mesh.vertices = mesh.vertices.view(dtype=np.int32) is not necessary? I'm not sure how exactly these meshes were created.
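
The numbers in both sets of output are consistent with the difference between reinterpreting bits and converting values. The NumPy-only sketch below illustrates that arithmetic with the first local row; it says nothing about what DracoPy does internally, only that integers whose raw bits are labeled float32 look like tiny subnormals and come back unchanged through an int32 view, while integers converted to float32 values turn into large numbers under the same view.

```python
import numpy as np

stored = np.array([33823, 55551, 65535], dtype=np.int32)

# Case 1: the integer bits are merely labeled as float32.
# The values print as tiny subnormals (about 4.74e-41 for 33823).
as_float_bits = stored.view(np.float32)
print(as_float_bits)
# Viewing the same bits as int32 again recovers the integers.
print(as_float_bits.view(np.int32))

# Case 2: the integers are converted to float32 values (33823.0, ...).
as_float_values = stored.astype(np.float32)
print(as_float_values)
# An int32 view now exposes the IEEE 754 bit patterns, e.g. 1191452416.
print(as_float_values.view(np.int32))
# A value conversion, not a view, is what recovers the integers here.
print(as_float_values.astype(np.int32))
```

Case 1 corresponds to the 0.0.15 output on OS X, and case 2 to the 0.0.18 output, where the int32 view produces values such as 1191452416 that then blow up in model space.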

manuel-castro commented 3 years ago

There was a separate issue with DracoPy 0.0.18 affecting the integrity of the mesh faces. Now fixed in 0.0.19, so please use that version instead.

schlegelp commented 3 years ago

Hi! Sorry for the delay. I actually have some issues getting cloudvolume to work on Deepnote again but that seems to be a problem with some Google libraries.

However, there still is something awry with DracoPy. See this example using the hemibrain meshes.

>>> import cloudvolume as cv 
>>> vol = cv.CloudVolume('precomputed://gs://neuroglancer-janelia-flyem-hemibrain/v1.2/segmentation', use_https=True)
>>> mesh = vol.mesh.get(204962969, lod=2)[204962969]

With DracoPy==0.0.15

[Image: Screenshot 2021-04-02 at 14 22 54]

With DracoPy==0.0.19

[Image: Screenshot 2021-04-02 at 14 24 24]

This is on my laptop (OSX 10.14.6) with Python 3.7.5 and cloudvolume 3.9.0.

manuel-castro commented 3 years ago

Thanks for the report. I'll look into this new issue.

manuel-castro commented 3 years ago

Hey, sorry for the delay. I've looked into this and I can confirm my results from March 22nd. DracoPy 0.0.19 is performing correctly here. I confirmed this by passing the binary mesh data into Google's official draco decoder, and getting the same result as passing it through DracoPy.

I am not familiar with the multilod code so I am not sure why it worked with DracoPy 0.0.15. I did notice however that if I comment out line 133 mesh.vertices = mesh.vertices.view(dtype=np.int32) from multilod.py I get a much more sensible looking result. I recall something about Jeremy using his own Draco format for Neuroglancer and this line seems to relate to that -- maybe this is the reason for the incongruity, as DracoPy does not use a custom Draco format.

william-silversmith commented 3 years ago

Iirc Jeremy's draco modifications had something to do with integer-only calculations.

manuel-castro commented 3 years ago

@schlegelp When you get a chance, could you please check whether CloudVolume 3.11.0 fixes the issue on your end?

schlegelp commented 3 years ago

Hi @manuel-castro. 3.11.0 fixes the issue and meshes now look as expected on both my local machine and on DeepNote. Thanks for all the detective work!