nipy / mindboggle

Automated anatomical brain label/shape analysis software (+ website)
http://mindboggle.info

Zernike descriptors are HUGE #25

Closed binarybottle closed 10 years ago

binarybottle commented 11 years ago

Email exchange between Brian Rossa, Arthur Mikhno, and Arno Klein (8/5-7/2013):

Brian Rossa (8/5/2013): The Python results agree completely with the Octave results on my side, but my results do not agree with your results. The problem is that the function D_CV_orig expects "faces" to be 1-indexed -- think MATLAB -- rather than 0-indexed. We can add an "assert" to the code to check for this error mode in the future. The size of the descriptor that you are seeing, 6, appears to be correct insofar as Arthur's code also gives a descriptor of size 6. The descriptor for label22.vtk has some very large values. While this may seem dubious, Arthur's code gives extremely high values as well. Here are the "right" answers for your two examples according to Arthur and Octave:

[ 0.03978874 0.07597287 0.08322411 0.20233902 0.11730369 0.14579667]
[ 7.56275148e+03 1.43262240e+08 1.10767079e+06 2.84879089e+10 1.12922387e+08 1.02507341e+10]
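A minimal sketch of the index-base check Brian proposes, assuming the faces arrive as an integer array; the names `faces` and `num_vertices` are illustrative, not code from the Mindboggle repository:

    import numpy as np

    def assert_zero_indexed(faces, num_vertices):
        """Guard against passing 1-indexed (MATLAB-style) faces to 0-indexed code."""
        faces = np.asarray(faces)
        # Valid 0-indexed faces reference vertices 0 .. num_vertices - 1.
        assert faces.min() >= 0, "face indices must be non-negative"
        assert faces.max() < num_vertices, (
            "face indices reach num_vertices -- the mesh looks 1-indexed "
            "(MATLAB-style); subtract 1 from the face array first")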

Arthur Mikhno (8/7/2013): The length of the descriptors increases rapidly with order. I don't recall the formula for calculating the number of descriptors from the order but can look into that. For example, order 20 yields 121 descriptors, while order 35 yields 342.
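For what it is worth, the counts quoted above (6 at order 3, 121 at order 20, 342 at order 35) are consistent with one invariant per (n, l) pair with 0 <= l <= n <= order and n - l even, i.e. floor((order + 2)^2 / 4) descriptors. This is an inference from the quoted numbers, not something stated in the thread:

    def num_zernike_descriptors(order):
        """Count (n, l) pairs with 0 <= l <= n <= order and n - l even."""
        return sum(n // 2 + 1 for n in range(order + 1))  # equals (order + 2)**2 // 4

    # These match the descriptor counts mentioned in this thread:
    assert num_zernike_descriptors(3) == 6
    assert num_zernike_descriptors(20) == 121
    assert num_zernike_descriptors(35) == 342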

Generally speaking, values should be < ~1; values >> 1 are a sign of instability in calculating the descriptors. Instability could be due to the way the mesh is created, or to trying to calculate at an order that is too high given the resolution/size of the object.

By the way, one way to tell whether your descriptor values are reasonable is to reconstruct the object from its Zernike moments and see whether the estimated object resembles the true object. I have reconstruction code ready if you want to give that a try.

Arno: thank you for the explanation, arthur. i like the idea of reconstructing the mesh to see how well the moments estimate it. please do share your code for this so we may give it a try.

the surface patches i am running the code on have hundreds to thousands of vertices, and when i run the code with order 10 i get huge descriptor values. do you think that simplifying the mesh would lead to greater stability for the zernike code? is there any theoretical rationale for choosing a given order; for example, would a given mesh size demand a minimum order?

satra commented 11 years ago

i agree that recon is a good way to look at accuracy. in that regard take a look at:

doi:10.1109/TPAMI.2010.139

that paper describes an approximate form which runs much quicker but should be fairly good on dense meshes like the ones arno has.

binarybottle commented 11 years ago

Arthur Mikhno:

This is the algorithm we actually implemented!

binarybottle commented 11 years ago

yes, we would love to try it out. better still, could you first try it on the descriptors from the label22.vtk file i sent to you today?

here are the descriptors that i get for order 3:

>>> # Moments for label 22 (postcentral) in Twins-2-1:
>>> import os
>>> from mindboggle.shapes.zernike.zernike import zernike_moments_per_label
>>> path = os.environ['MINDBOGGLE_DATA']
>>> vtk_file = os.path.join(path, 'arno', 'labels', 'lh.labels.DKT25.manual.vtk')
>>> area_file = os.path.join(path, 'arno', 'shapes', 'lh.pial.area.vtk')
>>> order = 3
>>> exclude_labels = [0]
>>> largest_segment = True
>>> zernike_moments_per_label(vtk_file, order, exclude_labels, area_file,
...                           largest_segment)
([[7562.751480397972,
   143262239.5171249,
   1107670.7893994227,
   28487908892.820065,
   112922387.17238183,
   10250734140.30357]],
 [22])
satra commented 11 years ago

great. in that case that reference should be added to the docstring.

binarybottle commented 11 years ago

Arthur Mikhno:

Your descriptors don't look right; the values are way too high.

I loaded this label, and it looks like a thin (1-pixel-thick?) surface portion of a piece of cortex. I am not sure my code will work on a thin sheet like this; I have only seen this applied to volumetric meshes. Can you confirm this? Is there a way to fill in the interior?

I would try filling it in, making it thicker, running the code again, and seeing whether the values are more reasonable.

binarybottle commented 11 years ago

do you mean that it has to be a closed mesh (like a beach ball) vs. a sheet, or that it needs to be a solid filled with triangles?

binarybottle commented 11 years ago

Arthur Mikhno: A closed mesh or a solid should give approximately the same results.

binarybottle commented 11 years ago

i start out with a closed mesh (hemisphere) and extract open patches (gyri, sulci). i will have to think about how to close or thicken them. any suggestions?

satra commented 11 years ago

close using white and pial surfaces.

binarybottle commented 11 years ago

do you know of a good way to close a mesh using two surfaces?

satra commented 11 years ago

just connect up the corresponding edge vertices of the patches. so every adjacent pair of edge vertices will have two triangular faces.
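A minimal sketch of the stitching satra describes, assuming the pial and white boundaries are available as equal-length, ordered loops of corresponding vertex indices (the function and argument names here are illustrative, not Mindboggle's):

    def stitch_boundaries(pial_boundary, white_boundary):
        """Bridge two corresponding, ordered boundary loops with a strip of triangles.

        pial_boundary, white_boundary : equal-length lists of vertex indices,
        ordered so that pial_boundary[i] corresponds to white_boundary[i].
        Returns new triangular faces (two per adjacent pair of edge vertices).
        """
        n = len(pial_boundary)
        new_faces = []
        for i in range(n):
            j = (i + 1) % n  # wrap around to close the loop
            p0, p1 = pial_boundary[i], pial_boundary[j]
            w0, w1 = white_boundary[i], white_boundary[j]
            # The quad (p0, p1, w1, w0) between the two loops becomes two triangles:
            new_faces.append([p0, p1, w1])
            new_faces.append([p0, w1, w0])
        return new_faces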

binarybottle commented 11 years ago

I have coded up a function called close_surfaces() to connect the corresponding boundary vertices of pial and gray-white surface patches. The zernike_moments_per_label() function now has the option to close each patch's surface and decimate the resulting mesh. Decimation does speed things up considerably, and the pre- and post-decimation zernike descriptors are similar. Unfortunately, the closed surfaces still have huge values, just as before. One unfortunate aspect of connecting the surface patch boundaries is that the two surfaces are not parallel, so connecting the vertices leads to intersecting faces (see the second screenshot).

(screenshots attached: 2013-08-24 2:20:57 pm and 2:44:49 am)
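For reference, a minimal sketch of triangle-mesh decimation with VTK's vtkDecimatePro; this illustrates the general approach and is not necessarily the exact pipeline inside Mindboggle's decimate():

    import vtk

    def decimate_polydata(polydata, reduction=0.75):
        """Reduce the triangle count of a vtkPolyData mesh by roughly `reduction`."""
        decimator = vtk.vtkDecimatePro()
        decimator.SetInputData(polydata)         # use SetInput(polydata) on VTK 5.x
        decimator.SetTargetReduction(reduction)  # 0.75 removes ~75% of the triangles
        decimator.PreserveTopologyOn()           # avoid tearing holes in the surface
        decimator.Update()
        return decimator.GetOutput()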

binarybottle commented 11 years ago

Arthur Mikhno (8/27/2013): The objects must be scaled to fit within a unit sphere first. I don't recall if the code you have does this or not.

You must normalize (scale and center) to a unit sphere before running ZK. To do this, shift the object to position (0, 0, 0) by subtracting the mean x, y, z values from each node. Then scale the object such that the maximum Euclidean distance between any node and the origin is <= 1.
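A minimal sketch of that normalization, assuming the vertex coordinates are an (N, 3) NumPy array (the names are illustrative):

    import numpy as np

    def normalize_to_unit_sphere(points):
        """Center vertices at the origin and scale them to fit inside the unit sphere."""
        points = np.asarray(points, dtype=float)
        centered = points - points.mean(axis=0)            # centroid moves to (0, 0, 0)
        max_dist = np.linalg.norm(centered, axis=1).max()  # farthest vertex from origin
        return centered / max_dist                         # that distance becomes 1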

binarybottle commented 11 years ago

i have added recentering and rescaling in zernike.py, and now things are looking more reasonable! i'm even getting small values for open surface patches! perhaps i don't need to close them after all? in fact, i'm even getting small values for a vtk file with two separate patches -- does this make sense?

these are the results for each of the example vtk files (all are order=3):

# Example 1: vtk_file = 'cube.vtk'
[0.0,
 1.5444366221695725e-21,
 0.0,
 2.081366518964347e-21,
 5.735003646768394e-05,
 2.433866250546253e-21]

# Example 2: vtk_file = 'test_one_surface.vtk'
[0.01566588675857814,
 0.012346130475603009,
 0.018993428582801408,
 0.022470635336483052,
 0.016625931229056885,
 0.014155674468859897]

# Example 3: vtk_file = 'test_two_surfaces.vtk'
[0.017618016083542545,
 0.014899667748084695,
 0.02071650549121894,
 0.025572906059619344,
 0.017558662051283647,
 0.014535574445816983]
binarybottle commented 11 years ago

Note: outputs are consistent across the zernike.py functions.

binarybottle commented 11 years ago

arthur and brian --

could you please generate order 3 descriptors for the following four files for me to compare against?:

  1. cube.vtk (new results below)
  2. test_one_label.vtk (one open surface patch)
  3. test_two_labels.vtk (two open patches)
  4. label14.vtk (one patch)

4 is the only one that breaks our program -- i get a Segmentation Fault 11 error and i have no idea why.

i can open up the file and view it easily enough. could you please help me to figure out why? can your program compute moments from it? here are the new zernike results for the two simple cube examples (all order=3). i still don't know why the cube.vtk example gives zeros only when not decimating:

>>> # Example 1: simple cube -- NO DECIMATION
>>> points = [[0,0,0], [1,0,0], [0,0,1], [0,1,1], [1,0,1], [0,1,0], [1,1,1], [1,1,0]]
>>> faces = [[0,2,4], [0,1,4], [2,3,4], [3,4,5], [3,5,6], [0,1,7]]
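>>> # (The call that produced the values below is not shown; it was presumably
>>> #  something like zernike_moments(points, faces, order=3) from
>>> #  mindboggle.shapes.zernike.zernike.)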
[0.0918881492369654,
 0.09357431096617608,
 0.04309029164656885,
 0.06466432586854755,
 0.03820155248327533,
 0.04138011726544602]

>>> # Example 2: simple cube (with inner diagonal plane) -- NO DECIMATION:
>>> vtk_file = '/drop/cube.vtk'
[0.0,
 1.5444366221695725e-21,
 0.0,
 2.081366518964347e-21,
 5.735003646768394e-05,
 2.433866250546253e-21]

>>> decimate_fraction = 0.75  # WITH DECIMATION
>>> decimate_smooth = 25
[0.006991734011035332,
 0.013193003020582572,
 0.004870751616753583,
 0.013680131900964697,
 0.0023524006444438673,
 0.0009947980911615825]
binarybottle commented 10 years ago

Brian Rossa:

The backtrace seems to indicate a VTK bug:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff45a0731 in vtkCellLinks::BuildLinks(vtkDataSet*) () from /usr/lib/libvtkFiltering.so.5.8
(gdb) bt
#0  0x00007ffff45a0731 in vtkCellLinks::BuildLinks(vtkDataSet*) () from /usr/lib/libvtkFiltering.so.5.8

... I couldn't get libvtk to play nice with python-gdb so using print statements I manually traced the problem back to the decimate(...) call at shapes/zernike/zernike.py:198.

binarybottle commented 10 years ago

thank you, brian! the segmentation fault does indeed occur during the decimation step, so i have stopped decimating by default. unfortunately this means that zernike computation will be much slower, but possible!

binarybottle commented 10 years ago

Brian confirmed that the original Matlab code from Arthur generates the Zernike descriptors he reported for the Parallelepiped.vtk file ("I have ground truth for that structure from a Zernike development group. You should get the following results for order = 3 (vtk file attached)"):

Descriptors =

0.0251
0.0310
0.0255
0.0451
0.0189
0.0133
binarybottle commented 10 years ago

Brian Rossa (3/4/2014):

Yes, the original MATLAB code does indeed give the result

Descriptors =

0.0251
0.0310
0.0255
0.0451
0.0189
0.0133

The current regression suite does not check against small values of N, so the onus is on me to upgrade it. Due to the amount of time it takes MATLAB to generate the regression data -- 10+ hours for N=20 -- it will not be possible to perform regression for all values of N. Are there specific values of N that the Mindboggle use-case relies on?

Arno: Mindboggle currently defaults to order = 10, and arthur's and my test code is set to order = 3, so you certainly don't need to worry about every use-case!