voxelmorph / voxelmorph

Unsupervised Learning for Image Registration

questions about the overall registration #25

Closed argman closed 5 years ago

argman commented 5 years ago

Hi, I have two questions about the architecture:

  1. What is "scaling and squaring"? Is this a common operation in image registration?
  2. Why do you use a variational encoder in the registration?
balakg commented 5 years ago
  1. Scaling and squaring is just a way to efficiently compute the integration of the stationary velocity field. A more naive solution would be to integrate the field with regular small time steps, but this would involve too many steps and be slow. See "A fast diffeomorphic image registration algorithm" by Ashburner for more details.

  2. We use a variational method to capture the distribution of registration fields for a given pair. Registration has some inherent uncertainty, so this approach lets us model that.

adalca commented 5 years ago

I would just add that we have other alternatives to scaling and squaring implemented -- see integrate_vec in https://github.com/adalca/neuron/blob/master/neuron/utils.py . However, we've found scaling and squaring to be significantly faster while preserving accuracy.
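
For reference, here is a minimal 2D NumPy/SciPy sketch of that scaling-and-squaring idea (my own illustration, not the repo's TensorFlow integrate_vec): scale the velocity field down by 2^N, then compose the resulting displacement field with itself N times.

import numpy as np
from scipy.ndimage import map_coordinates

def integrate_svf(vel, nb_steps=7):
    """Integrate a stationary velocity field by scaling and squaring (2D sketch).

    vel: velocity field of shape (H, W, 2), treated as a displacement in voxels.
    nb_steps: number of squaring steps; the field is first scaled by 1 / 2**nb_steps.
    """
    # scale: for a tiny step, exp(v / 2**N) is approximately identity + v / 2**N
    disp = vel / (2.0 ** nb_steps)
    grid = np.stack(np.meshgrid(np.arange(vel.shape[0]),
                                np.arange(vel.shape[1]), indexing='ij'),
                    axis=-1).astype(np.float64)
    # square: repeatedly compose the map with itself, phi_{2t} = phi_t o phi_t,
    # i.e. new_disp(x) = disp(x) + disp(x + disp(x))
    for _ in range(nb_steps):
        coords = (grid + disp).transpose(2, 0, 1)  # sampling locations, shape (2, H, W)
        warped = np.stack([map_coordinates(disp[..., d], coords, order=1)
                           for d in range(2)], axis=-1)
        disp = disp + warped
    return disp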

argman commented 5 years ago

@balakg

  1. Is the variational model used to model the uncertainty? Why not predict z directly with the neural network?
  2. In Section 2.1, I don't understand why you suddenly introduce L = D - A. What is the point here?
balakg commented 5 years ago
  1. The stationary velocity field is "z" in our particular model. But unlike the non-probabilistic voxelmorph, making the field a stochastic variable allows us to capture its distribution and sample from it.

  2. We use the graph Laplacian (L = D - A) formulation to put a prior on z under which neighboring voxels have similar velocities.
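
To make the L = D - A point concrete, here is a small NumPy sketch (my own illustration, not repo code) of the identity behind this prior: for an unweighted neighborhood graph, the Laplacian quadratic form z^T L z equals the sum of squared differences between neighboring values, so a zero-mean Gaussian prior with precision proportional to L penalizes velocity differences between neighboring voxels.

import numpy as np

# tiny 1D "image" with 4 voxels and chain adjacency 0-1-2-3
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
D = np.diag(A.sum(axis=1))  # degree matrix
L = D - A                   # graph Laplacian

z = np.array([0.3, -0.1, 0.4, 0.2])  # a scalar "velocity" per voxel
quad = z @ L @ z
pairwise = sum((z[i] - z[j]) ** 2
               for i in range(4) for j in range(i + 1, 4) if A[i, j])
assert np.isclose(quad, pairwise)  # z^T L z == sum of squared neighbor differences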

argman commented 5 years ago

@balakg, do you know of any other loss functions that can be used for multi-modal image registration? In voxelmorph, CC is used, but it seems it cannot be used for multi-modal images.

adalca commented 5 years ago

@argman, I see a couple of options:

argman commented 5 years ago

@adalca, the second option may be too restricted by the available data. I think there are a few options for avoiding the metric problem:

  1. Convert the fixed and moving images to some common representation, but I don't know of a representation that would solve this.
  2. Predict the displacement field and use MSE as the loss. This requires a traditional method to compute the target displacement field, so at best the model can only approximate that method (a sketch follows below).

If you have a continuous approximation of MI, I'd be glad to try it if you share it. Will you add it to this repo? It seems to be the only option for multi-modal registration.
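
A hypothetical sketch of option 2, just to make it concrete (the names field_mse_loss, gt_field and pred_field are mine, not voxelmorph API): supervise the predicted field with a displacement field precomputed by a classical registration method, using MSE on the fields as the loss.

import tensorflow as tf

def field_mse_loss(gt_field, pred_field):
    # both fields: (N, H, W, Z, 3) displacement vectors per voxel,
    # with gt_field coming from a traditional registration run on the same pair
    return tf.reduce_mean(tf.square(gt_field - pred_field))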

argman commented 5 years ago

@adalca @balakg Hi, I found an approximate way to compute mutual information in TensorFlow, but I cannot verify its correctness as I don't have enough data to train the model. Can anyone help? The code is here:

import numpy as np
import tensorflow as tf  # TF 1.x API (tf.log, tf.is_finite)


def nmi_gaussian(R, T, win=20, eps=1e-5):
    '''
    Parzen-window approximation of mutual information, returned as a loss (-MI).

    Params:
        R: reference (fixed) image, shape N * H * W * Z * 1
        T: test (moving) image, same shape as R
        win: number of bins used for the joint histogram
        eps: small constant that keeps tf.range from emitting an extra bin center
    '''
    N, H, W, Z, C = R.shape
    assert C == 1, 'image should be only one channel'
    im_size = N.value * H.value * W.value * Z.value

    R_min = tf.reduce_min(R)
    R_max = tf.reduce_max(R)
    T_min = tf.reduce_min(T)
    T_max = tf.reduce_max(T)

    R_bin_size = (R_max - R_min) / win
    T_bin_size = (T_max - T_min) / win

    # bin centers for each image
    R_bin_window = tf.range(R_min + 0.5 * R_bin_size,
                            R_min + 0.5 * R_bin_size + R_bin_size * win - eps,
                            delta=R_bin_size)
    T_bin_window = tf.range(T_min + 0.5 * T_bin_size,
                            T_min + 0.5 * T_bin_size + T_bin_size * win - eps,
                            delta=T_bin_size)

    # all (R bin, T bin) pairs; row-major, so flat index = r_bin * win + t_bin
    R_mesh = tf.tile(tf.reshape(R_bin_window, (-1, 1)), multiples=[1, win])
    T_mesh = tf.tile(tf.reshape(T_bin_window, (1, -1)), multiples=[win, 1])
    R_T_mesh = tf.concat([tf.reshape(R_mesh, (-1, 1)), tf.reshape(T_mesh, (-1, 1))], axis=-1)
    R_T_mesh = R_T_mesh[tf.newaxis, tf.newaxis, tf.newaxis, :, :]

    # Gaussian Parzen window centered at each bin pair; broadcasts to N x H x W x Z x (win * win)
    p_joint = 1 / np.sqrt(2 * np.pi) * tf.exp(
        -0.5 * (tf.square((R - R_T_mesh[..., 0]) / R_bin_size) +
                tf.square((T - R_T_mesh[..., 1]) / T_bin_size)))

    # accumulate over all voxels and normalize into a joint distribution over bin pairs
    p_joint = tf.reduce_sum(p_joint, axis=(0, 1, 2, 3)) / im_size
    p_joint = p_joint / tf.reduce_sum(p_joint)
    p_joint = tf.reshape(p_joint, (win, win))  # rows index R bins, columns index T bins

    p_R = tf.reduce_sum(p_joint, axis=1)  # marginal over T bins -> p(R)
    p_T = tf.reduce_sum(p_joint, axis=0)  # marginal over R bins -> p(T)
    p_prod = p_R[:, tf.newaxis] * p_T[tf.newaxis, :]  # p(R) p(T), same layout as p_joint

    mi = p_joint * tf.log(p_joint / p_prod)
    mi = tf.where(tf.is_finite(mi), mi, tf.zeros_like(mi))
    return -tf.reduce_sum(mi)  # negative MI, suitable as a loss to minimize

The idea is to use a Parzen-window estimate of mutual information, as in MattesMutualInformation in ITK, but with the B-spline window function replaced by a Gaussian window function.
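
As a rough way to test it without training a full model, here is a quick TF 1.x sanity check (my own sketch; shapes are illustrative): with identical volumes the returned loss (-MI) should come out much more negative than with unrelated noise. The small 32^3 volume keeps the N x H x W x Z x (win * win) intermediate tensor manageable.

import numpy as np
import tensorflow as tf

x = np.random.rand(1, 32, 32, 32, 1).astype(np.float32)
y = np.random.rand(1, 32, 32, 32, 1).astype(np.float32)

R = tf.placeholder(tf.float32, shape=x.shape)
T = tf.placeholder(tf.float32, shape=x.shape)
loss = nmi_gaussian(R, T, win=20)

with tf.Session() as sess:
    same = sess.run(loss, {R: x, T: x})   # identical volumes: high MI, strongly negative loss
    diff = sess.run(loss, {R: x, T: y})   # independent noise: low MI, loss near zero
    print(same, diff)
    assert same < diff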

argman commented 5 years ago

Has anyone had a chance to look at this?

adalca commented 5 years ago

Hi @argman, yes, thank you for the code -- we're just in the middle of deadline season (MICCAI is on Tuesday) and most of us are working on that. We'll be happy to get back to you after this!

argman commented 5 years ago

ok, thanks

John1231983 commented 5 years ago

@argman: Any progress on the NMI loss? Thanks.

adalca commented 5 years ago

@John1231983 we've been sharing an experimental one; if you email me I can send it to you as well. I can also work this weekend on adding it to the losses, with the caveat that we haven't tested it as thoroughly.

John1231983 commented 5 years ago

@adalca: Great. Please send it to my email, john1231983@gmail.com. I will test it too.