drethage / fully-convolutional-point-network

Fully-Convolutional Point Networks for Large-Scale Point Clouds
MIT License

Spatial pool weights. Some clarifications needed. #8

Open decadenza opened 3 years ago

decadenza commented 3 years ago

Hi. I'm trying to understand the function defined here: https://github.com/drethage/fully-convolutional-point-network/blob/5942685ea5b1c797fcc28e17be6d9ad48bf0f0e4/fcpn.py#L115 It is not clear to me what it does.

Below I try to go through the algorithm.

def get_spatial_pool_weighting(sphere_radius, top_level_centroid_locations):
        """ Compute a spatial weighting for every cell.
            Args:
            sphere_radius: float, the weight of neighboring cells will be greatest on this sphere's surface
            top_level_centroid_locations: tf.tensor, locations of cells in metric space
            Returns: tf.tensor
        """
        top_level_centroid_locations_repeated = tf.tile(tf.expand_dims(top_level_centroid_locations, axis=1), [1, top_level_centroid_locations.get_shape()[1].value, 1, 1])  # row-wise repeated sample locations

Since top_level_centroid_locations has shape (batch_size, num_points, 3), I expect top_level_centroid_locations_repeated to have shape (batch_size, num_points, num_points, 3).
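
To check that expectation outside the network, here is a minimal sketch that reproduces the expand_dims + tile step with NumPy instead of TensorFlow. The array `locations` and the sizes are made-up stand-ins, not values from the repository, and I am assuming `np.expand_dims` / `np.tile` behave like their `tf` counterparts for this case.

```python
import numpy as np

batch_size, num_points = 2, 4  # hypothetical sizes, chosen only for illustration

# Hypothetical stand-in for top_level_centroid_locations: (batch_size, num_points, 3)
locations = np.random.rand(batch_size, num_points, 3)

# expand_dims inserts a length-1 axis at position 1: (batch_size, 1, num_points, 3)
expanded = np.expand_dims(locations, axis=1)

# tile repeats that new axis num_points times: (batch_size, num_points, num_points, 3)
repeated = np.tile(expanded, [1, num_points, 1, 1])

print(repeated.shape)
```

With this layout, `repeated[b, i, j]` equals `locations[b, j]` for every `i`, i.e. each "row" `i` holds a full copy of all centroid locations.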

        difference = tf.subtract(top_level_centroid_locations_repeated, tf.transpose(top_level_centroid_locations_repeated, [0, 2, 1, 3]))
        # Euclidean distance from every centroid to every other centroid
        distance = tf.norm(difference, axis=3, ord=2, keepdims=True)

The shape of distance should be (batch_size, num_points, num_points, 1). Is that correct?
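
A small NumPy sketch of the subtract / transpose / norm step, again as a stand-in for the TensorFlow ops and with three hand-picked points (my own example data), suggests the shape is as expected and that entry `[b, i, j, 0]` is the Euclidean distance between centroid `i` and centroid `j`.

```python
import numpy as np

# Three hypothetical centroids in one batch element: shape (1, 3, 3)
locations = np.array([[[0.0, 0.0, 0.0],
                       [3.0, 4.0, 0.0],
                       [0.0, 0.0, 1.0]]])
num_points = locations.shape[1]

# Same construction as in the quoted code, written with NumPy
repeated = np.tile(np.expand_dims(locations, axis=1), [1, num_points, 1, 1])

# repeated[b, i, j] is locations[b, j]; the transpose gives locations[b, i]
difference = repeated - np.transpose(repeated, [0, 2, 1, 3])

# L2 norm over the coordinate axis, keeping it as a length-1 axis
distance = np.linalg.norm(difference, ord=2, axis=3, keepdims=True)

print(distance.shape)        # (1, 3, 3, 1)
print(distance[0, 0, 1, 0])  # 5.0, the distance between (0,0,0) and (3,4,0)
```

The resulting matrix is symmetric with zeros on the diagonal, since every centroid has distance 0 to itself.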

        # Clipped distance in [sphere_radius - 1, sphere_radius + 1] range
        clipped_distance = tf.clip_by_value(distance, sphere_radius - 1, sphere_radius + 1)
        # Neighboring voxels weighting based on (cos(3(x-1.5)) + 1) / 2, max weighting on voxels sphere_radius away from a given voxel
        cos_distance_to_sphere_surface = (tf.cos(3 * (clipped_distance - sphere_radius)) + 1) / 2
        # Normalized weighting
        return cos_distance_to_sphere_surface / tf.reduce_sum(cos_distance_to_sphere_surface, axis=2, keepdims=True)
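
To get a feeling for the clip + cosine step on its own, here is a scalar sketch in plain Python. The helper name `weight_profile` is mine, not something from the repository, and `sphere_radius = 1.5` is only taken from the formula in the code comment above.

```python
import math

def weight_profile(distance, sphere_radius):
    """Unnormalized weight for a single distance (hypothetical helper)."""
    # Clip the distance to [sphere_radius - 1, sphere_radius + 1]
    clipped = min(max(distance, sphere_radius - 1), sphere_radius + 1)
    # Raised cosine centered on the sphere surface
    return (math.cos(3 * (clipped - sphere_radius)) + 1) / 2

# Print the unnormalized weight at a few sample distances
for d in (0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 4.0):
    print(d, round(weight_profile(d, 1.5), 4))
```

If I read this correctly, the weight is exactly 1 when the distance equals sphere_radius and drops to a floor of (cos(3) + 1) / 2, roughly 0.005, once the distance is 1 or more away from the sphere surface. Because of the clipping, a cell's distance to itself (0) gets the same small but nonzero weight as any other cell at the clip boundary.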

The output should still be in shape (batch_size, num_points, num_points, 1).
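
For reference, this is how I would port the whole function to NumPy to check the output shape and the normalization. It is only a sketch under the assumption that the NumPy calls match the TensorFlow ones here; `spatial_pool_weighting_np` and the random input are my own names and data.

```python
import numpy as np

def spatial_pool_weighting_np(sphere_radius, locations):
    """NumPy port of the quoted function, intended for shape checking only."""
    num_points = locations.shape[1]
    # (batch_size, num_points, num_points, 3): every row repeats all locations
    repeated = np.tile(np.expand_dims(locations, axis=1), [1, num_points, 1, 1])
    # Pairwise coordinate differences between centroids
    difference = repeated - np.transpose(repeated, [0, 2, 1, 3])
    # Pairwise Euclidean distances: (batch_size, num_points, num_points, 1)
    distance = np.linalg.norm(difference, ord=2, axis=3, keepdims=True)
    # Clip to [sphere_radius - 1, sphere_radius + 1] and apply the raised cosine
    clipped = np.clip(distance, sphere_radius - 1, sphere_radius + 1)
    weights = (np.cos(3 * (clipped - sphere_radius)) + 1) / 2
    # Normalize over axis 2 so the weights for each centroid sum to 1
    return weights / np.sum(weights, axis=2, keepdims=True)

rng = np.random.default_rng(0)
locations = rng.uniform(0.0, 5.0, size=(2, 6, 3))  # hypothetical input
weights = spatial_pool_weighting_np(1.5, locations)

print(weights.shape)                           # (2, 6, 6, 1)
print(np.allclose(weights.sum(axis=2), 1.0))   # True
```

So for each batch element `b` and centroid `i`, `weights[b, i, :, 0]` looks like a distribution over all centroids `j` that favors those roughly sphere_radius away from `i`.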

At this line, https://github.com/drethage/fully-convolutional-point-network/blob/5942685ea5b1c797fcc28e17be6d9ad48bf0f0e4/fcpn.py#L284, the shape of features should be (batch_size, num_points, num_points, 512). Computing features * spatial_pooling_weights should then produce a tensor of shape (batch_size, num_points, num_points, 512) thanks to broadcasting. Is that right?
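
The broadcasting part can at least be checked in isolation. The sketch below uses NumPy, whose broadcasting rules I assume match TensorFlow's for element-wise multiplication; the `features` shapes are my assumptions and are not confirmed against the repository.

```python
import numpy as np

batch_size, num_points, channels = 2, 4, 512  # hypothetical sizes

# Stand-in for the normalized weights: (batch_size, num_points, num_points, 1)
weights = np.full((batch_size, num_points, num_points, 1), 1.0 / num_points)

# Assumed case 1: features already has a pairwise layout
features_full = np.ones((batch_size, num_points, num_points, channels))
print((features_full * weights).shape)  # (2, 4, 4, 512)

# Assumed case 2: features has a length-1 axis that broadcasts against the weights
features_row = np.ones((batch_size, 1, num_points, channels))
print((features_row * weights).shape)   # (2, 4, 4, 512)
```

Both assumed layouts give (batch_size, num_points, num_points, 512) after the multiplication, so the output shape alone does not tell which layout features actually has at that line.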

Thank you in advance.