RuoyuChen10 / Shape_Context_Matching

Shape Context Matching
MIT License

Rotation Invariant? #2

Open ahnHeejune opened 1 year ago

ahnHeejune commented 1 year ago

Your SCM implementation is very efficient and mostly good, but I think it misses the ROTATION invariance property. Rotation invariance and scale invariance are important properties in most applications.

1) Did you implement rotation invariance? If yes, which line implements it? 2) If not, I recommend you take the most crowded angle bin as the reference angle. What does the original paper say about it?

RuoyuChen10 commented 1 year ago

Your SCM implementation is very efficient and mostly good, but I think it misses the ROTATION invariance property. Rotation invariance and scale invariance are important properties in most applications.

  1. Did you implement rotation invariance? If yes, which line implements it?
  2. If not, I recommend you take the most crowded angle bin as the reference angle. What does the original paper say about it?

Thanks. Sorry, my code comments are all in Chinese. I think my code contains basic scale invariance and rotation invariance.

To compute the shape context histogram matrix, the descriptor is divided into 12 angle bins and 5 distance bins (both can be configured in the code). Because of the angle binning, I think this method should be insensitive to rotation. For distance, the code first computes the mean pairwise distance and divides by it, which reduces scaling sensitivity, so scale invariance is also achieved. You can see the 10th cell of shape_context_matching.ipynb:

import numpy as np
import math

def Shape_Context(points, angle=12, distance=[0, 0.125, 0.25, 0.5, 1.0, 2.0]):
    """
    Construction of the shape context histogram matrix.
    points:   sampled points, shape [N, 2]
    angle:    number of angle bins
    distance: bin edges applied to the log-normalized distances
    """
    # Compute the pairwise Euclidean distance matrix
    N = points.shape[0]
    dist = np.sqrt(np.sum(np.square(points.reshape((1, N, 2)) - points.reshape((N, 1, 2))), axis=-1))

    # Mean pairwise distance (excluding the zero diagonal)
    mean_dist = np.sum(dist) / (N * N - N)
    # Divide by the mean to reduce scaling sensitivity; the large diagonal
    # offset keeps each point out of its own histogram bins
    dist = np.log(dist / mean_dist + 1e-12) + np.eye(N, dtype=int) * 999

    # Pairwise angles, mapped to the range (0, 2) in units of pi
    dx = points[:, 0].reshape(1, N) - points[:, 0].reshape(N, 1)
    dy = points[:, 1].reshape(1, N) - points[:, 1].reshape(N, 1)
    theta = np.arctan(dy / (dx + 1e-12)) / math.pi + (dx < 0).astype(int) + 0.5

    # Per-point histogram over angle x distance bins
    # (only the first len(distance) - 1 distance columns are filled)
    histogram_feature = np.zeros((N, angle, len(distance)))

    for i in range(angle):
        # Points falling into the i-th angle bin
        angle_matrix = (theta > (2 / angle * i)) * (theta <= (2 / angle * (i + 1)))
        for j in range(1, len(distance)):
            # Points falling into the (j-1)-th distance bin
            distance_matrix = (dist < distance[j]) * (dist > distance[j - 1])
            histogram_feature[:, i, j - 1] = np.sum(angle_matrix * distance_matrix, axis=1)

    return histogram_feature
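
For illustration, here is a minimal usage sketch of the function above (not from the notebook; the circle points are just placeholder data, and it assumes the imports shown with the function):

import numpy as np

# 50 placeholder points sampled on a unit circle, shape (N, 2)
t = np.linspace(0, 2 * np.pi, 50, endpoint=False)
points = np.stack([np.cos(t), np.sin(t)], axis=1)

hist = Shape_Context(points)
print(hist.shape)  # (50, 12, 6): per-point angle-by-distance histogram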
RuoyuChen10 commented 1 year ago

Your SCM implementation is very efficient and mostly good, but I think it misses the ROTATION invariance property. Rotation invariance and scale invariance are important properties in most applications.

  1. Did you implement rotation invariance? If yes, which line implements it?
  2. If not, I recommend you take the most crowded angle bin as the reference angle. What does the original paper say about it?

By the way, do you need an English version of shape_context_matching.ipynb?

ahnHeejune commented 1 year ago

Not necessarily. But do you mean rotation insensitivity only at the ~15 degree level? The SCM journal paper mentions complete rotation invariance, which can be achieved by taking the tangent at each point as the x-axis. Do you understand what I mean?

RuoyuChen10 commented 1 year ago

ROTATION invariance

Not necessarily. But do you mean rotation insensitivity only at the ~15 degree level? The SCM journal paper mentions complete rotation invariance, which can be achieved by taking the tangent at each point as the x-axis. Do you understand what I mean?

I see. In the paper:

Belongie, Serge, Greg Mori, and Jitendra Malik. "Matching with shape contexts." Statistics and Analysis of Shapes. Birkhäuser Boston, 2006. 81-105.

In the shape context framework, we can provide for complete rotation invariance if this is desirable for an application. Instead of using the absolute frame for computing the shape context at each point, one can use a relative frame, based on treating the tangent vector at each point as the positive x-axis. In this way the reference frame turns with the tangent angle, and the result is a completely rotation-invariant descriptor. However, it should be emphasized that in many applications complete invariance impedes recognition performance, e.g., when distinguishing 6 from 9, rotation invariance would be completely inappropriate. Another drawback is that many points will not have well-defined or reliable tangents. Moreover, many local appearance features lose their discriminative power if they are not measured in the same coordinate system.

Sorry, this part is not implemented. I will consider adding this feature; thank you for your suggestion.
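
For illustration, a minimal sketch of the relative-frame idea from the quote above (my own sketch, not the repository's code; it assumes the sampled points are ordered along the contour, and tangent_angles / relative_theta are hypothetical helper names): estimate a tangent direction at each point from its contour neighbours, then measure the pairwise angles relative to that tangent before binning.

import numpy as np

def tangent_angles(points):
    """Rough tangent estimate: angle of the segment joining each point's
    contour neighbours (assumes points are ordered along the contour)."""
    d = np.roll(points, -1, axis=0) - np.roll(points, 1, axis=0)
    return np.arctan2(d[:, 1], d[:, 0])          # range (-pi, pi]

def relative_theta(points):
    """Pairwise angles measured relative to each point's tangent direction,
    folded into [0, 2*pi); divide by np.pi to match the (0, 2) range used
    by Shape_Context above."""
    N = points.shape[0]
    dx = points[:, 0].reshape(1, N) - points[:, 0].reshape(N, 1)
    dy = points[:, 1].reshape(1, N) - points[:, 1].reshape(N, 1)
    abs_theta = np.arctan2(dy, dx)               # absolute pairwise angles
    return np.mod(abs_theta - tangent_angles(points).reshape(N, 1), 2 * np.pi)

Using something like this in place of the absolute theta would make the descriptor fully rotation-invariant, with the caveats the paper mentions (unreliable tangents, and loss of discrimination in cases like 6 vs. 9).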

RuoyuChen10 commented 1 year ago

2. If not, I recommend you take the most crowded angle bin as the reference angle.

I don't understand what "take the most crowded angle bin as the reference angle" means in practice.

Is this mentioned in the paper?

ahnHeejune commented 1 year ago

Someone did it like this: https://medium.com/machine-learning-world/shape-context-descriptor-and-fast-characters-recognition-c031eac726f9

Zireael07 commented 11 months ago

The guy from Medium just rotates the output relative to the most extreme angles.
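
For what it's worth, a minimal sketch of that normalisation trick as I read the suggestion (not code from this repository or from the Medium post; rotation_normalize is a hypothetical helper): circularly shift each point's histogram along the angle axis so its most crowded angle bin comes first, which cancels a global rotation up to the width of one angle bin (30 degrees with 12 bins).

import numpy as np

def rotation_normalize(histogram_feature):
    """Shift each point's histogram along the angle axis so that the most
    crowded angle bin sits at index 0. histogram_feature has shape
    (N, angle_bins, distance_bins), e.g. the output of Shape_Context."""
    angle_mass = histogram_feature.sum(axis=2)          # (N, angle_bins)
    normalized = np.empty_like(histogram_feature)
    for k, hist in enumerate(histogram_feature):
        shift = int(np.argmax(angle_mass[k]))           # densest angle bin
        normalized[k] = np.roll(hist, -shift, axis=0)   # rotate the bins
    return normalized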