DSE-MSU / DeepRobust

A pytorch adversarial library for attack and defense methods on images and graphs
MIT License
995 stars, 192 forks

Question about computing cosine similarity. #45

Closed shaolyy closed 3 years ago

shaolyy commented 3 years ago

https://github.com/DSE-MSU/DeepRobust/blob/d4ebe2d22d2378335f7558a1389dd3e13910fbb2/deeprobust/graph/defense/gcn_preprocess.py#L287-L289

Q1: In L288, should features[n1] and features[n2] be a and b instead?

Q2: Given two vectors of attributes, A and B, the cosine similarity, cos(θ), is represented using a dot product and magnitude as

$$\cos(\theta) = \frac{A \cdot B}{\|A\| \, \|B\|}$$

So in L289, np.sqrt(np.square(a).sum() + np.square(b).sum()), why does it use addition instead of multiplication?
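To make Q2 concrete, here is a small sketch in plain Python. The function names are illustrative and are not taken from DeepRobust. It compares the standard definition, whose denominator is the product of the two norms, with the additive denominator questioned above, using two vectors that point in the same direction and should therefore have similarity 1.

```python
import math

def cosine_similarity(a, b):
    """Standard definition: dot(a, b) / (||a|| * ||b||)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def additive_variant(a, b):
    """Variant with an additive denominator: dot(a, b) / sqrt(||a||^2 + ||b||^2)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / math.sqrt(sum(x * x for x in a) + sum(y * y for y in b))

a = [1.0, 2.0, 2.0]               # norm 3
same_direction = [2.0, 4.0, 4.0]  # 2 * a, norm 6

print(cosine_similarity(a, same_direction))  # 1.0
print(additive_variant(a, same_direction))   # about 2.68, outside [-1, 1]
```

With the additive denominator the result depends on the vectors' lengths and can exceed 1, so it is not a cosine.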

ChandlerBang commented 3 years ago

Hi,

Thanks for mentioning it! You are right. I hadn't tested the cosine similarity function before. I've just updated it and it should now be correct. See commit https://github.com/DSE-MSU/DeepRobust/commit/02102760529e30ac3b08e16b6cf08ab21b19af38

EdisonLeeeee commented 3 years ago

https://github.com/DSE-MSU/DeepRobust/blob/02102760529e30ac3b08e16b6cf08ab21b19af38/deeprobust/graph/defense/gcn_preprocess.py#L262-L293

Hi, your work is awesome,

but I think this part of the code can be accelerated with vectorized NumPy operations, like this (maybe I should make a pull request :P)

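import numpy as np
import scipy.sparse

# A and B are dense 2-D arrays of shape (num_edges, num_features);
# row i holds the features of the two endpoints of edge i.

def jaccard_similarity(A, B):
    intersection = np.count_nonzero(A * B, axis=1)
    # |union| = |A| + |B| - |intersection|
    union = np.count_nonzero(A, axis=1) + np.count_nonzero(B, axis=1) - intersection
    return intersection / (union + 1e-10)

def cosine_similarity(A, B):
    inner_product = (A * B).sum(1)
    norms = np.sqrt(np.square(A).sum(1)) * np.sqrt(np.square(B).sum(1))
    return inner_product / (norms + 1e-10)

# COO format keeps row, col and data aligned entry by entry.
coo = modified_adj.tocoo()
rows, cols, data = coo.row, coo.col, coo.data

# Assumes features is a dense ndarray (call .toarray() first if it is sparse).
A = features[rows]
B = features[cols]
S = cosine_similarity(A, B)
# or
# S = jaccard_similarity(A, B)

# Keep edges whose endpoints are similar enough; drop the rest.
remained = np.where(S >= threshold)[0]
removed_cnt = rows.size - remained.size

rows = rows[remained]
cols = cols[remained]
data = data[remained]

modified_adj = scipy.sparse.csr_matrix((data, (rows, cols)), shape=modified_adj.shape)

ChandlerBang commented 3 years ago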

Hi,

Thank you for pointing it out! Actually, _drop_dissimilar_edges() is deprecated; I now use Numba's njit to accelerate the process. See details at https://github.com/DSE-MSU/DeepRobust/blob/02102760529e30ac3b08e16b6cf08ab21b19af38/deeprobust/graph/defense/gcn_preprocess.py#L314-L349

But feel free to open a pull request to modify the original version!
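For readers who want a feel for the njit approach mentioned above, here is a minimal sketch of a per-edge loop over the raw CSR arrays of an adjacency matrix. It is an illustration under stated assumptions, not the library's implementation: the function name drop_edges_by_jaccard, its argument order, and the toy graph are all hypothetical. Numba is treated as optional, so the same function runs as plain Python when it is not installed.

```python
import numpy as np

try:
    from numba import njit  # optional: compiles the explicit loops below
except ImportError:
    def njit(func):
        # Fallback (assumption for this sketch): run uncompiled.
        return func

@njit
def drop_edges_by_jaccard(data, indptr, indices, features, threshold):
    """Zero out stored CSR entries whose endpoints have Jaccard similarity
    below `threshold`. Modifies `data` in place and returns the number of
    entries zeroed (each direction of an undirected edge counts once)."""
    removed = 0
    for row in range(len(indptr) - 1):
        for k in range(indptr[row], indptr[row + 1]):
            col = indices[k]
            a, b = features[row], features[col]
            intersection = np.count_nonzero(a * b)
            union = np.count_nonzero(a) + np.count_nonzero(b) - intersection
            sim = intersection / union if union > 0 else 0.0
            if sim < threshold:
                data[k] = 0
                removed += 1
    return removed

# Toy graph: nodes 0 and 1 share all features, node 2 shares none with node 0.
features = np.array([[1.0, 1.0, 0.0],
                     [1.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])
# Symmetric adjacency with edges 0-1 and 0-2, written as CSR arrays.
indptr = np.array([0, 2, 3, 4], dtype=np.int64)
indices = np.array([1, 2, 0, 0], dtype=np.int64)
data = np.array([1.0, 1.0, 1.0, 1.0])

removed = drop_edges_by_jaccard(data, indptr, indices, features, 0.5)
print(removed, data)
```

On this toy graph, both stored directions of the 0-2 edge are zeroed and the 0-1 edge is kept. Working directly on the CSR arrays avoids materializing one feature row pair per edge, which is the main memory cost of the fully vectorized version.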