Yujun-Yan / Heterophily_and_oversmoothing

Codes for "Two Sides of the Same Coin: Heterophily and Oversmoothing in Graph Convolutional Neural Networks"
MIT License

Questions about the theory in the manuscript #3

Open With-the-sun opened 11 months ago

With-the-sun commented 11 months ago

Hi! Thanks for the valuable research on heterophily and oversmoothing in GCNs. I have some questions about the manuscript; could I discuss them with you? For instance, about the definition of the metric 'relative degree' at the initial layer versus the 'effective relative degree'. For the initial layer, all neighbor nodes j are considered; however, for a deeper layer l, only the positively contributing neighbor nodes j are considered. Why are negatively contributing neighbors not considered at deeper layers, but considered at the initial layer?

With-the-sun commented 11 months ago

And the second question: in Section 3.3.4 of the manuscript, under 'Relation between the problems', point '(2) In homophilous graphs', why does the homophilous graph have a node labeled 'Class 2'? Shouldn't all nodes of a homophilous graph have the same label, 'Class 1'?

With-the-sun commented 11 months ago

And the third question: as the layers get deeper, does node i's 'effective homophily' get lower?
Thanks.

With-the-sun commented 11 months ago

And the fourth question: in Eq. (41), should the symbol in the red box be a minus sign?

Yujun-Yan commented 11 months ago

Thanks for your interest in our paper.

  1. For both the initial and deeper layers, we consider positive and negative contributing nodes. In our proofs (A.1 and A.2), we condition on whether a neighbor's representation at the k-th layer contributes positively or negatively, and decompose the total expectation into a sum of conditional expectations (e.g., Eq. 12 and the second equality in Eq. 34).
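
Schematically, the decomposition described here is the law of total expectation, conditioning on the sign of a neighbor's contribution at the k-th layer (generic symbols, not the paper's exact notation):

$$
\mathbb{E}[X] \;=\; \mathbb{E}\left[X \mid \text{positive}\right]\Pr(\text{positive}) \;+\; \mathbb{E}\left[X \mid \text{negative}\right]\Pr(\text{negative})
$$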
Yujun-Yan commented 11 months ago
  2. You are right that in heterophilous graphs, most nodes belong to class 1 (e.g., Table 3: Texas, Wisconsin, Actor & Cornell). However, in some real datasets (e.g., Table 3: Squirrel & Chameleon), we also found a non-negligible number of nodes in class 2. That is why we analyze the phenomenon in heterophilous graphs by considering nodes in both class 1 and class 2.
Yujun-Yan commented 11 months ago
  3. Yes, the effective homophily will get lower in heterophilous graphs. We provide experiments in the appendix showing this: see Table B.3.
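
As a rough illustrative proxy (NOT the paper's B.3 measurement, which is defined on learned representations), one can see how label agreement among exactly-k-hop neighbors changes with k; on a bipartite-like heterophilous graph it flips entirely between odd and even hops:

```python
# Illustrative proxy for hop-dependent homophily on a toy heterophilous graph.
def khop_neighbors(adj, node, k):
    """Nodes at shortest-path distance exactly k from `node`, via BFS."""
    frontier, seen = {node}, {node}
    for _ in range(k):
        frontier = {v for u in frontier for v in adj[u]} - seen
        seen |= frontier
    return frontier

def khop_homophily(adj, labels, k):
    """Fraction of (node, exactly-k-hop neighbor) pairs sharing a label."""
    same = total = 0
    for i in adj:
        for j in khop_neighbors(adj, i, k):
            same += labels[i] == labels[j]
            total += 1
    return same / total

# 6-cycle with alternating labels: perfectly heterophilous at 1 hop,
# perfectly homophilous at 2 hops.
adj = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
labels = {i: i % 2 for i in range(6)}
print(khop_homophily(adj, labels, 1))  # 0.0
print(khop_homophily(adj, labels, 2))  # 1.0
```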
Yujun-Yan commented 11 months ago
  4. It should be a + here, because the two negative signs cancel each other in Equation 40, and the total expectation (Equation 41) should be the sum of Equations 39 and 40. That is why we have a positive sign rather than a negative sign there.
With-the-sun commented 10 months ago

> Thanks for your interest in our paper.
>
> 1. For both the initial and deeper layers, we consider positive and negative contributing nodes. In our proofs (A.1 and A.2), we condition on whether a neighbor's representation at the k-th layer contributes positively or negatively, and decompose the total expectation into a sum of conditional expectations (e.g., Eq. 12 and the second equality in Eq. 34).

Thanks for your response. I have studied the appendices except A.4, and your replies to my questions 2-4 are well understood. For question 1: 1.1 - How is 'positive contribution' or 'negative contribution' defined? Is it based on the label after classification? Is it what is mentioned in 'B.3 Measurement of effective homophily'?

1.2 - The positive and negative contributing nodes are indeed considered in the proofs (A.1 and A.2). However, in the definition of 'relative degree', j ranges over node i's neighbors; why, in the 'effective relative degree' of Section 3.3.2 (2), does j range only over the positively contributing neighborhood? What is the theoretical basis for the definition of 'effective relative degree'? Is it a subjective quantity chosen to match proof A.2?

With-the-sun commented 10 months ago

I have a preliminary understanding of most of the theory in the appendices (except A.4), but I still have some doubts and hope you can help me. Sorry to trouble you.

5.1 - In Section 3.3.4, 'Oversmoothing problem': do Case 1 and Case 2 trigger oversmoothing? What is the definition or quantification of oversmoothing?

5.2 - In this article, does oversmoothing mean that 'effective homophily' is reduced to a lower limit (whereas in other papers the oversmoothing problem is more about the Dirichlet energy)? What is the connection or difference between 'effective homophily' and Dirichlet energy; i.e., does low h imply low Dirichlet energy? Is Case 1 low h?
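
For reference, the Dirichlet energy used in other oversmoothing papers is typically E(X) = Σ over edges (i, j) of ||x_i − x_j||² (conventions differ by a factor of ½ or a degree normalization). A minimal sketch, with each undirected edge counted once and no normalization:

```python
def dirichlet_energy(edges, x):
    """Sum over undirected edges (i, j), each listed once, of ||x_i - x_j||^2.
    `x` maps node -> feature vector. Zero iff connected nodes have identical
    features, i.e. fully smoothed representations."""
    return sum(sum((a - b) ** 2 for a, b in zip(x[i], x[j])) for i, j in edges)

edges = [(0, 1), (1, 2)]
x_smooth = {0: [1.0, 2.0], 1: [1.0, 2.0], 2: [1.0, 2.0]}  # identical features
x_mixed  = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
print(dirichlet_energy(edges, x_smooth))  # 0.0
print(dirichlet_energy(edges, x_mixed))   # 2.0 + 1.0 = 3.0
```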

With-the-sun commented 10 months ago
  6. In Section 3.3.4, the nodes of Case 2 transform into Case 1 at deep layers, resulting in 'pseudo-heterophily'. However, if graph data with homophily h = 1 is considered, i.e., only Class 1 exists in the graph, is there still 'pseudo-heterophily'? If not, why does GCN still oversmooth on graph data with homophily h = 1?
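
Independent of pseudo-heterophily, repeated neighborhood averaging alone drives representations together on any connected graph. A minimal sketch with plain mean aggregation and self-loops (no learned weights, not the paper's model):

```python
def mean_aggregate(adj, x):
    """One propagation step: each node averages itself with its neighbors."""
    return {i: [sum(x[j][d] for j in adj[i] | {i}) / (len(adj[i]) + 1)
                for d in range(len(x[i]))]
            for i in adj}

# Path graph 0-1-2: every node has the same label (h = 1), but features
# still collapse to a common value under repeated propagation.
adj = {0: {1}, 1: {0, 2}, 2: {1}}
x = {0: [0.0], 1: [1.0], 2: [2.0]}
for _ in range(50):
    x = mean_aggregate(adj, x)
spread = max(v[0] for v in x.values()) - min(v[0] for v in x.values())
print(round(spread, 6))  # 0.0: features collapse even on a homophilous graph
```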
With-the-sun commented 10 months ago

7.1 - In Section 3.3.3, is 'the movement of representations towards the other class' Case 2? 7.2 - In Section 3.3.3, what is the meaning of 'opposite classes swap places'? Can there be a general, summary conclusion, rather than having to refer to the theoretical reasoning of A.4?

With-the-sun commented 10 months ago
  8. In Section 4.1, Formula 7 represents the weighting of messages from different nodes (i.e., giving edge weights τ). Why was the definition of relative degree from Thm 3.1 changed? Is this mathematically equivalent to putting τ into Formula 14?
With-the-sun commented 10 months ago

9.1 - In Section 4.1, for the 'Structure-based Edge Correction': since λ0 and λ1 are obtained by learning, how is it ensured that Formula (8) satisfies 'Intuitively, when r is small, we would like to compensate for it via a larger τ'? Is it just because the r in Formula (8) is in the denominator?
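
A hypothetical sketch of the mechanism being asked about (the exact form of Formula (8) should be checked against the paper; `tau`, `lam0_raw`, `lam1_raw` and the softplus constraint here are illustrative assumptions, not the paper's parameterization): if the learned coefficients are kept positive, then placing r in the denominator alone guarantees that smaller r gets a larger compensating τ.

```python
import math

def softplus(z):
    """Smooth positivity constraint: softplus(z) = log(1 + e^z) > 0."""
    return math.log1p(math.exp(z))

def tau(r, lam0_raw, lam1_raw):
    """HYPOTHETICAL correction of the form tau = a + b / r with a, b > 0
    enforced via softplus on the raw learned parameters. Monotonically
    decreasing in r, so a small relative degree r yields a larger tau."""
    return softplus(lam0_raw) + softplus(lam1_raw) / r

taus = [tau(r, 0.3, -0.7) for r in (0.25, 1.0, 4.0)]
print(taus == sorted(taus, reverse=True))  # True: smaller r -> larger tau
```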

9.2 - In Section 4.1, edge weights are learned at each layer. GAT's edge weights are also learned at each layer, but GAT cannot learn edge signs; so are signed edge weights an important reason why GGCN outperforms GAT?
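
A toy illustration of why learnable signs can matter under heterophily (this is not GGCN's actual update rule, just a one-dimensional sketch): with positive-only weights, as in a GAT-style softmax, a node is always pulled toward its opposite-class neighbors, whereas a negative weight can push it away from them.

```python
# Node 0 (class A, feature +1.0) with two class-B neighbors (feature -1.0).
def aggregate(x_self, neighbor_feats, weights, self_weight=1.0):
    """Weighted combination of self and neighbor features, normalized by
    total absolute weight so signed weights stay comparable in scale."""
    s = self_weight * x_self + sum(w * f for w, f in zip(weights, neighbor_feats))
    return s / (self_weight + sum(abs(w) for w in weights))

neighbors = [-1.0, -1.0]
pos = aggregate(1.0, neighbors, [0.5, 0.5])    # positive-only weights
neg = aggregate(1.0, neighbors, [-0.5, -0.5])  # signed weights
print(pos, neg)  # 0.0 1.0: positive weights erase the class signal,
                 # negative weights preserve it
```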

With-the-sun commented 10 months ago

10.1 - In A.1, Proof of Theorem 3.1: is the symbol in the gray part of the figure a negative sign?

10.2 - In Formula (51), should the gray part have a u? 10.3 - In Formula (27), should the gray part be f_j?

With-the-sun commented 10 months ago

I wonder whether I might have the honor of getting your contact information, e.g., a chat app or other means?