Closed FarisXiong closed 4 months ago
Hi, thanks for your interest in our work. Our proposed approach accelerates the model computation for training and inference, which does not require the intermediate attention weights to be materialized. If one needs the N*N attention matrix for visualization, one has to explicitly calculate the all-pair weights; there is no algorithm that sidesteps the quadratic complexity in this case.
To obtain the N*N attention matrix, one can simply modify our code. Though we do not provide a version for NodeFormer, here is our implementation for DIFFormer, which can be a helpful reference:
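As a rough sketch of the kind of modification described above (this is not the authors' code; the function name and the assumption that row-normalized non-negative kernel scores are what you want to plot are mine), explicitly materializing the all-pair attention from the feature-mapped queries and keys could look like:

```python
import numpy as np

def dense_attention_from_features(phi_q, phi_k, eps=1e-6):
    """Explicitly build the N x N attention matrix from kernelized
    query/key features phi(q_u), phi(k_v).

    This is O(N^2) in time and memory, so it is only suitable for
    visualization on small graphs, not for training or inference.
    """
    scores = phi_q @ phi_k.T                        # (N, N) all-pair kernel scores
    scores = np.maximum(scores, 0.0)                # clamp to keep weights non-negative
    row_sums = scores.sum(axis=1, keepdims=True) + eps
    return scores / row_sums                        # each row sums to ~1

# Toy usage with random positive features for 5 nodes, feature dim 8.
rng = np.random.default_rng(0)
phi_q = rng.random((5, 8))
phi_k = rng.random((5, 8))
A = dense_attention_from_features(phi_q, phi_k)
```

In the linear-attention forward pass the sum over keys is computed first and the N*N product never appears; the change here is simply to form `phi_q @ phi_k.T` directly before normalizing.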
Thank you for your response and the solution you provided! I really appreciate your help.
I am currently examining the methodology presented in your paper and am unclear about the techniques used to visualize Figures 4 and 7 under the linear complexity constraint. According to Equation 7, it seems infeasible to compute the attention score between an arbitrary pair of nodes using the linear approximation. Could you please clarify the specific approach used for these visualizations under this constraint? Additionally, if there are any modifications or alternative methods you would recommend for handling these calculations more feasibly, I would appreciate your insights.
$$z_u^{(l+1)} \approx \frac{\phi(q_u/\sqrt{\tau})^\top \sum_{\nu=1}^N e^{g_\nu/\tau}\, \phi(k_\nu/\sqrt{\tau})\, v_\nu^\top}{\phi(q_u/\sqrt{\tau})^\top \sum_{\omega=1}^N e^{g_\omega/\tau}\, \phi(k_\omega/\sqrt{\tau})}$$
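Reading the per-pair weight off this approximation (my own rearrangement, not an equation from the paper), node $\nu$'s contribution to node $u$ is

$$a_{u\nu} = \frac{e^{g_\nu/\tau}\,\phi(q_u/\sqrt{\tau})^\top \phi(k_\nu/\sqrt{\tau})}{\sum_{\omega=1}^N e^{g_\omega/\tau}\,\phi(q_u/\sqrt{\tau})^\top \phi(k_\omega/\sqrt{\tau})},$$

so the linear-time trick only avoids forming $a_{u\nu}$ by summing over keys first; recovering every $a_{u\nu}$ for visualization necessarily costs $O(N^2)$.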