Closed jczhang02 closed 8 months ago
@jczhang02 Hey JC, do you still need help? This formula is a simplified take on $(QK^T)V$, if one assumes that:
- Each row of $Q, K, V$ corresponds to a position $x_i$ in a "discretization"; as such, the $i$-th row of $Q$ is equal to $\vec{q}(x_i)$, where $\vec{q}(\cdot)$ is a vector-valued feature map.
- Denote $\kappa(x_i, \xi_j) := \vec{q}(x_i)\cdot \vec{k}(\xi_j)$; the matrix $QK^T$ is then an evaluation of the Green's function, or kernel, at the grid points (it characterizes how two "points" interact).
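The identification above can be checked numerically. Below is a minimal NumPy sketch, not code from the repository: the feature maps `q`, `k`, `v`, the grid size `n`, and the choice of using one grid for both $x_i$ and $\xi_j$ are all assumptions made up for illustration. It compares row $i$ of $(QK^T)V$ against the kernel sum $\sum_j \kappa(x_i, \xi_j)\,\vec{v}(\xi_j)$.

```python
import numpy as np

# Hypothetical vector-valued feature maps; any smooth choice works for this check.
def q(x):
    x = np.asarray(x, dtype=float)
    return np.stack([np.ones_like(x), x, x**2], axis=-1)

def k(xi):
    xi = np.asarray(xi, dtype=float)
    return np.stack([np.ones_like(xi), np.cos(xi), np.sin(xi)], axis=-1)

def v(xi):
    xi = np.asarray(xi, dtype=float)
    return np.stack([np.sin(2 * np.pi * xi), xi], axis=-1)

def kappa(x_i, xi_j):
    # Kernel evaluated at one pair of points: q(x_i) . k(xi_j)
    return float(q(x_i) @ k(xi_j))

n = 32                               # assumed grid size
grid = np.linspace(0.0, 1.0, n)      # x_i and xi_j share one grid here
Q, K, V = q(grid), k(grid), v(grid)  # shapes (n, 3), (n, 3), (n, 2)

# Matrix form: attention without softmax.
Z_matrix = (Q @ K.T) @ V

# Kernel form: row i is sum_j kappa(x_i, xi_j) * v(xi_j).
Z_kernel = np.zeros_like(Z_matrix)
for i in range(n):
    for j in range(n):
        Z_kernel[i] += kappa(grid[i], grid[j]) * V[j]

print(np.allclose(Z_matrix, Z_kernel))
```

The double loop is deliberately naive so that each term of the sum is visible; the matrix product computes the same quantity in one step.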
Hi, @scaomath. I could not figure out where the first term in Eq. (9) comes from until I understood the meaning of "skip-connection". So, is Eq. (9) a simplified take on $(QK^T)V$ with residuals?
Besides, I think this work is pretty awesome! I haven't seen such great ideas in operator learning before, and your codebase is very well maintained.
As you know, AI4Science is still a niche research topic, limited by the fact that almost no one is very good at both partial differential equations and deep learning. I'm also often torn about whether to continue my research on this topic, because I can't find peers to discuss and learn with.
If possible, could I get your contact info? For example, email, Telegram, WeChat, and so on. You can send it via email (my address: jczhang@live.it).
> Hi, @scaomath. I could not figure out where the first term in Eq. (9) comes from until I understood the meaning of "skip-connection". So, is Eq. (9) a simplified take on $(QK^T)V$ with residuals?
You can view this as a special, simplified case of $Z \gets V + (QK^T)V$ (plus other operations), while Eq. (9) is how an integral equation is normally written.
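To spell out the correspondence, here is a sketch under two assumptions that are not stated in this thread: the points $\xi_j$ are uniformly spaced on a domain $\Omega$, and a quadrature weight $h = |\Omega|/n$ is absorbed into the normalization of $QK^T$. Writing $\vec{v}(x_i)$ for the $i$-th row of $V$ and $\vec{z}(x_i)$ for the $i$-th row of $Z$, the update reads row by row as:

$$
\vec{z}(x_i) = \vec{v}(x_i) + h\sum_{j=1}^{n} \kappa(x_i, \xi_j)\,\vec{v}(\xi_j)
\;\approx\; \vec{v}(x_i) + \int_{\Omega} \kappa(x_i, \xi)\,\vec{v}(\xi)\,\mathrm{d}\xi
$$

The first term on the right is the skip-connection, and the sum is the $(QK^T)V$ part read as a Riemann sum for the integral.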
> Besides, I think this work is pretty awesome! I haven't seen such great ideas in operator learning before, and your codebase is very well maintained.
> As you know, AI4Science is still a niche research topic, limited by the fact that almost no one is very good at both partial differential equations and deep learning. I'm also often torn about whether to continue my research on this topic, because I can't find peers to discuss and learn with.
> If possible, could I get your contact info? For example, email, Telegram, WeChat, and so on. You can send it via email (my address: jczhang@live.it).
My email is scao@umkc.edu.
Below are some more recent developments in PDE operator learning using Transformers:
- https://openreview.net/forum?id=EPPqt3uERT
- https://arxiv.org/abs/2302.14376
- https://www.sciencedirect.com/science/article/abs/pii/S0021999124001931
I want to ask a question about the original paper: how can the equation below be derived from scaled dot-product attention?