Open · liuyeah opened this issue 2 years ago
We're glad you like our work. Thank you! The situation you mentioned occurs only in the initial iteration of the algorithm. As the iterations progress, the b_ij values become different from each other because of the update mechanism in Line 6. The e_i are the concept embeddings and are initialized with different word embeddings.
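For intuition, here is a minimal numpy sketch of capsule-style routing-by-agreement in the spirit of Sabour et al.; it is not the exact formulation in the paper, and names such as `u_hat`, `b`, and `c` are illustrative only. It shows that even though all b_ij start at zero (so the first softmax is uniform), the agreement update (the counterpart of Line 6) breaks the symmetry in later iterations.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def squash(s, axis=-1, eps=1e-8):
    # Capsule-network non-linearity: keeps the vector direction, shrinks the length.
    norm2 = (s ** 2).sum(axis=axis, keepdims=True)
    return (norm2 / (1.0 + norm2)) * s / np.sqrt(norm2 + eps)

def dynamic_routing(u_hat, n_iters=3):
    """u_hat[i, j]: prediction vector from concept i for child class j.
    Shape (n_concepts, n_children, dim). Returns child embeddings and couplings."""
    n_concepts, n_children, _ = u_hat.shape
    b = np.zeros((n_concepts, n_children))       # Line 1: all b_ij = 0
    for _ in range(n_iters):
        c = softmax(b, axis=1)                   # Line 3: uniform only at the first pass
        s = (c[..., None] * u_hat).sum(axis=0)   # weighted sum over concepts
        e = squash(s)                            # candidate child class embeddings
        b = b + (u_hat * e[None, :, :]).sum(-1)  # Line 6: agreement update breaks the tie
    return e, softmax(b, axis=1)

rng = np.random.default_rng(0)
u_hat = rng.normal(size=(4, 3, 8))               # 4 concepts, 3 child classes, dim 8
e, c = dynamic_routing(u_hat)
print(np.round(c, 3))                            # no longer a uniform distribution
```

With the toy tensor above, the printed coupling matrix already differs across child classes after three iterations, which is the behaviour described in the reply.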
Thank you for your reply! In my opinion, if the relation between each child class j and the parent concept i is different, then the updated b_ij values will become different after the iterations.
However, I haven't found in the paper how the relation between a parent concept and its child classes is built. If the relation only says that a child class is related to the concepts of its parent class, then the learned child class embeddings will be the same for all child classes under the same parent class.
Besides, in the Introduction you mention that "An external knowledge source is taken as an initial reference of the child classes". However, I don't understand how the external embedding is used to initialize the child classes in this algorithm.
What's more, I don't know whether you speak Chinese. If so, maybe we could communicate more conveniently!
I hope you can reply to my question. Thank you very much!
Sincerely
- If the child class embeddings are not initialized, the final classification signal can still supervise the learning of each child class embedding, and they will eventually become different; this is analogous to the weight matrix of a classification layer. If I remember correctly, in capsule networks the embedding of each capsule layer is randomly initialized, and once the supervision signal is back-propagated and iterative training proceeds, the capsule embeddings learn to differ; this is visualized in their paper and also in Section 3.6 of ours.
- If the child class embeddings are initialized, they are more semantically meaningful and more distinguishable at the initialization stage of the algorithm, which yields slightly better results. The initialization is described at the end of Section 2.2 (page 5013), and the related ablation results are in the second-to-last paragraph of Section 3.5. The initialization uses the encyclopedia definition of each class (Section 2.3), as explained in the paper; a rough sketch of this idea follows below.
I hope the above answers your questions.
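As a very rough illustration only (the paper's own procedure is the one described in Sections 2.2 and 2.3), one simple way to turn an encyclopedia definition into an initial child class embedding is to pool pretrained word vectors over the definition text; the `word_vectors` lookup and the mean pooling below are assumptions, not the authors' implementation.

```python
import numpy as np

def init_class_embedding(definition, word_vectors, dim=300):
    """Average pretrained word embeddings of an encyclopedia definition to get
    an initial child class embedding. `word_vectors` is a hypothetical
    {word: np.ndarray} lookup (e.g. loaded from GloVe)."""
    tokens = definition.lower().split()
    vecs = [word_vectors[t] for t in tokens if t in word_vectors]
    if not vecs:
        # Fall back to a small random vector if no word is covered.
        return np.random.normal(scale=0.1, size=dim)
    return np.mean(vecs, axis=0)

# Toy usage with made-up 4-d vectors instead of real pretrained embeddings.
word_vectors = {
    "sport": np.array([0.1, 0.2, 0.3, 0.4]),
    "played": np.array([0.5, 0.1, 0.0, 0.2]),
    "teams": np.array([0.3, 0.3, 0.1, 0.0]),
}
e_child = init_class_embedding("Basketball is a sport played by two teams",
                               word_vectors, dim=4)
print(e_child)
```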
Thank you for your reply! If it's convenient, could you leave your email address so that I can ask you questions in the future? The Tencent email address given in your paper seems to be no longer valid.
Thank you very much!
This work is so interesting and inspiring! However, I have a question about the Dynamic Routing mechanism in your work.
In this algorithm, you first set all b_ij to zero (Line 1). Then, in each iteration, you first apply a softmax to b_i (Line 3). If I understand correctly, the softmax result is a uniform distribution. As a result, the learned child class embeddings of the same parent class will all be the same.
I hope you can reply to my question. Thank you very much!