INK-USC / MHGRN

Scalable Multi-Hop Relational Reasoning for Knowledge-Aware Question Answering (EMNLP 2020)

./data/cpnet/tzw.ent.npy #17

Closed: Yueqing-Sun closed this issue 2 years ago

Yueqing-Sun commented 3 years ago

Hi, I noticed that the node features are described in the paper as:

For the input node features, we first use templates to turn knowledge triples in ConceptNet into sentences and feed them into pre-trained BERT-LARGE, obtaining a sequence of token embeddings from the last layer of BERT-LARGE for each triple. For each entity in ConceptNet, we perform mean pooling over the tokens of the entity's occurrences across all the sentences to form a 1024-d vector as its corresponding node feature. We use this set of features for all our implemented models.

Since the pre-trained language model used in the paper is RoBERTa, why use BERT instead of RoBERTa to encode the nodes? Is there any open-source code for the method of obtaining the node features used in the paper? I want to try some changes to it. Thanks!

yuchenlin commented 2 years ago

Please check this reimplementation: https://gist.github.com/yuchenlin/b298f37428361f52feb89205cd584312
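For orientation, below is a minimal sketch of the procedure as described in the paper quote above (template sentences, BERT-LARGE, mean pooling over entity token occurrences). It is not the gist's exact code; the relation templates and the offset-based matching of entity spans are illustrative assumptions, and the real pipeline covers all ConceptNet triples rather than the toy examples here.

```python
# Sketch of the node-feature extraction described in the MHGRN paper,
# using HuggingFace transformers. Templates and span matching are
# illustrative assumptions, not the authors' exact implementation.
import numpy as np
import torch
from transformers import BertModel, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-large-uncased")
model = BertModel.from_pretrained("bert-large-uncased").eval()

# Hypothetical inputs: (head, relation, tail) triples and per-relation
# templates; the paper's actual template set may differ.
triples = [("dog", "IsA", "animal"), ("dog", "CapableOf", "bark")]
templates = {"IsA": "{h} is a {t}.", "CapableOf": "{h} can {t}."}

entity_vecs = {}  # entity -> list of 1024-d embeddings, one per occurrence

with torch.no_grad():
    for h, r, t in triples:
        sent = templates[r].format(h=h.replace("_", " "), t=t.replace("_", " "))
        enc = tokenizer(sent, return_offsets_mapping=True, return_tensors="pt")
        # Last-layer token embeddings for this sentence: [seq_len, 1024]
        hidden = model(enc["input_ids"],
                       attention_mask=enc["attention_mask"]).last_hidden_state[0]
        offsets = enc["offset_mapping"][0].tolist()
        for ent in (h, t):
            surface = ent.replace("_", " ")
            start = sent.find(surface)
            if start == -1:
                continue
            # Subword tokens whose character span falls inside the entity mention
            span = [i for i, (s, e) in enumerate(offsets)
                    if s >= start and e <= start + len(surface) and e > s]
            if span:  # mean-pool the entity's subword tokens in this sentence
                entity_vecs.setdefault(ent, []).append(hidden[span].mean(0))

# Average over all occurrences of each entity to obtain its node feature,
# then stack into the matrix that would be saved as tzw.ent.npy.
features = {ent: torch.stack(vs).mean(0).numpy() for ent, vs in entity_vecs.items()}
# np.save("tzw.ent.npy", np.stack([features[e] for e in sorted(features)]))
```

For the exact details (template wording, entity ordering, batching), defer to the gist linked above.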