tsujuifu / pytorch_graph-rel

A PyTorch implementation of GraphRel
MIT License
268 stars, 54 forks

Original Dataset #7

Open yin-hong opened 4 years ago

yin-hong commented 4 years ago

Hello! Can you share the original NYT and WebNLG datasets, including the train, dev, and test splits? Thanks a lot!

tsujuifu commented 4 years ago

Hi Michael, I got the dataset from here: https://github.com/xiangrongzeng/copy_re

yin-hong commented 4 years ago

Thanks for your reply! I have downloaded this dataset. However, I find that entity types are not annotated in the WebNLG dataset. How do you solve this problem?

tsujuifu commented 4 years ago

Hi, Michael.

For the original WebNLG dataset, there is no entity type tag (but for NYT, there should be). For the joint entity and relation extraction task, we only care about the relation type and the positions of the two entities, so we don't need the entity type tag.

Sincerely, Tsu-Jui
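To make the point above concrete, here is a minimal sketch of a training example that carries only entity positions and a relation type, with no entity type field. The `Triple` class, the helper names, and the `birth_place` label are hypothetical illustrations; they are not taken from the GraphRel code or from the CopyR data format.

```python
from dataclasses import dataclass
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, both inclusive


@dataclass(frozen=True)
class Triple:
    """One relation instance: two entity positions plus a relation label."""
    head: Span
    tail: Span
    relation: str  # relation type; note there is no entity type field


def make_triple(tokens: List[str], head: Span, tail: Span, relation: str) -> Triple:
    # Reject spans that fall outside the sentence or are reversed.
    for start, end in (head, tail):
        if not (0 <= start <= end < len(tokens)):
            raise ValueError(f"span {(start, end)} is out of range")
    return Triple(head, tail, relation)


def span_text(tokens: List[str], span: Span) -> str:
    # Recover the surface form of an entity from its position.
    return " ".join(tokens[span[0]: span[1] + 1])


tokens = "Barack Obama was born in Hawaii".split()
triple = make_triple(tokens, head=(0, 1), tail=(5, 5), relation="birth_place")
print(span_text(tokens, triple.head), triple.relation, span_text(tokens, triple.tail))
```

Because the entity text can always be recovered from the positions, nothing in this representation depends on an entity type annotation, which is why WebNLG's missing type tags would not block this kind of setup.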

yin-hong commented 4 years ago

So the loss function contains only the relation loss, with no entity loss?

tsujuifu commented 4 years ago

Nope, it contains both the entity loss and the relation loss.

For entities, I only care about which of (B, I, E, S, O) a word belongs to:

B: begin word of an entity
I: inner word of an entity
E: end word of an entity
S: this word is a single-word entity
O: this word does not belong to any entity

Hence, the entity loss comes from a 5-class classification.
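The tagging scheme and the 5-class entity loss described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function names, the example sentence, and the averaging over tokens are hypothetical, entity spans are assumed not to overlap, and the actual repository would presumably compute the loss with PyTorch rather than by hand.

```python
import math

TAGS = ["B", "I", "E", "S", "O"]
TAG_TO_ID = {tag: i for i, tag in enumerate(TAGS)}


def bieso_tags(num_tokens, entity_spans):
    """Tag each token given inclusive (start, end) entity spans."""
    tags = ["O"] * num_tokens
    for start, end in entity_spans:
        if start == end:
            tags[start] = "S"          # single-word entity
        else:
            tags[start] = "B"          # first word
            for i in range(start + 1, end):
                tags[i] = "I"          # inner words, if any
            tags[end] = "E"            # last word
    return tags


def cross_entropy(logits, target):
    """Negative log-softmax of the target class for one token."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def entity_loss(all_logits, tags):
    """Mean 5-class cross-entropy over the tokens of one sentence."""
    losses = [cross_entropy(l, TAG_TO_ID[t]) for l, t in zip(all_logits, tags)]
    return sum(losses) / len(losses)


tokens = ["Barack", "Obama", "was", "born", "in", "Hawaii"]
tags = bieso_tags(len(tokens), [(0, 1), (5, 5)])
print(list(zip(tokens, tags)))

# With uniform logits every class has probability 1/5.
uniform_logits = [[0.0] * len(TAGS) for _ in tokens]
print(entity_loss(uniform_logits, tags))
```

In a joint model, this entity term would be added to the relation classification loss; how the two terms are weighted is not stated in this thread.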

yin-hong commented 4 years ago

Thanks for your reply! I think I now fully understand your approach.

zhihuatao commented 4 years ago

Hello, could you please tell me how to create the pre_tr dataset?

Wangyandong-master commented 4 years ago

Hello, have you gotten the input files? Thanks a lot.

weizhepei commented 4 years ago

@tsujuifu Thanks for the clarification. I'm trying to reproduce your excellent work, but I'm having some trouble preparing the dataset. I checked the preprocessed dataset released by CopyR [Zeng, 2018] and found that the annotated entities are all single-word entities. In this case, should all the entity tags be 'B' when I prepare the training data for the Graph_rel model? Are there any plans to release the preprocessed dataset?

131250208 commented 4 years ago

Hi, CopyR uses the version that annotates only the last word of each entity. Do you also follow this preprocessing setting, or do you preprocess the original dataset released by CopyR and annotate the whole span? Thanks for your reply.
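For readers weighing the two settings raised here, the sketch below shows how the B/I/E/S/O tags would differ between whole-span annotation and last-word-only annotation. Which setting GraphRel actually uses is not answered in this thread; the function names and the example sentence are hypothetical. Under the tag definitions given earlier in the thread, a one-word entity would be tagged S rather than B.

```python
def tag_whole_span(num_tokens, spans):
    """Whole-span setting: every word of an entity gets a tag."""
    tags = ["O"] * num_tokens
    for start, end in spans:
        if start == end:
            tags[start] = "S"
        else:
            tags[start] = "B"
            for i in range(start + 1, end):
                tags[i] = "I"
            tags[end] = "E"
    return tags


def tag_last_word_only(num_tokens, spans):
    """Last-word setting: only the final word of each entity is annotated,
    so every entity looks like a single-word entity (tag S)."""
    tags = ["O"] * num_tokens
    for _start, end in spans:
        tags[end] = "S"
    return tags


tokens = ["Barack", "Obama", "was", "born", "in", "Hawaii"]
spans = [(0, 1), (5, 5)]  # inclusive (start, end) token indices

whole = tag_whole_span(len(tokens), spans)
last = tag_last_word_only(len(tokens), spans)
for token, w, l in zip(tokens, whole, last):
    print(f"{token:8s} whole-span={w} last-word={l}")
```

With last-word-only data, the B, I, and E classes never appear in the targets, so the 5-class entity classifier would effectively be trained as S versus O.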