512 and 1024? - Githubissues

thunlp / PL-Marker

Source code for "Packed Levitated Marker for Entity and Relation Extraction"

MIT License

512 and 1024? #23

Closed Jay0412 closed 2 years ago

Jay0412 commented 2 years ago

As far as I know, BERT limits its position embeddings to 512. However, when I looked at the code, I found that the position ids, input ids, etc. have size 1024. I am quite confused about this. Could you explain the difference between the two?

Jay0412 commented 2 years ago

+) I have one more question. As I understand it, this model uses levitated markers for relation extraction. However, your script for training RE uses run_re.py, which does not seem to use levitated markers. What is the difference between run_re.py and run_levitatedpair.py? Also, could you explain the exact meaning of the BertForACEBothSub, BertForACEBothOneDropoutSpanSub, BertForACEBothOneDropoutSub, BertForACEBothOneDropoutLeviPair, and BertForACEBothOneDropout classes?

YeDeming commented 2 years ago

run_levitatedpair.py is for the ablation study (two pairs of levitated markers). BertForACEBothSub, BertForACEBothOneDropoutSpanSub, BertForACEBothOneDropoutSub, BertForACEBothOneDropoutLeviPair, and BertForACEBothOneDropout are a series of attempts; in the end we use BertForACEBothOneDropoutSub, which is the default.

Jay0412 commented 2 years ago

Thank you for answering. Okay, so does that mean BertForACEBothOneDropoutSub uses levitated markers for relation extraction? Also, I would still like to know about the first question (512 and 1024). Could you explain it, please?

YeDeming commented 2 years ago

BertForACEBothOneDropoutSub uses a hybrid of solid markers and levitated markers for relation extraction.
We make the levitated markers share the same position ids as the span boundary tokens. For example:
tokens: I like apple pie
position id: 0 1 2 2 3 3

Although the sequence length is greater than 512, the position ids stay within BERT's 512 positions.
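The idea of markers reusing the position ids of span boundary tokens can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: `build_position_ids` is a hypothetical helper, not a function from the PL-Marker code, and appending the marker pairs after the text is a layout chosen here for simplicity. It shows that adding markers makes the sequence longer without raising the largest position id.

```python
def build_position_ids(num_text_tokens, spans):
    """Hypothetical helper: position ids for text tokens plus one marker pair per span.

    Text tokens get ids 0..num_text_tokens-1. Each (start, end) span adds two
    marker tokens that reuse the ids of the span's boundary tokens.
    """
    position_ids = list(range(num_text_tokens))
    for start, end in spans:
        if not (0 <= start <= end < num_text_tokens):
            raise ValueError(f"span ({start}, {end}) is outside the text")
        position_ids.append(start)  # start marker shares the id of the first span token
        position_ids.append(end)    # end marker shares the id of the last span token
    return position_ids


tokens = ["I", "like", "apple", "pie"]
spans = [(2, 3)]  # the candidate span "apple pie"

position_ids = build_position_ids(len(tokens), spans)
print(position_ids)                          # [0, 1, 2, 3, 2, 3]
print(len(position_ids), max(position_ids))  # 6 3
```

With many candidate spans the sequence can grow well past 512 entries, while the largest position id never exceeds the index of the last text token.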

Jay0412 commented 2 years ago

Thanks, I understand both of them now. As you explained, I also think the position ids stay within 512. However, when I print the shape of the position ids in an item, it is 1024. I can't understand how they can have a length of 1024, and the same goes for the others (input_ids, attention mask, etc.).

YeDeming commented 2 years ago

A Transformer can support any sequence length.

Jay0412 commented 2 years ago

Sorry, I'm a beginner in NLP... Could you explain in more detail? I don't get it. How can a Transformer support any length even though BERT limits position ids to 512 positions?

YeDeming commented 2 years ago

Yes.

Jay0412 commented 2 years ago

Well... so, any explanation?

YeDeming commented 2 years ago

BERT limits position ids to 512 distinct values, but we can use the same position id for different tokens. For example, we could give all 1024 tokens position id 0.
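A toy lookup makes the distinction concrete. The sketch below is an assumption-laden stand-in, not BERT's actual implementation: `position_table` and `lookup_positions` are hypothetical names, and the table holds tiny placeholder vectors. The point is that the table size limits which id values are valid, not how many ids a sequence may contain.

```python
MAX_POSITIONS = 512  # number of rows in the position embedding table (as in standard BERT)
HIDDEN = 4           # toy vector size, chosen only for this illustration

# Toy stand-in for a learned position embedding table: one vector per position id.
position_table = [[float(p)] * HIDDEN for p in range(MAX_POSITIONS)]


def lookup_positions(position_ids):
    """Return one embedding vector per position id; ids must index into the table."""
    for p in position_ids:
        if not 0 <= p < MAX_POSITIONS:
            raise IndexError(f"position id {p} is outside the table")
    return [position_table[p] for p in position_ids]


# 1024 tokens that all reuse position id 0: the lookup succeeds.
embedded = lookup_positions([0] * 1024)
print(len(embedded))  # 1024

# 1024 tokens with ids 0..1023: the lookup fails, because ids 512 and up have no row.
try:
    lookup_positions(list(range(1024)))
except IndexError as err:
    print("lookup failed:", err)
```

So a 1024-long `position_ids` tensor is fine as long as every value in it points to an existing row of the table.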

Jay0412 commented 2 years ago

Ah..! I understand! Thank you very much for your kind reply :)