IndoNLP / indonlu

The first-ever vast natural language processing benchmark for Indonesian Language. We provide multiple downstream tasks, pre-trained IndoBERT models, and a starter code! (AACL-IJCNLP 2020)
https://indobenchmark.com
Apache License 2.0

Textual entailment usage #3

Closed rzsgrt closed 4 years ago

rzsgrt commented 4 years ago

Hi, thanks for publishing this work. Specifically for the textual entailment task, how do we use your model, since we need to feed in two sentences?

SamuelCahyawijaya commented 4 years ago

Hi @rezasugiarto: It is the same as the common way BERT handles pair-sequence input data. Basically, it is done with the following format:

[CLS]<Text_1>[SEP]<Text_2>[SEP], where [CLS] denotes the classification token, [SEP] denotes the separator token, and <Text_1> and <Text_2> denote the text pair.

Following the original BERT model, we also add different token_type embeddings for <Text_1> and <Text_2>.

For simplicity, to create the aforementioned format and token type ids, you can use the BertTokenizer.encode_plus() function, as shown at: https://github.com/indobenchmark/indonlu/blob/a698339222a5c214e6a693e81f8c66785cf35477/utils/data_utils.py#L463
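
As a quick sanity check outside the repo code, here is a minimal sketch of that pair encoding (and a forward pass) with the Hugging Face transformers API; the checkpoint name indobenchmark/indobert-base-p1, the number of labels, and the example sentences are assumptions for illustration, not the exact setup used in utils/data_utils.py:

```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

# Assumed checkpoint name for illustration; any BERT-style IndoBERT
# tokenizer handles pair inputs the same way.
tokenizer = BertTokenizer.from_pretrained("indobenchmark/indobert-base-p1")

premise = "Anak itu sedang bermain bola di lapangan."      # <Text_1>
hypothesis = "Seorang anak sedang berada di luar ruangan."  # <Text_2>

# encode_plus builds [CLS] <Text_1> [SEP] <Text_2> [SEP] and the matching
# token_type_ids (0 for the first segment, 1 for the second).
encoded = tokenizer.encode_plus(
    premise,
    hypothesis,
    add_special_tokens=True,
    max_length=128,
    truncation=True,
    padding="max_length",
    return_tensors="pt",
)

print(encoded["input_ids"][0][:20])
print(encoded["token_type_ids"][0][:20])  # 0s for the premise, 1s for the hypothesis

# Assumption: a plain sequence-classification head with num_labels set to
# the number of entailment classes; the repo's own fine-tuning code may
# configure this differently.
model = BertForSequenceClassification.from_pretrained(
    "indobenchmark/indobert-base-p1", num_labels=2
)
model.eval()
with torch.no_grad():
    logits = model(**encoded).logits  # uses input_ids, token_type_ids, attention_mask
print(logits.argmax(dim=-1))
```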

For further details on the BERT model, you can read the original BERT paper at: https://arxiv.org/abs/1810.04805

rzsgrt commented 4 years ago

Clear enough, thank you