wenhuchen / LogicNLG

The data and code for ACL2020 paper "Logical Natural Language Generation from Open-Domain Tables"
MIT License
166 stars 22 forks

How are the [ENT] tokens replaced with entity names? #10

Closed vnik18 closed 3 years ago

vnik18 commented 3 years ago

In the coarse-to-fine generation scheme, after the template containing [ENT] tokens is generated, how are the [ENT] tokens replaced with the actual entity names? This is not clear from the paper. Could you please point out where this is done in the code? Thank you.

wenhuchen commented 3 years ago

It's not a rule-based process; the replacement is done by the model itself. You can see this by running the following command:

CUDA_VISIBLE_DEVICES=0 python GPT2-coarse-to-fine.py --do_test --load_from models/GPT_stage2_C2F_ep13.pt

At https://github.com/wenhuchen/LogicNLG/blob/ffa8e2637e741855b46c9d9663018fb1d1a57585/GPT2-coarse-to-fine.py#L266, the decoded text is originally in the form 'MASKED SENTENCE [SEP] REAL SENTENCE'. In L267, I find the [SEP] token and extract only the REAL SENTENCE.

The whole process, both generating the [ENT] template and filling in the [ENT] tokens, is done by the same GPT2 model. All details are in https://github.com/wenhuchen/LogicNLG/blob/master/GPT2-coarse-to-fine.py
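
For readers who want to see the post-processing step in isolation, here is a minimal sketch of the [SEP] split described above. It is not the code from GPT2-coarse-to-fine.py: the function name `extract_real_sentence`, the example string, and the fallback of returning the full text when no [SEP] is present are all assumptions made for illustration.

```python
SEP = "[SEP]"  # separator literal, assumed from the description above


def extract_real_sentence(decoded: str, sep: str = SEP) -> str:
    """Return the text after the first separator.

    If the separator is missing, return the whole string unchanged
    (an assumed fallback, not taken from the repository).
    """
    idx = decoded.find(sep)
    if idx == -1:
        return decoded.strip()
    # Skip past the separator itself and trim surrounding whitespace.
    return decoded[idx + len(sep):].strip()


# Hypothetical decoded output: masked template, separator, filled-in sentence.
example = "[ENT] scored more points than [ENT] [SEP] team a scored more points than team b"
print(extract_real_sentence(example))  # team a scored more points than team b
```

The point of the sketch is that nothing here maps [ENT] to entity names: the model has already written the filled-in sentence after [SEP], and the script only discards the masked half.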

vnik18 commented 3 years ago

Thank you