shijx12 / KQAPro_Baselines

PyTorch implementation of baseline models for KQA Pro, a large-scale dataset for complex question answering over knowledge bases.
http://thukeg.gitee.io/kqa-pro/
MIT License
What do <b> and <c> mean in Program? #9

Closed: yhshu closed this issue 3 years ago

yhshu commented 3 years ago

I assume that <b> separates functions and <c> separates the arguments of a function. Is that correct?

Thanks for your reply.

ShulinCao commented 3 years ago

Exactly~
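
For readers who want to see what that means in practice, here is a minimal sketch of splitting a serialized Program on those two delimiters. It assumes that the first field of each function chunk is the function name and the remaining fields are its arguments; that layout, the helper name `parse_program`, and the example string are illustrative assumptions, not taken from the repository.

```python
from typing import List, Tuple

FUNC_SEP = "<b>"  # separates functions (confirmed in this thread)
ARG_SEP = "<c>"   # separates the arguments of a function (confirmed in this thread)


def parse_program(program: str) -> List[Tuple[str, List[str]]]:
    """Split a serialized program into (function_name, arguments) pairs.

    Assumption: the first <c>-separated field of each chunk is the
    function name. This helper is hypothetical, not the repository's code.
    """
    steps = []
    for chunk in program.split(FUNC_SEP):
        fields = [field.strip() for field in chunk.split(ARG_SEP)]
        if not fields[0]:
            continue  # skip empty chunks, e.g. from a trailing <b>
        steps.append((fields[0], fields[1:]))
    return steps


# Made-up example string, for illustration only.
example = "Find <c> LeBron James <b> Relate <c> drafted by <c> forward <b> What"
for name, args in parse_program(example):
    print(name, args)
```

Returning a list of pairs keeps the function order intact, which matters because each function in a Program operates on the output of earlier ones.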

yhshu commented 3 years ago

I haven't yet run the experiments far enough to generate results, so I have a question. Transformer tokenizers, such as the BART tokenizer used here, generally split symbols like < and > from words. Is there any post-processing of the seq2seq output to undo this side effect, or to handle possibly ill-formed SPARQL or Program output?

ShulinCao commented 3 years ago

For SPARQL, there is a post-processing step, as shown in https://github.com/shijx12/KQAPro_Baselines/blob/2c5049900ba11947daf32cffcdfbdb4e985f1108/Bart_SPARQL/predict.py#L90.
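
The linked function is not reproduced here. For the Program format, a comparable cleanup might look like the sketch below. It assumes that decoded text can contain stray or missing spaces around the <b> and <c> delimiters; whether the BART tokenizer actually produces this is not confirmed in this thread. The helper `normalize_delimiters` is hypothetical and not part of the repository.

```python
import re

# Matches <b> or <c>, tolerating whitespace inside the angle brackets.
_DELIM = re.compile(r"<\s*([bc])\s*>")


def normalize_delimiters(decoded: str) -> str:
    """Rewrite variants such as '< b >' or 'x<c>y' to ' <b> ' / ' <c> '.

    Hypothetical cleanup step; assumes only the two delimiters need repair.
    """
    text = _DELIM.sub(r" <\1> ", decoded)
    # Collapse the extra whitespace introduced by the substitution.
    return re.sub(r"\s+", " ", text).strip()


print(normalize_delimiters("Find < c > LeBron James< b >What"))
```

Already well-formed strings pass through unchanged, so the step can be applied to every prediction before the Program is split into functions.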