berlino / tensor2struct-public

Semantic parsers based on encoder-decoder framework
MIT License
90 stars 23 forks

Fitting a general PCFG for all the DBs #9

Closed awasthiabhijeet closed 2 years ago

awasthiabhijeet commented 2 years ago

Hi @berlino ,

https://github.com/berlino/tensor2struct-public/blob/cbe8785f2cc98a296f09d14d4529a706f36b52ac/experiments/sql2nl/scripts/sample_synthetic_data_spider.py#L44

Do you have any suggestions for modifying this code to fit a general PCFG over all the DBs? I tried simply recording productions for all the trees, but during sampling, I guess the current code has no way to differentiate col-i of one schema from col-i of another schema.

Any suggestions/pointers would be greatly appreciated. Thanks for open-sourcing this code! :)

berlino commented 2 years ago

I was thinking that maybe we could learn the weights of the non-terminal rules on all DBs, and use random weights for the terminal rules when the grammar is used for a particular database. By terminal rules, I mean those that involve columns and tables.
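
A rough sketch of that idea, in case it helps. This is not the code in `sample_synthetic_data_spider.py`. The tree format (`(label, children)` tuples), the `col:`/`tab:` leaf prefixes, and all function names below are assumptions made up for illustration. Schema-specific leaves are collapsed into shared `COL`/`TAB` slots when counting productions, so the rule weights can be pooled across DBs. The slots are then filled with randomly weighted columns and tables of the target DB at sampling time.

```python
import random
from collections import Counter, defaultdict

# Assumed (hypothetical) tree format: internal node = (label, [children]),
# leaf = str. Schema-specific leaves are assumed to be tagged "col:<name>"
# or "tab:<name>"; every other leaf is an ordinary terminal (e.g. a keyword).

def abstract(node):
    """Symbol used for a child on the right-hand side of a production."""
    if isinstance(node, tuple):
        return node[0]
    kind = node.split(":", 1)[0]
    # "col:age" -> "COL", "tab:pets" -> "TAB"; other leaves stay unchanged.
    return kind.upper() if kind in ("col", "tab") else node

def count_rules(tree, counts):
    """Recursively count productions with schema leaves abstracted away."""
    label, children = tree
    counts[label][tuple(abstract(c) for c in children)] += 1
    for child in children:
        if isinstance(child, tuple):
            count_rules(child, counts)

def fit_shared_pcfg(trees):
    """Relative-frequency rule weights pooled over trees from all DBs."""
    counts = defaultdict(Counter)
    for tree in trees:
        count_rules(tree, counts)
    return {
        lhs: {rhs: n / sum(c.values()) for rhs, n in c.items()}
        for lhs, c in counts.items()
    }

def random_terminal_weights(schema, rng):
    """Random normalized weights for one DB, e.g. {"COL": [...], "TAB": [...]}."""
    weights = {}
    for slot, names in schema.items():
        raw = [rng.random() for _ in names]
        total = sum(raw)
        weights[slot] = {name: r / total for name, r in zip(names, raw)}
    return weights

def sample(symbol, pcfg, term_weights, rng):
    """Sample a tree top-down, filling COL/TAB slots from the target DB."""
    if symbol in term_weights:
        names = list(term_weights[symbol])
        probs = [term_weights[symbol][n] for n in names]
        return rng.choices(names, weights=probs)[0]
    if symbol not in pcfg:
        return symbol  # ordinary terminal
    rules = list(pcfg[symbol])
    probs = [pcfg[symbol][r] for r in rules]
    rhs = rng.choices(rules, weights=probs)[0]
    return (symbol, [sample(s, pcfg, term_weights, rng) for s in rhs])

# Toy usage: trees from two different DBs, then sampling for a third one.
trees = [
    ("select", ["SELECT", ("expr", ["col:name"]), "FROM", "tab:singer"]),
    ("select", ["SELECT", ("expr", ["COUNT", "col:id"]), "FROM", "tab:pets"]),
]
rng = random.Random(0)
pcfg = fit_shared_pcfg(trees)
term_weights = random_terminal_weights(
    {"COL": ["title", "year"], "TAB": ["movie"]}, rng
)
print(sample("select", pcfg, term_weights, rng))
```

Because the pooled grammar only ever sees `COL` and `TAB`, col-i of one schema can no longer be confused with col-i of another. The trade-off is that a column is chosen independently of the table, so a real version would likely need to restrict the `COL` choices to columns of the sampled table.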