awslabs / gap-text2sql

GAP-text2SQL: Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training
https://arxiv.org/abs/2012.10309
Apache License 2.0

Generators used for pre-training #13

Open wiskojo opened 3 years ago

wiskojo commented 3 years ago

The paper mentions using SQL-to-Text and Table-to-Text models to generate synthetic samples for pre-training. I would like to use these models to generate synthetic training examples for my own custom datasets. The weights for these models don't seem to have been made public. Is there any way I can train them myself? I saw some code under relogic and pretrainkit that seems relevant, but I couldn't figure out what data it uses or how to run it. Thanks!

PedroEstevesPT commented 3 years ago

I also looked for the pre-training generators but couldn't find them, @wiskojo. GraPPa, which also uses a data augmentation strategy, has not made its code available so far either.

Impavidity commented 3 years ago

For the generator code, you can check out https://github.com/awslabs/gap-text2sql/blob/main/relogic/sql-to-text-train.py and https://github.com/awslabs/gap-text2sql/blob/main/relogic/entity-to-text-train.py, which are the SQL-to-Text generator and the Table-to-Text generator, respectively. I will upload some data samples to help with understanding.

Fheon commented 3 years ago

Can you upload a README about how to run SQL-to-Text? Thanks a lot.