Closed by Hari-Dorbala 1 week ago
The schema filter is not a part of the pre-trained CodeS model. We have released it as an independent component. For further details, please visit the GitHub repository at: https://github.com/RUCKBReasoning/text2sql-schema-filter
For the training label imbalance, please refer to our other paper, RESDSQL (https://arxiv.org/abs/2302.05965), which alleviates the bias by using focal loss.
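To make the focal loss suggestion concrete, here is a minimal pure-Python sketch of the standard binary focal loss applied to per-item relevance predictions (one probability per table or column). It is not the RESDSQL or schema-filter implementation; the function name `binary_focal_loss` and the default `gamma` and `alpha` values are assumptions chosen for illustration.

```python
import math


def binary_focal_loss(probs, labels, gamma=2.0, alpha=0.75, eps=1e-12):
    """Mean binary focal loss over schema items.

    probs  -- predicted probability that each table/column is relevant
    labels -- 1 if the item is used by the gold SQL, else 0
    gamma  -- focusing parameter; larger values down-weight easy examples
    alpha  -- weight for the positive (rare) class; negatives get 1 - alpha
    The default gamma/alpha values are illustrative assumptions.
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if not probs:
        raise ValueError("at least one example is required")

    total = 0.0
    for p, y in zip(probs, labels):
        # p_t is the probability the model assigns to the true class.
        p_t = p if y == 1 else 1.0 - p
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # Clamp to avoid log(0).
        p_t = min(max(p_t, eps), 1.0)
        # The (1 - p_t) ** gamma factor shrinks the loss of well-classified items.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


# Hypothetical example: one relevant column among five candidates.
probs = [0.9, 0.1, 0.2, 0.05, 0.3]
labels = [1, 0, 0, 0, 0]
print(binary_focal_loss(probs, labels))
```

With `gamma=0` and `alpha=0.5` this reduces to half the ordinary binary cross-entropy, so the two parameters are the only knobs to tune. The many easy negatives (irrelevant tables and columns) contribute very little, which is why the loss helps when labels are highly imbalanced.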
Hi. I am trying to understand how to use the schema filtering part. Given the entire schema of a database, how can I get the relevant tables and columns required for a given NL query, and how should I prepare my data to get there? I have seen that RoBERTa is used in schema_filter.py, but I cannot understand how the data needs to be prepared. I have done the labeling, but the data is highly imbalanced; how do I approach this problem? Can you please explain how schema filtering is handled in CodeS?