AlibabaResearch / DAMO-ConvAI

DAMO-ConvAI: The official repository which contains the codebase for Alibaba DAMO Conversational AI.
MIT License
1.1k stars · 178 forks

Regarding the connection between question tokens and database schema items. #58

Closed 22842219 closed 12 months ago

22842219 commented 1 year ago

Hi, good work! The idea of introducing relations between question tokens and schema items is brilliant, I have to say, but I am not quite sure how you established these connections during graph construction. What specific technique did you use to achieve that? I apologize if it is a silly question, but could you kindly explain it in more detail? Thanks.

Best, Zea

22842219 commented 1 year ago

Hi, this is a question about the Graphix-T5 work.

22842219 commented 1 year ago

I am struggling with this because I think this is the job we want the model itself to do (i.e., schema linking?). Could you please clarify this if I have got it wrong? Also, if you could point this part of the work to its code implementation, that would be greatly appreciated. Many thanks!

Best, Zea

22842219 commented 1 year ago

Hi,

It is me again. I know that simple approaches like string matching would work for cases such as the link between "school" and "school id" in the diagram shown in the repository. But I don't think they work well in other cases, for example the connection between "nation" appearing in the question and "country" in the database schema. That's why I am curious about this. Sorry for the repeated messages. Thanks.

Best, Zea
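To make the distinction concrete, here is a minimal sketch of the kind of rule-based matcher being discussed (a hypothetical illustration, not code from this repository): it links tokens that share surface form with a schema item, which is exactly why it catches "school" → "school_id" but misses "nation" → "country".

```python
def rule_based_links(question_tokens, schema_items):
    """Link question tokens to schema items by simple string matching.

    Lowercase both sides, split schema names on underscores, and treat
    a schema item as linked when a question token equals one of its parts.
    """
    links = []
    for tok in question_tokens:
        for item in schema_items:
            parts = item.lower().replace("_", " ").split()
            if tok.lower() in parts:
                links.append((tok, item))
    return links

# "school" matches a part of "school_id", but "nation" never matches
# "country", since the two share no surface form:
print(rule_based_links(["school", "nation"], ["school_id", "country"]))
# → [('school', 'school_id')]
```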

accpatrick commented 1 year ago

@22842219 Thank you for your interest in our work! We employ three approaches for schema linking in our study: 1) Explicit: a rule-based method, as you mentioned (e.g., school --> school_id); 2) Contextual: semantic parsing, which elicits T5 to derive such alignments through semantic equivalence; and 3) Implicit: multi-hop reasoning to extract links, through Graphix training that integrates structural encoding with contextual encoding to maximize information sharing in each layer.

Additionally, I refer you to this paper (https://arxiv.org/abs/2206.14017), which examines various schema linking approaches more closely. The authors investigate explicit schema linking methods (rule-based and semantic mining) and take into account the hierarchical representations of databases.
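One way the explicit and implicit approaches can coexist is a relation-type matrix over (question token, schema item) pairs, which a structural encoder can then attend over. A minimal sketch, with hypothetical label names (not the actual Graphix-T5 code):

```python
def relation_matrix(question_tokens, schema_items, explicit_links):
    """Assign a relation type to every (question token, schema item) pair.

    Pairs found by the explicit rule-based matcher get a dedicated
    relation label; every other pair gets a generic "no-match" label,
    leaving the implicit links to be discovered through multi-hop
    reasoning during training.
    """
    linked = set(explicit_links)
    return [
        ["q-col-match" if (tok, item) in linked else "q-col-nomatch"
         for item in schema_items]
        for tok in question_tokens
    ]

m = relation_matrix(["school", "nation"],
                    ["school_id", "country"],
                    [("school", "school_id")])
# Only the (school, school_id) cell carries the explicit-match label;
# (nation, country) stays generic and must be learned implicitly.
```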

22842219 commented 1 year ago

Hey, thanks a lot for your reply. But the heterogeneous graph is part of your input, right? Isn't the explicit, rule-based (I assume) mapping between informative question tokens and related schema items what contributes those links to the heterogeneous graph? This is the part that is not clear to me. Or did you only establish the obvious links, e.g. between question tokens like 'school' and schema items like 'school id', in the graph that forms part of the Graphix-T5 model input, while the model learns the link between 'nation' and 'country' at the semantic level? Please do correct me if I have got it wrong. Thanks for the paper recommendation; I will have a read. Thanks.

Best, Zea
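The reading described in the question above can be sketched as follows (a hypothetical illustration of such a graph construction, with invented names, not code from this repository): only schema structure and explicit rule-based matches become edges, while semantic pairs are deliberately left out for the model to recover.

```python
def build_graph(question_tokens, columns, explicit_links):
    """Sketch of a heterogeneous input graph for text-to-SQL.

    columns: list of (table, column) pairs describing the schema.
    Only the rule-based question-schema links become edges; semantic
    pairs such as "nation" ~ "country" are absent by design, so the
    model must infer them implicitly via multi-hop message passing.
    """
    nodes = [("question", tok) for tok in question_tokens]
    nodes += sorted({("table", tb) for tb, _ in columns})
    nodes += [("column", col) for _, col in columns]
    # Schema-internal structure: each table connects to its columns.
    edges = [(("table", tb), ("column", col)) for tb, col in columns]
    # Explicit question-to-schema links only.
    edges += [(("question", tok), ("column", col))
              for tok, col in explicit_links]
    return nodes, edges

nodes, edges = build_graph(
    ["school", "nation"],
    [("schools", "school_id"), ("schools", "country")],
    [("school", "school_id")],  # explicit link only; no nation->country edge
)
```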

22842219 commented 12 months ago

After reading the paper you recommended, I think I see what you meant. Thanks a lot.