In the compute_schema_linking function in linking-units, a partial match between the question and a table is tagged TEM, but I think it should be TPM:

`# exact match case
for col_id, col in col_id2list.items():
    if exact_match(n_gram_list, col):
        set_q_relation(q_col_match, i, n, col_id, "CEM")
for tab_id, tab in tab_id2list.items():
    if exact_match(n_gram_list, tab):
        set_q_relation(q_tab_match, i, n, tab_id, "TEM")
# partial match case
for col_id, col in col_id2list.items():
    if partial_match(n_gram_list, col):
        set_q_relation(q_col_match, i, n, col_id, "CPM", force=False)
for tab_id, tab in tab_id2list.items():
    if partial_match(n_gram_list, tab):
        # This looks wrong: the original tag was TEM, but it should be TPM
        # set_q_relation(q_tab_match, i, n, tab_id, "TEM", force=False)
        set_q_relation(q_tab_match, i, n, tab_id, "TPM", force=False)`
My understanding of the exact match case: if the joined n-gram list equals the column name, entries are added to the dict as {i,col_id: CEM}, ..., {i+n,col_id: CEM}.
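To make the role of `force=False` concrete, here is a minimal sketch of what a helper like `set_q_relation` might do. The body is my own guess, not the repository's code: the tuple keys, the default `force=True`, and the assumption that the span covers positions i to i+n-1 are all hypothetical.

```python
def set_q_relation(q_match, i, n, schema_id, tag, force=True):
    """Hypothetical stand-in for the helper quoted in this post.

    Assumes the n-gram covers question positions i .. i+n-1 and that
    force=False means "do not overwrite a tag that is already set".
    """
    for q_id in range(i, i + n):
        key = (q_id, schema_id)
        if force or key not in q_match:
            q_match[key] = tag


q_tab_match = {}
# An exact match on question tokens 0-1 against table 3.
set_q_relation(q_tab_match, 0, 2, 3, "TEM")
# A later partial match on tokens 0-2 against the same table.
set_q_relation(q_tab_match, 0, 3, 3, "TPM", force=False)
```

Under these assumptions, the partial-match call only fills positions that have no tag yet, so exact matches are kept. If the partial-match branch wrote "TEM" instead of "TPM", the two kinds of match could no longer be told apart.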
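Another question: exact_match and partial_match join the n-gram list into a single string. Doesn't that produce a lot of duplication? For example, if the original text is abc, joining its 2-grams gives abbc. Or am I misunderstanding something? I would appreciate any clarification.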
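To pin down what I am asking, here is a small sketch of the two possible readings. It assumes n_gram_list is one sliding window over the question tokens, i.e. `question[i:i + n]`, and that tokens are joined with a space. I have not confirmed either assumption against the repository, and the `n_grams` helper is hypothetical.

```python
def n_grams(tokens, n):
    """Return every window of n consecutive tokens (hypothetical helper)."""
    return [tokens[i:i + n] for i in range(len(tokens) - n + 1)]


question = ["a", "b", "c"]
windows = n_grams(question, 2)

# Reading 1: each window is joined on its own, so nothing is duplicated
# inside any single string that gets compared with a column or table name.
joined_per_window = [" ".join(window) for window in windows]

# Reading 2: all windows are concatenated into one string, which is the
# "abbc" situation described above.
joined_all = "".join("".join(window) for window in windows)
```

If the first reading is the right one, the duplication I was worried about never appears in a compared string, because each n-gram is matched separately.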
One more question: near the end of class Relations(object) in linking_unit, is the code for the case where merge is True written incorrectly?
`for i in range(-qq_max_dist, qq_max_dist + 1):
    self.relation_ids['cc_dist', i] = self.relation_ids['qq_dist', i]
    self.relation_ids['tt_dist', i] = self.relation_ids['tt_dist', i]`
The last statement in this for loop seems completely pointless: it assigns self.relation_ids['tt_dist', i] to itself.
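A small standalone reproduction of the point, using a plain dict in place of `self.relation_ids`. The id values are made up, and the "fix" at the end is only my guess at the intent (mapping tt_dist onto qq_dist, the same way cc_dist is mapped), not something confirmed by the authors.

```python
qq_max_dist = 2
distances = range(-qq_max_dist, qq_max_dist + 1)

# Made-up relation ids: qq_dist gets 0..4, tt_dist gets 100..104.
relation_ids = {}
for idx, i in enumerate(distances):
    relation_ids["qq_dist", i] = idx
    relation_ids["tt_dist", i] = 100 + idx

before = dict(relation_ids)

# The loop as quoted in this post.
for i in distances:
    relation_ids["cc_dist", i] = relation_ids["qq_dist", i]
    relation_ids["tt_dist", i] = relation_ids["tt_dist", i]  # self-assignment

tt_unchanged = all(relation_ids["tt_dist", i] == before["tt_dist", i] for i in distances)

# Guessed intent: tt_dist should share ids with qq_dist as well.
merged = dict(before)
for i in distances:
    merged["cc_dist", i] = merged["qq_dist", i]
    merged["tt_dist", i] = merged["qq_dist", i]
```

With the quoted loop, `tt_unchanged` is true, so the tt_dist ids are never merged with anything. With the guessed version, all three relation types share one set of ids.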