The context-match command relies on the function find_main_entity_column to get the subject column. However, when there is more than one candidate for the subject column, it returns the first one (index 0 is hard-coded), which in many tables is not the entity column. When it is not, as in my example, the filtered data frame is empty, which leads to the error below:
Stack trace:
Command: context-match
Error Message: Traceback (most recent call last):
File "/data/binhvu/table-linker/tl/cli/context-match.py", line 69, in run
obj = TableContextMatches(context_path=context_file_path, context_dict=None, input_path=input_file_path,
File "/data/binhvu/table-linker/tl/features/cell_context_matches.py", line 196, in __init__
self.initialize(input_df, context_dict, label_column)
File "/data/binhvu/table-linker/tl/features/cell_context_matches.py", line 329, in initialize
self.input_df = self.process(row_column_pairs, columns)
File "/data/binhvu/table-linker/tl/features/cell_context_matches.py", line 332, in process
context_scores, properties, similarities = self.compute_context_scores(n_context_columns, row_column_pairs)
File "/data/binhvu/table-linker/tl/features/cell_context_matches.py", line 347, in compute_context_scores
self.compute_property_scores(row_column_pairs, n_context_columns)
File "/data/binhvu/table-linker/tl/features/cell_context_matches.py", line 413, in compute_property_scores
self.write_relevant_properties(most_important_property_df)
UnboundLocalError: local variable 'most_important_property_df' referenced before assignment
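To illustrate the failure mode, here is a minimal sketch of how an empty filtered set can end in an UnboundLocalError. It is not the table-linker code: the function names, the row layout, and the property values are all made up for illustration, and a plain list of dicts stands in for the data frame. The second function shows one possible guard that fails with a clearer message.

```python
def most_important_property_unguarded(rows, subject_column):
    """Hypothetical stand-in for the failing step (not the real implementation)."""
    # Keep only the rows that belong to the assumed subject column.
    filtered = [r for r in rows if r["column"] == subject_column]

    # The name below is bound only inside the loop body, so an empty
    # `filtered` list leaves it unassigned when the function returns.
    for row in sorted(filtered, key=lambda r: r["score"]):
        most_important_property = row  # the last row has the highest score

    return most_important_property


def most_important_property_guarded(rows, subject_column):
    """Same logic, but an empty filter result is reported explicitly."""
    filtered = [r for r in rows if r["column"] == subject_column]
    if not filtered:
        raise ValueError(
            f"no rows found for subject column {subject_column}; "
            "the detected subject column may be wrong"
        )
    return max(filtered, key=lambda r: r["score"])


# Illustrative data: every row belongs to column 1, but column 0 is assumed.
rows = [
    {"column": 1, "property": "P31", "score": 0.9},
    {"column": 1, "property": "P17", "score": 0.4},
]

try:
    most_important_property_unguarded(rows, subject_column=0)
except UnboundLocalError as exc:
    print("unguarded:", type(exc).__name__)

try:
    most_important_property_guarded(rows, subject_column=0)
except ValueError as exc:
    print("guarded:", exc)
```

A guard like this would not fix the subject column detection itself, but it would make the cause visible instead of surfacing as an unrelated-looking UnboundLocalError.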
Test files: test.csv & context.jl.txt