I changed the if statement to use the new remove_duplicates argument of the similarity command.
Homogeneity in the command output: the single-to-single and single-to-multi outputs did not have all the columns present in the multi-to-multi output.
Before:
Columns top_match_target and top_match_source present for multi-to-multi output
Only column top_match_target present for single-to-multi output
Neither column top_match_target nor top_match_source present for single-to-single output
Columns that are present or absent depending on the output type are a pain to handle in automation, for example when running this command over a large batch of text-to-text comparisons where the number of comparisons is not known in advance.
So with this PR, I propose that every output type has the same columns:
After:
Columns top_match_target and top_match_source present for multi-to-multi output
Columns top_match_target and top_match_source present for single-to-multi output
Columns top_match_target and top_match_source present for single-to-single output
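To illustrate why a uniform set of columns helps automation, here is a minimal sketch in plain Python. It is not the code from this PR: the helper with_uniform_columns, the sample rows, and the source, target and score keys are hypothetical, and filling a missing column with None is an assumption made only for the example. Only the column names top_match_target and top_match_source come from the description above.

```python
# Columns that every output type should expose (names taken from this PR).
REQUIRED_COLUMNS = ("top_match_target", "top_match_source")


def with_uniform_columns(rows, required=REQUIRED_COLUMNS):
    """Return copies of the rows in which every required column is present.

    Hypothetical helper: a missing column is filled with None here, which is
    an assumption for illustration, not necessarily what the command does.
    """
    defaults = {column: None for column in required}
    # Values already present in a row override the None defaults.
    return [{**defaults, **row} for row in rows]


# Hypothetical rows: the old single-to-single shape has no top_match_* keys.
single_to_single = [{"source": "a.txt", "target": "b.txt", "score": 0.91}]
multi_to_multi = [
    {
        "source": "a.txt",
        "target": "b.txt",
        "score": 0.91,
        "top_match_target": "b.txt",
        "top_match_source": "a.txt",
    }
]

uniform_single = with_uniform_columns(single_to_single)
uniform_multi = with_uniform_columns(multi_to_multi)

# Downstream code can now read both columns without checking the output type.
for row in uniform_single + uniform_multi:
    print(row["top_match_target"], row["top_match_source"])
```

With the same columns in every output, a script that consumes the results no longer needs a separate branch per output type.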
Added an option in the similarity view to choose whether or not duplicates are removed from the data.
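As a rough sketch of how such an option could behave, the snippet below filters repeated rows behind a boolean flag. Only the name remove_duplicates comes from this PR. The function filter_rows, the sample data, and the assumption that a duplicate means a row identical in every column are all hypothetical; the actual rule used by the similarity command may differ.

```python
def filter_rows(rows, remove_duplicates=True):
    """Return the rows, optionally dropping exact repeats.

    Hypothetical helper: a duplicate is assumed to be a row whose columns
    and values all match an earlier row. The first occurrence is kept and
    the original order is preserved.
    """
    if not remove_duplicates:
        return list(rows)

    seen = set()
    unique = []
    for row in rows:
        # A sorted tuple of items gives a hashable, order-independent key.
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical sample data with one repeated row.
sample = [
    {"source": "a.txt", "target": "b.txt"},
    {"source": "a.txt", "target": "b.txt"},
    {"source": "a.txt", "target": "c.txt"},
]

deduplicated = filter_rows(sample, remove_duplicates=True)
untouched = filter_rows(sample, remove_duplicates=False)
```

Keeping the flag as an explicit argument leaves the choice to the caller, which matches the "removing or not" wording above.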
Here is a PR for issue #5.