We plan to allow the user to do some forms of exploratory analysis without needing to create a linker (like `profile_columns` and various types of blocking analysis, e.g. #2136).
But this means that `__splink__df_concat` needs to be computed without the linker.
At the moment, this requires a lot of code that's confusing to read and will become repetitive once more functions need it (see the `profile_data.py` link below).
This can be addressed by removing the need for a linker to compute `__splink__df_concat`, giving us reusable code that can be used for `profile_columns`, blocking analysis, etc.
We want `vertically_concatenate_sql` to be modified so it doesn't take a linker as an argument, but functions like `vertically_concatenate.compute_df_concat` should still take the linker as an argument.
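
To illustrate the split, here is a rough sketch rather than Splink's actual implementation. `InputTable`, `FakeLinker`, `compute_df_concat_sql`, and the parameter names are hypothetical stand-ins, and the generated SQL is simplified. The idea is that the SQL builder takes only plain inputs, while the linker-facing function unpacks the linker and delegates.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InputTable:
    """Hypothetical stand-in for an input dataframe registered with the backend."""

    physical_name: str
    columns: List[str]


def vertically_concatenate_sql(
    input_tables: List[InputTable],
    source_dataset_column_name: str = "source_dataset",
) -> str:
    """Build SQL for __splink__df_concat from plain inputs (no linker needed)."""
    if not input_tables:
        raise ValueError("At least one input table is required")

    # Use the first table's column order so every SELECT in the UNION ALL lines up.
    select_cols = ", ".join(f'"{c}"' for c in input_tables[0].columns)

    if len(input_tables) == 1:
        return f"select {select_cols} from {input_tables[0].physical_name}"

    selects = [
        f"select '{t.physical_name}' as {source_dataset_column_name}, {select_cols} "
        f"from {t.physical_name}"
        for t in input_tables
    ]
    return "\nunion all\n".join(selects)


class FakeLinker:
    """Hypothetical minimal linker holding only what the SQL builder needs."""

    def __init__(self, input_tables: List[InputTable]):
        self.input_tables = input_tables


def compute_df_concat_sql(linker: FakeLinker) -> str:
    # Linker-facing wrapper: unpack the linker, then delegate to the
    # linker-free function so both code paths share one implementation.
    return vertically_concatenate_sql(linker.input_tables)
```

With this shape, something like `profile_columns` could call `vertically_concatenate_sql` directly on its input tables, while existing linker code keeps its current call signature.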
The code referred to above:
https://github.com/moj-analytical-services/splink/blob/66ec54f0f114cf3eda20ea7fe9e05ccfff2c584c/splink/profile_data.py#L237-L267