Add features to facilitate linked clusters based on multiple models (i.e. dedupe models + link-only model)

I have struggled to find a way to display intra-dataset edges and inter-dataset edges at the same time. The same problem applies to nodes, which carry different information for deduping than for linking.

Even with a workaround that shows and filters all edges within a cluster, the waterfall charts for those edges depend on N+1 Splink models for N datasets (N dedupe models plus 1 link model). Supporting this would require a new feature in this repo that accepts multiple models per cluster.

On an edge-by-edge basis, the logic would be:

models = {"dataset1": model1, "dataset2": model2, "link": model3}
if source_dataset_l == source_dataset_r -> use models[source_dataset_l]
if source_dataset_l != source_dataset_r -> use models["link"]
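
The per-edge rule above can be sketched in Python roughly as follows. This is only an illustration of the selection logic, not an existing API in this repo: the `select_model` helper and the `LINK_KEY` constant are hypothetical names, edges are assumed to be plain dicts carrying `source_dataset_l` and `source_dataset_r`, and the model values are placeholder strings standing in for real Splink model settings.

```python
from typing import Any, Mapping

# Hypothetical key under which the link-only model is stored.
LINK_KEY = "link"


def select_model(edge: Mapping[str, Any], models: Mapping[str, Any]) -> Any:
    """Pick the model that scored this edge.

    Edges within one dataset use that dataset's dedupe model;
    edges across datasets use the link-only model.
    """
    left = edge["source_dataset_l"]
    right = edge["source_dataset_r"]
    if left == right:
        return models[left]
    return models[LINK_KEY]


# Placeholder strings stand in for real model objects or settings dicts.
models = {"dataset1": "model1", "dataset2": "model2", LINK_KEY: "model3"}

edges = [
    {"source_dataset_l": "dataset1", "source_dataset_r": "dataset1"},
    {"source_dataset_l": "dataset2", "source_dataset_r": "dataset2"},
    {"source_dataset_l": "dataset1", "source_dataset_r": "dataset2"},
]

chosen = [select_model(edge, models) for edge in edges]
```

A waterfall chart for each edge could then be built from whichever model `select_model` returns, so a single cluster view can mix dedupe and link edges.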

moj-analytical-services / splink_cluster_studio
