moj-analytical-services / splink

Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends
https://moj-analytical-services.github.io/splink/
MIT License

Splink4: database_api contains imports that are not installed by default (e.g. pyspark) #1932

Closed RobinL closed 1 month ago

RobinL commented 7 months ago

https://github.com/moj-analytical-services/splink/blob/4271722acd153d792f4ef9b9f68e09c537516959/splink/database_api.py#L14

ADBond commented 7 months ago

I think it would probably be good to break this module up eventually anyhow, maybe into something like:

database_api/
├─ __init__.py       # imports DuckDBAPI, SparkAPI etc
├─ database_api.py   # core definition
├─ duckdb_api.py     # DuckDBAPI
...

but probably want to think a bit about the whole module structure to see what will make the most sense

RobinL commented 1 month ago

Has been fixed in splink4