This PR updates the backend (loading to a database, creating source and target URIs, etc.) for geomancer. This is an attempt to make geomancer warehouse-agnostic, so that it's easy to switch between data warehouses (SQLite for testing, BigQuery and others for prod).
Motivation
I want to easily switch between different data warehouses, and there should be a common API for doing that.
Notable changes
Renamed some modules: common is now backend
A new base class Engine for creating database connectors
A new class BigQueryEngine(Engine) that interacts with BigQuery
The cast() method now simply calls backend.connect() to obtain the source, target, and engine SQLAlchemy primitives (a rough sketch of this contract is shown below)
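For context, the contract described above might be sketched roughly like this. The method names follow the list above, but the exact signatures, and whether connect() lives on the class or on the backend module, are assumptions for illustration rather than the actual geomancer code:

class Engine:
    """Base class for warehouse connectors (illustrative sketch only)."""

    def load(self, df):
        # Load a pandas.DataFrame into the warehouse and return the URI of
        # the resulting table. Subclasses must override this.
        raise NotImplementedError

    def connect(self):
        # Return the source, target, and engine SQLAlchemy primitives that
        # cast() consumes. How these are built is warehouse-specific.
        raise NotImplementedError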
Sample Usage
If I want to create a new engine (say, for SQLite):
from .base import Engine


class SQLiteEngine(Engine):
    def __init__(self, db_path):
        self.db_path = db_path

    def load(self, df):
        # Implement this method to load the pandas.DataFrame into db.sqlite
        # and return the URI of the resulting table. This is a required
        # method (the base class raises NotImplementedError if not overridden).
        table_uri = ...  # construct the URI of the loaded table here
        return table_uri

    def _my_helper_function(self):
        pass
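For illustration, using such an engine could look like the following. The db_path value, the CSV filename, and the assumption that load() returns a table URI all come from the hypothetical sketch above, not from the real geomancer API:

import pandas as pd

# Hypothetical usage of the SQLiteEngine sketched above.
engine = SQLiteEngine(db_path="db.sqlite")
df = pd.read_csv("points_of_interest.csv")  # any DataFrame you want to load
table_uri = engine.load(df)  # returns the URI of the table created in db.sqlite
print(table_uri)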
For a concrete reference, you can check the implementation of BigQueryEngine.
Note: I've tested this with our basic working example of loading CSVs, etc.