anomaly / gallagher

The missing toolkit for extending Gallagher Command Centre, featuring a CLI, SQL interface, TUI and a Python idiomatic SDK
https://anomaly.github.io/gallagher/
MIT License
11 stars 2 forks source link

Provide a SQL (preferably a SQLAlchemy `dialect`) interface to query the REST API #31

Open devraj opened 5 months ago

devraj commented 5 months ago

As our ambition around the library grows, the sync feature #9 looks to be a lot more useful than I previously imagined. One of my thoughts was to treat this as if they were two databases sources and use a layer like SQLAlchemy to keep them in sync.

Sync is a difficult problem (see also SQL Sync for Schema with SQLAlchemy python-sync-db) at the best of times. My initial research lead me down the path of writing a SQLAlchemy dialect primarily thinking about if it's possible to wrap a REST service as a SQL source.

I found a number of articles and discussions which leads to believe this is a possible way forward:

Digging through the list of external dialects I found betodealmeida / gsheets-db-api which provides that we can converse with a REST API via a SQLAlchemy dialect. This project links to Shillelagh which is a library is an implementation of the Python DB API 2.0 based on SQLite (using the APSW library)

Other resources:

The requirement is thus to research the above resources and outline the possibility of using a SQLALchemy dialect to write the sync module with the view of being able to use the REST client to communicate with the Gallagher proxy.

Why SQL?

One of the major questions around this is Why SQL?, we could head down the route of using an object database. The question to consider is if the end user / customer would benefit from the data being available in an object database?

With a foundation of writing to SQL backend we also stand the advantage of writing to / reading from corporate database backends.

devraj commented 5 months ago

From my initial reading, the way we should approach this is to write a customer adapter for shillelagh which also allows for a custom sqlalchemy dialect.

The references the following

Sharks have been around for a long time. They’re older than trees and the rings of Saturn, actually! The reason they haven’t changed that much in hundreds of millions of years is because they’re really good at what they do.

SQL has been around for some 50 years for the same reason: it’s really good at what it does.

as their philosophy for Why SQL.

betodealmeida commented 4 months ago

This is awesome! Thanks for the heads up, and let me know if you have any questions, I'm happy to help.

devraj commented 4 months ago

Quick note for anyone else working with shillelagh and attempting to debug their adapter, my general strategy for development is:

I figured it would be easier to this before I get into building a SQLAlchemy dialect #9

At first attempt it all looked well and when I ran:

🍀> SELECT * FROM "https://commandcentre-api-au.security.gallagher.cloud/api/cardholders";
Unsupported table: https://commandcentre-api-au.security.gallagher.cloud/api/cardholders

I got the unsupported table error and could not figure out what was wrong. For obvious reasons the console suppresses log messages, I decided to enable logging and set it to DEBUG, this was added to the top of the file containing my adapter.

import logging
logging.basicConfig(level=logging.DEBUG, format='%(asctime)s %(levelname)s %(message)s')

upon restarting the console, I saw the following, and found that my adapter wasn't registering properly:

🍀>                                                                                                                               
GoodBye!
(shillelagh) ➜  ~ shillelagh 
2024-05-15 07:51:36,627 ERROR test
2024-05-15 07:51:36,628 DEBUG No module named 'gallagher.ext.shillelagh.GallagherCommandCentreAPI'
2024-05-15 07:51:36,641 DEBUG Using selector: KqueueSelector

Upon inspecting my pyproject.toml I found that I had a syntax error where I was referring to the module as gallagher.ext.shillelagh.GallagherCommandCentreAPI instead of gallagher.ext.shillelagh:GallagherCommandCentreAPI, the fixed version is:

[tool.poetry.plugins."shillelagh.adapter"]
gacc = "gallagher.ext.shillelagh:GallagherCommandCentreAPI"

And given that it returns the required rows of data, the logs should make it easier to debug the adapter.

TypeError: missing a required argument: 'api_key'
(shillelagh) ➜  ~ shillelagh 
2024-05-15 07:55:03,649 ERROR test
2024-05-15 07:55:03,666 DEBUG Using selector: KqueueSelector
🍀> SELECT * FROM "https://commandcentre-api-au.security.gallagher.cloud/api/cardholders";
2024-05-15 07:55:09,149 DEBUG Instantiating adapter with deserialized arguments: ['https://commandcentre-api-au.security.gallagher.cloud/api/cardholders', 'api_key']
authorised    first_name      id  last_name
------------  ------------  ----  -----------
True          Dev              1  Mukherjee
(1 row in 0.05s)

2024-05-15 07:55:09,150 DEBUG Using selector: KqueueSelector

Make sure that this isn't enabled when the adapter ships, even better find a way to dynamically configure the logger

devraj commented 3 months ago

utils.py in the dto package currently uses the py3.9 __annotation__ property to access the annotations:

    @classmethod
    def _accumulated_annotations(cls) -> dict:
        """Return a dictionary of all annotations

        This method is used to return a dictionary of all the
        annotations from itself and it's parent classes.

        It is intended for use by the shillelagh extension
        """
        annotations = cls.__annotations__.copy() # TODO: should we  make copies?
        for base in cls.__bases__:
            if issubclass(base, BaseModel):
                # Copy annotations from classes that are pydantic models
                # they are the only things that form part of the response
                annotations.update(base.__annotations__)
        return annotations.items()

This recommendation has changes in py3.10 see this article.

Looking at the inspect package get_annotations returns the annotations of the class itself, where as typing.get_type_hints returns typing definitions from itself and parent classes.

>>> import typing
>>> typing.get_type_honest(CardholderSummary)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'typing' has no attribute 'get_type_honest'. Did you mean: 'get_type_hints'?
>>> typing.get_type_hints(CardholderSummary)
{'href': <class 'pydantic_core._pydantic_core.Url'>, 'id': <class 'str'>, '_good_known_since': typing.Optional[datetime.datetime], 'first_name': <class 'str'>, 'last_name': <class 'str'>, 'short_name': typing.Optional[str], 'description': typing.Optional[str], 'authorised': <class 'bool'>}

We should refactor this method to use a py3.10+ solution.