Open gzt5142 opened 1 year ago
I'm using a pydantic `dataclass` to represent a single ingested feature -- a row of the feature table.
The sqlalchemy mechanism to bind this dataclass to the relevant table is modeled here: https://github.com/gzt5142/nldi-crawler-py/blob/7e0aa4c6230c983a4931561ab6322c2d68ece527/src/nldi_crawler/ingestor.py#L59
This method relies on the implicit contract that the SQL data access mechanism will connect to an appropriate database with a 'nldi_data' schema.
Define a common `Feature` model for use across the project (crawler client, server, other tools). It can be a 'vanilla' `dataclass`, a pydantic model, or an ORM model. All sides of the workflow should agree on what a `Feature` looks like and how it is structured.
The Python port of the crawler defines a SQLAlchemy ORM model (see https://github.com/gzt5142/nldi-crawler-py/blob/b87418874cb90cbf32ce8ea25bdbbddcef19c355/src/nldi_crawler/ingestor.py#L62).
We should review this as a team to make sure that it matches the table schema exactly, and that it contains all of the features we will need going forward.
A common and importable model will require a minor refactor of the existing crawler port.
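To make the discussion concrete, here is a minimal sketch of what a shared, importable model could look like in the 'vanilla' dataclass form. It is not the model from the linked `ingestor.py`: the field names (`crawler_source_id`, `identifier`, `name`, `uri`) and the `as_row` helper are placeholders for illustration, and the real field list would have to come from the team review of the table schema. The only value taken from this issue is the `nldi_data` schema name.

```python
from dataclasses import asdict, dataclass
from typing import Optional

# Schema name as stated in this issue; making it an importable constant
# turns the implied contract into an explicit one.
SCHEMA = "nldi_data"


@dataclass(frozen=True)
class Feature:
    """One ingested feature, i.e. one row of the feature table.

    All field names here are hypothetical placeholders, not the
    confirmed columns of the real table.
    """

    crawler_source_id: int
    identifier: str
    name: Optional[str] = None
    uri: Optional[str] = None

    def as_row(self) -> dict:
        """Return a plain dict that any data access layer could consume."""
        return asdict(self)


# Usage: crawler, server, and other tools would import the same class.
feature = Feature(crawler_source_id=1, identifier="abc-123", name="Example site")
row = feature.as_row()
```

A frozen dataclass has no third-party dependencies, so every tool can import it cheaply. A pydantic model or the SQLAlchemy ORM class could wrap or replace it later if validation or table binding needs to live in the same place.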