internetofwater / nldi-services

Network Linked Data Index Navigation Web Services
https://waterdata.usgs.gov/blog/nldi-intro/
Creative Commons Zero v1.0 Universal

Define NLDI Feature model #386

Open gzt5142 opened 1 year ago

gzt5142 commented 1 year ago

Define a common Feature model for use across the project (crawler client, server, other tools). It can be a plain ('vanilla') dataclass, a pydantic model, or an ORM model. All sides of the workflow should agree on what a Feature looks like and how it is structured.
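To make the 'vanilla' dataclass option concrete, here is a minimal sketch of what a shared, importable Feature could look like. The field names (`identifier`, `crawler_source_id`, `name`, `uri`, `comid`) and the sample values are illustrative assumptions, not the agreed table schema; the real field list is exactly what this issue asks the team to review.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class Feature:
    """One ingested feature. Field names here are hypothetical placeholders."""

    identifier: str
    crawler_source_id: int
    name: Optional[str] = None
    uri: Optional[str] = None
    comid: Optional[int] = None

    def __post_init__(self) -> None:
        # Minimal validation so that every consumer rejects the same bad input.
        if not self.identifier:
            raise ValueError("identifier must be a non-empty string")


# Hypothetical sample values, for illustration only.
feature = Feature(identifier="USGS-01013500", crawler_source_id=1, name="Example gage")

# A plain dict form is convenient for serialization or for building an insert.
row = asdict(feature)
```

Keeping the model free of database imports would let the crawler, the server, and other tools import it without pulling in an ORM; a pydantic or ORM-backed variant could expose the same fields.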

The Python port of the crawler defines a SQLAlchemy ORM model (see https://github.com/gzt5142/nldi-crawler-py/blob/b87418874cb90cbf32ce8ea25bdbbddcef19c355/src/nldi_crawler/ingestor.py#L62).

We should review this as a team to make sure that it matches the table schema exactly, and that it contains all of the features we will need going forward.

A common and importable model will require a minor refactor of the existing crawler port.

gzt5142 commented 1 year ago

I'm using a pydantic dataclass to represent a single ingested feature -- a row of the feature table.

See https://github.com/gzt5142/nldi-crawler-py/blob/7e0aa4c6230c983a4931561ab6322c2d68ece527/src/nldi_crawler/feature.py#L14

The SQLAlchemy mechanism that binds this dataclass to the relevant table is modeled here: https://github.com/gzt5142/nldi-crawler-py/blob/7e0aa4c6230c983a4931561ab6322c2d68ece527/src/nldi_crawler/ingestor.py#L59

This approach relies on the implied contract that the SQL data access mechanism connects to an appropriate database containing an 'nldi_data' schema.
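One way to turn that implied contract into an explicit check is to verify the schema before any ingest starts. The sketch below is not the crawler's code: it uses the standard-library `sqlite3` module as a stand-in for the real database, with an attached database playing the role of the 'nldi_data' schema. The helper names `has_schema` and `require_schema`, and the two-column `feature` table, are hypothetical. Against PostgreSQL the same check would more likely query `information_schema.schemata`.

```python
import sqlite3

REQUIRED_SCHEMA = "nldi_data"


def has_schema(conn: sqlite3.Connection, schema: str) -> bool:
    """Return True if a database/schema with this name is visible on the connection."""
    # PRAGMA database_list yields (seq, name, file) for each attached database.
    names = [row[1] for row in conn.execute("PRAGMA database_list")]
    return schema in names


def require_schema(conn: sqlite3.Connection, schema: str = REQUIRED_SCHEMA) -> None:
    """Fail fast with a clear message instead of a late 'no such table' error."""
    if not has_schema(conn, schema):
        raise RuntimeError(f"connected database has no '{schema}' schema")


conn = sqlite3.connect(":memory:")
missing_before = not has_schema(conn, REQUIRED_SCHEMA)

# Stand-in for connecting to a database that already provides the schema.
conn.execute("ATTACH DATABASE ':memory:' AS nldi_data")
require_schema(conn)

# Hypothetical, simplified table: the real column list is still to be agreed.
conn.execute("CREATE TABLE nldi_data.feature (identifier TEXT, crawler_source_id INTEGER)")
conn.execute("INSERT INTO nldi_data.feature VALUES (?, ?)", ("USGS-01013500", 1))
count = conn.execute("SELECT COUNT(*) FROM nldi_data.feature").fetchone()[0]
```

A check like this could run once at startup, so a misconfigured connection is reported before any rows are written.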