gzt5142 / nldi-crawler-py

Network Linked Data Index Crawler
https://labs.waterdata.usgs.gov/about-nldi/

SQL object mapping #3

Closed gzt5142 closed 1 year ago

gzt5142 commented 1 year ago

Moved to SQLAlchemy v2.0, because its object mapper is much better and easier to configure. The ORM can auto-reflect the schema of an existing table into a Python object (where columns in the table map to properties of the object) with minimal fuss.
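
For illustration, a minimal sketch of that kind of reflection with the SQLAlchemy 2.x API; the connection URL here is a placeholder, not the crawler's actual configuration:

```python
from sqlalchemy import MetaData, Table, create_engine, select

# Placeholder connection URL -- not the project's real settings.
engine = create_engine("postgresql+psycopg2://user:secret@localhost/nldi")

metadata = MetaData()

# autoload_with inspects the live database and builds the column
# definitions for us; no hand-written schema required.
crawler_sources = Table("crawler_sources", metadata, autoload_with=engine)

with engine.connect() as conn:
    for row in conn.execute(select(crawler_sources)):
        print(row._mapping)
```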

SQLAlchemy v2 is very new and still effectively in beta. That's a bit of a concern for building enterprise software, but I think it's the way of the future, and it is certainly easier to work with than v1.x.

gzt5142 commented 1 year ago

Successfully used the ORM to ingest and process the crawler_sources table.

Still trying to figure out the best way to reflect the other tables in nldi-db.
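
One candidate approach (a sketch, not necessarily what the crawler will end up doing) is SQLAlchemy's automap extension, which reflects all tables at once and generates a mapped class per table. Note that automap only maps tables that have a primary key:

```python
from sqlalchemy import create_engine
from sqlalchemy.ext.automap import automap_base

engine = create_engine("postgresql+psycopg2://user:secret@localhost/nldi")  # placeholder URL

Base = automap_base()
Base.prepare(autoload_with=engine)  # reflect all tables and generate classes

# Tables become attributes of Base.classes, keyed by table name.
# Assumes a 'crawler_sources' table with a primary key.
CrawlerSource = Base.classes.crawler_sources
```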

gzt5142 commented 1 year ago

While it is fairly easy to automatically reflect existing tables into Python objects, doing so at run-time poses some challenges. We are hard-coding some column names on the assumption that they won't change, and we can't reflect an existing table until a valid database connection is established (with authentication, etc.).

Refactoring to a "declarative" object mapping, which will simplify the code a fair bit and eliminate some of the runtime problems mentioned above.
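
A sketch of the declarative style, with column names and types assumed for illustration (they are not copied from nldi-db's actual DDL):

```python
from sqlalchemy import Integer, String
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

class Base(DeclarativeBase):
    pass

class CrawlerSource(Base):
    """Declarative mapping: the schema lives in code, so no live
    database connection is needed just to define the class."""
    __tablename__ = "crawler_sources"

    crawler_source_id: Mapped[int] = mapped_column(Integer, primary_key=True)
    source_name: Mapped[str] = mapped_column(String)
    source_uri: Mapped[str] = mapped_column(String)
```

The trade-off is that the declared columns must be kept in sync with the database by hand, but nothing needs a connection at import time.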

gzt5142 commented 1 year ago

SQLAlchemy does not, by itself, handle spatial data types, so we have to add GeoAlchemy2.

This is necessary so that we can create tables with arbitrary names in the database rather than merely reflecting existing tables. Ingested features go into a table specific to the data source, and as new data sources are added, the tables supporting them need to be created correctly.
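
A hedged sketch of what that could look like with GeoAlchemy2; the table layout, column names, and SRID here are illustrative assumptions, not the crawler's actual feature schema:

```python
from geoalchemy2 import Geometry
from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine

metadata = MetaData()

def feature_table(source_suffix: str) -> Table:
    # Build a feature table named for one crawler source,
    # e.g. feature_wqp (hypothetical suffix).
    return Table(
        f"feature_{source_suffix}",
        metadata,
        Column("identifier", String, primary_key=True),
        Column("comid", Integer),
        Column("location", Geometry(geometry_type="POINT", srid=4326)),
    )

engine = create_engine("postgresql+psycopg2://user:secret@localhost/nldi")  # placeholder
# Emits CREATE TABLE with a PostGIS geometry column (skipped if it already exists).
feature_table("wqp").create(engine, checkfirst=True)
```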