oceanprotocol / pdr-backend

Instructions & code to run predictoors, traders, more.
Apache License 2.0

[Lake][ETL] Insert GQL data into PersistentStore #682

Closed. idiom-bytes closed this issue 8 months ago

idiom-bytes commented 9 months ago

Motivation

Update gql_data_factory to integrate PersistentStore and push data to raw tables.

Persist SLA: only new records should be fetched and inserted into the db.

  1. get last timestamp from db.table
  2. fetch subgraph from_last_timestamp to_now
  3. insert subgraph new_records into db.table
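
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the pdr-backend implementation: it uses the standard-library `sqlite3` module as a stand-in for PersistentStore/DuckDB, and `get_last_timestamp`, `sync_new_records`, `fake_fetch`, and the `raw_predictions` table are hypothetical names chosen for the example. The subgraph query is replaced by an in-memory list.

```python
import sqlite3
from typing import Callable, Dict, List

Record = Dict[str, object]

# Hypothetical stand-in for subgraph data; a real fetch would query GQL.
FAKE_SUBGRAPH: List[Record] = [
    {"id": f"p{i}", "timestamp": 100 * i} for i in range(1, 6)
]


def fake_fetch(from_ts: int, to_ts: int) -> List[Record]:
    """Return records with from_ts < timestamp <= to_ts."""
    return [r for r in FAKE_SUBGRAPH if from_ts < r["timestamp"] <= to_ts]


def get_last_timestamp(conn: sqlite3.Connection, table: str, default: int = 0) -> int:
    """Step 1: newest timestamp already stored, or `default` if the table is empty."""
    row = conn.execute(f"SELECT MAX(timestamp) FROM {table}").fetchone()
    return row[0] if row[0] is not None else default


def sync_new_records(
    conn: sqlite3.Connection,
    table: str,
    fetch_fn: Callable[[int, int], List[Record]],
    now: int,
) -> int:
    """Run steps 1-3 once and return the number of rows inserted."""
    last_ts = get_last_timestamp(conn, table)      # step 1
    new_records = fetch_fn(last_ts, now)           # step 2
    conn.executemany(                              # step 3
        f"INSERT INTO {table} (id, timestamp) VALUES (?, ?)",
        [(r["id"], r["timestamp"]) for r in new_records],
    )
    conn.commit()
    return len(new_records)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_predictions (id TEXT PRIMARY KEY, timestamp INTEGER)")

first = sync_new_records(conn, "raw_predictions", fake_fetch, now=300)
second = sync_new_records(conn, "raw_predictions", fake_fetch, now=500)
third = sync_new_records(conn, "raw_predictions", fake_fetch, now=500)
total = conn.execute("SELECT COUNT(*) FROM raw_predictions").fetchone()[0]
```

The sketch treats the lower bound as exclusive, so re-running the sync with no new data inserts nothing. One trade-off of that choice is that records sharing the boundary timestamp but arriving later would be skipped; whether that matters depends on how the subgraph assigns timestamps.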

DoD:

kdetry commented 8 months ago

Although Pydantic is a great tool, it's not always necessary since we already have Polars data schemas that validate the data structure. Using Pydantic now would result in double validation.

idiom-bytes commented 8 months ago

The whole ETL pipeline is now implemented using CSV + DuckDB.

It has been verified and merged into PR #685, a large epic that includes many other updates to improve Lake, ETL, and related components: https://github.com/oceanprotocol/pdr-backend/issues/685