dataverbinders / statline-bq

Library to fetch CBS open datasets into parquet and optionally load into Google Cloud Storage and BigQuery
MIT License

Stream parquet write #60

Closed galamit86 closed 3 years ago

galamit86 commented 3 years ago

This PR:

Conceptually:

Technically:

Not implemented

  1. v4 support for get_schema_cbs(): requesting the metadata URL returns a 406 error, which should be examined (issue #59 opened). The current workaround is to use the schema from the first page; since pages are 100k each and a long format means fewer columns, the chance of a schema error is much lower.
  2. Full translation of OData types to pyarrow types (issue #61 opened).
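
To make item 2 more concrete, here is a rough sketch of what a partial OData-to-pyarrow translation could look like. It is not the code in this PR: the mapping table, the helper names `odata_to_arrow_alias` and `schema_aliases`, and the example column names are all hypothetical. It maps OData `Edm.*` type names to pyarrow type alias strings and assumes that falling back to `string` for anything unmapped is acceptable.

```python
# Illustrative, non-exhaustive mapping from OData Edm type names to
# pyarrow type aliases. This table is an assumption, not the library's own.
EDM_TO_ARROW_ALIAS = {
    "Edm.String": "string",
    "Edm.Boolean": "bool",
    "Edm.Byte": "uint8",
    "Edm.Int16": "int16",
    "Edm.Int32": "int32",
    "Edm.Int64": "int64",
    "Edm.Single": "float32",
    "Edm.Double": "float64",
    "Edm.Date": "date32",
}


def odata_to_arrow_alias(edm_type, default="string"):
    """Return a pyarrow type alias for an OData type name (hypothetical helper).

    Unmapped types fall back to `default`, so the value is kept as text
    instead of failing the whole write.
    """
    return EDM_TO_ARROW_ALIAS.get(edm_type.strip(), default)


def schema_aliases(columns):
    """Translate {column name: Edm type} into (name, alias) pairs, keeping order."""
    return [(name, odata_to_arrow_alias(edm)) for name, edm in columns.items()]


# Hypothetical column metadata, only for illustration.
example = schema_aliases(
    {"Id": "Edm.Int64", "Measure": "Edm.String", "Value": "Edm.Double", "Extra": "Edm.Guid"}
)
```

The alias strings should be resolvable to real types with `pyarrow.type_for_alias`, which keeps the mapping itself free of a pyarrow import. The string fallback is lossy, so a full translation (issue #61) would presumably replace it with explicit handling for each remaining type.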