dataverbinders/statline-bq
Library to fetch CBS open datasets into parquet and optionally load into Google Cloud Storage and BigQuery
MIT License
0 stars, 0 forks
issues
Add option to force v3 upload even if v4 exists
#84
galamit86
opened
2 years ago
0
new dataset upload bug fix
#83
galamit86
closed
2 years ago
1
Uploading new dataset (no version on GCP yet) fails
#82
galamit86
opened
3 years ago
1
missing values in 84286NED
#81
galamit86
opened
3 years ago
0
create col_desc json for v3 only
#80
galamit86
closed
3 years ago
0
Exclude ID field from test
#79
galamit86
closed
3 years ago
0
Load files during testing
#78
galamit86
closed
3 years ago
1
unnecessary commit
#77
galamit86
closed
3 years ago
0
Issue 72 testing
#76
galamit86
closed
3 years ago
1
Issue 72 implement testing
#75
galamit86
closed
3 years ago
0
Issue 73 checkpoints
#74
galamit86
closed
3 years ago
1
Add default option to only save parquet files locally, or just load to GCS
#73
dkapitan
closed
3 years ago
6
Implement basic unit testing
#72
dkapitan
closed
3 years ago
15
Logging
#71
galamit86
closed
3 years ago
4
Fix data properties
#70
galamit86
closed
3 years ago
3
Box config
#69
galamit86
closed
3 years ago
1
implement basic logging
#68
galamit86
closed
3 years ago
1
Add standard logging
#67
galamit86
closed
3 years ago
7
Add cli parameters
#66
galamit86
closed
3 years ago
0
fix skip dataset bug
#65
galamit86
closed
3 years ago
0
Value skewed during OData to PyArrow conversion
#64
galamit86
opened
3 years ago
0
skip conversion to parquet if url was empty
#63
galamit86
closed
3 years ago
0
Remove temp dir
#62
galamit86
closed
3 years ago
0
Complete mapping from OData types to pyarrow types
#61
galamit86
opened
3 years ago
0
Stream parquet write
#60
galamit86
closed
3 years ago
0
v4 support for `get_schema_cbs`
#59
galamit86
closed
2 years ago
1
Add prod envs
#58
galamit86
closed
3 years ago
0
remove "." from parquet field names
#57
galamit86
closed
3 years ago
0
update to pyarrow v3.0.0
#56
galamit86
closed
3 years ago
0
v4 parallelizes over multiplication of 100 records
#55
galamit86
closed
3 years ago
0
fix wrong dict key in get_main_table_shape
#54
galamit86
closed
3 years ago
0
dataset 70072ned is not uploaded properly
#53
galamit86
closed
3 years ago
1
remove auto capitalization
#52
galamit86
closed
3 years ago
0
remove auto capitalization
#51
galamit86
closed
3 years ago
0
use dask builtin methods to create parquet files
#50
galamit86
closed
3 years ago
1
Parallel fetch
#49
galamit86
closed
3 years ago
1
Refactor (and rename) `tables_to_parquet`
#48
galamit86
opened
3 years ago
0
add credentials parameter to all gcp-related utils
#47
galamit86
closed
3 years ago
0
Get files from GCS in utils.gcs_to_gbq
#46
galamit86
opened
3 years ago
0
Add cli parameters
#45
galamit86
closed
3 years ago
0
add click.option for a single dataset_id processing
#44
galamit86
closed
3 years ago
0
add odata_version handling for iv3
#43
galamit86
closed
3 years ago
0
Remove temp folders when done
#42
galamit86
closed
3 years ago
0
Lazy loading nl open data
#41
galamit86
closed
3 years ago
0
NL open data deployment
#40
galamit86
closed
3 years ago
0
Remove redundant functions from utils
#39
galamit86
opened
3 years ago
0
Lazy processing of datasets
#38
galamit86
closed
3 years ago
0
use dict.get to check dict values
#37
galamit86
closed
3 years ago
0
bug fix in main branch
#36
galamit86
closed
3 years ago
0
Acceptance testing
#35
dkapitan
closed
3 years ago
3