moj-analytical-services / pydbtools

Python version of dbtools
https://moj-analytical-services.github.io/pydbtools/

Be able to add parquet files to temp_table #40

Open isichei opened 3 years ago

isichei commented 3 years ago

Something like:

import pydbtools as pydb

pydb.add_table_to_temp_db(s3_path="s3://path-to-parquet-file/", table_name="table1")

Code already doing something similar

This might already be possible. I think a dedicated function for this should exist, but I'm adding an example as a short-term workaround if you want to do this with the v3 release. I still need to check that it works, but I think it should.

import awswrangler as wr
import pydbtools as pydb

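# Look up the current user's ID, then derive the name of their temp database
user_id, out_path = pydb.get_user_id_and_table_dir()
temp_db_name = pydb.get_database_name_from_userid(user_id)

# Infer the schema from the parquet files and register it as a table in the Glue catalog
columns_types, partitions_types, partitions_values = wr.s3.store_parquet_metadata(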
    path="s3://path-to-parquet-file/",
    database=temp_db_name,
    table="table1",
    dataset=True
)
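
A rough sketch of how the workaround above could be wrapped into the proposed function. `add_table_to_temp_db` does not exist in pydbtools yet, so the name, signature, argument checks and return value here are all assumptions, and like the snippet above this is untested against a real bucket. It also assumes the temp database already exists in the Glue catalog.

```python
def add_table_to_temp_db(s3_path: str, table_name: str) -> dict:
    """Hypothetical helper: register the parquet files under s3_path
    as table_name in the current user's temp database."""
    # Basic argument checks (an assumption, not existing pydbtools behaviour)
    if not s3_path.startswith("s3://"):
        raise ValueError(f"s3_path must start with 's3://', got {s3_path!r}")
    if not table_name:
        raise ValueError("table_name must not be empty")

    # Imported here so the checks above can run without AWS libraries installed
    import awswrangler as wr
    import pydbtools as pydb

    # Same lookup as the workaround: user ID -> temp database name
    user_id, _ = pydb.get_user_id_and_table_dir()
    temp_db_name = pydb.get_database_name_from_userid(user_id)

    # Infer the schema from the parquet files and store it in the Glue catalog
    columns_types, partitions_types, _ = wr.s3.store_parquet_metadata(
        path=s3_path,
        database=temp_db_name,
        table=table_name,
        dataset=True,
    )

    return {
        "database": temp_db_name,
        "table": table_name,
        "columns": columns_types,
        "partitions": partitions_types,
    }
```

Usage would then match the call suggested at the top of the issue: `add_table_to_temp_db(s3_path="s3://path-to-parquet-file/", table_name="table1")`.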