DalgoT4D / dbt_lahi

Flatten the raw data using dbt macro #1

Closed siddhant3030 closed 1 year ago

siddhant3030 commented 1 year ago

I'm using a dbt macro to flatten the Lahi data. The DB contains data for multiple states; check the lahi_source.yml file. Our goal is to correct the dbt macro so that it fixes all the column names we pull from the raw data.

This is the expected set of replacements:


replace('11Boys', 'boys11') |
replace('9Boys', 'boys9') |
replace('12Girls', 'Girls12') |
replace('10 Total', 'Total 10') |
replace('10Girls', 'Girls10') |
replace('11Girls', 'Girls 11') |
replace('10Boys', 'Boys 10') |
replace('9Girls', 'Girls 9') |
replace('12 Total', 'Total 12') |
replace('/', '_') | replace('-', '_') |
replace(' ', '_') | replace('.', '_')

If this is not possible, then add the prefix instead. There will be some other issues as well, but we should be able to add more logic to handle them.

fatchat commented 1 year ago

This seems like a good use case for doing it with Python.
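
For illustration, here is a minimal sketch of what a Python version of the replacement chain might look like. The pairs and their order are copied from the list in the issue; the names `REPLACEMENTS` and `clean_column_name`, and the sample column name in the usage line, are hypothetical and not taken from the repository. Order matters: the grade/label swaps run first, so a space they introduce is turned into an underscore by the later separator replacements.

```python
# Ordered (old, new) pairs copied from the replacement list in the issue.
# The output casing ('boys11' vs 'Girls12') is kept exactly as written there.
REPLACEMENTS = [
    ("11Boys", "boys11"),
    ("9Boys", "boys9"),
    ("12Girls", "Girls12"),
    ("10 Total", "Total 10"),
    ("10Girls", "Girls10"),
    ("11Girls", "Girls 11"),
    ("10Boys", "Boys 10"),
    ("9Girls", "Girls 9"),
    ("12 Total", "Total 12"),
    # Separator clean-up runs last so it also catches spaces added above.
    ("/", "_"),
    ("-", "_"),
    (" ", "_"),
    (".", "_"),
]


def clean_column_name(name: str) -> str:
    """Apply each replacement in order, mirroring a chained replace() filter."""
    for old, new in REPLACEMENTS:
        name = name.replace(old, new)
    return name


# Hypothetical raw column name, for illustration only.
print(clean_column_name("Enrollment 10 Total"))  # Enrollment_Total_10
```

Keeping the pairs in a list makes it straightforward to add more rules later without touching the function itself.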

siddhant3030 commented 1 year ago

@fatchat you can write Python in a macro. That's what I'm doing.

siddhant3030 commented 1 year ago

Here's the macro link: https://github.com/DevDataPlatform/dbt_lahi/blob/main/macros/flatten_json.sql