airbytehq / quickstarts


Pokémon Data Stack #64

Open ThaliaBarrera opened 1 year ago

ThaliaBarrera commented 1 year ago

Pokémon Analysis and Insights with Airbyte

Extract Pokémon, ability, or move data from the PokeAPI using Airbyte. Load the data into a warehouse for in-depth analysis on Pokémon attributes, popularity, or battle strategies.

You can use a tool like dbt for data transformation, and an orchestrator like Airflow or Dagster if needed.
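As a rough illustration of the kind of transformation this could involve, here is a minimal Python sketch that flattens one Pokémon record into a single row. It assumes the record has the nested shape the PokeAPI returns for its `pokemon` endpoint (`types`, `abilities`, `stats`). The `flatten_pokemon` helper and the trimmed sample payload are hypothetical and only for illustration; in a real quickstart this logic would more likely live in dbt models.

```python
from typing import Any


def flatten_pokemon(payload: dict[str, Any]) -> dict[str, Any]:
    """Flatten one nested Pokémon record into a single flat row (hypothetical helper)."""
    row: dict[str, Any] = {
        "id": payload["id"],
        "name": payload["name"],
        "height": payload.get("height"),
        "weight": payload.get("weight"),
        # Collapse the nested lists into comma-separated strings.
        "types": ",".join(t["type"]["name"] for t in payload.get("types", [])),
        "abilities": ",".join(
            a["ability"]["name"] for a in payload.get("abilities", [])
        ),
    }
    # One column per base stat, e.g. "special-attack" becomes "stat_special_attack".
    for s in payload.get("stats", []):
        column = "stat_" + s["stat"]["name"].replace("-", "_")
        row[column] = s["base_stat"]
    return row


# Trimmed, illustrative payload; a real record carries many more fields.
sample = {
    "id": 25,
    "name": "pikachu",
    "height": 4,
    "weight": 60,
    "types": [{"slot": 1, "type": {"name": "electric"}}],
    "abilities": [
        {"ability": {"name": "static"}, "is_hidden": False},
        {"ability": {"name": "lightning-rod"}, "is_hidden": True},
    ],
    "stats": [
        {"base_stat": 35, "stat": {"name": "hp"}},
        {"base_stat": 50, "stat": {"name": "special-attack"}},
    ],
}

row = flatten_pokemon(sample)
print(row)
```

A flat row like this is easier to load into a warehouse table and to aggregate by type or ability.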

How to get started:

CodeResolver commented 1 year ago

Hi! I would like to work on these fine Pokémon.

ThaliaBarrera commented 1 year ago

Hi @CodeResolver! Sure, I have assigned it to you :). Let me know if you have any questions.

ThaliaBarrera commented 1 year ago

Hi @CodeResolver! Are you still working on this? Otherwise I may need to unassign. Let me know :)

CodeResolver commented 1 year ago

Hi @ThaliaBarrera, yes I am. I was having issues with Terraform and the ecommerce quickstart and kept getting a 405 error, so I just moved to macOS to try again. Quick question: is Terraform required? I'm also wondering whether I should use the MongoDB to MySQL quickstart instead as an example for this integration.

CodeResolver commented 1 year ago

Hi again, while researching I also found this video, and he actually goes through the PokeAPI. Should I implement what is shown there and expand on it? Thanks, and sorry for all the questions :) https://www.youtube.com/watch?v=kJ3hLoNfz_E

bishalbera commented 1 year ago

@CodeResolver yes, you do need Terraform. As for your second question, from what I've seen the video is about building a Python CDK, but to build a quickstart you'll need to follow the Ecommerce analytics one. It's well written and explained by @ThaliaBarrera, who can give you more info. Ty ^_^

ThaliaBarrera commented 1 year ago

Thanks for replying @bishalbera!

@CodeResolver maybe you can share your Terraform code and the error you're getting so we can further help

CodeResolver commented 1 year ago

Hi @ThaliaBarrera and @bishalbera, thanks for your help. I'm replicating it now on Mac and I'll send you the error if it shows up again.

bishalbera commented 1 year ago

@CodeResolver Sure :)

CodeResolver commented 1 year ago

Hi again, I was finally able to set up Terraform correctly, thanks for your help. While working with dbt I run into these errors. Do you happen to know why I'm getting them?

18:06:31 Running with dbt=1.6.6
18:06:32 Registered adapter: bigquery=1.6.8
18:06:32 Found 6 models, 3 sources, 0 exposures, 0 metrics, 394 macros, 0 groups, 0 semantic models
18:06:32
18:06:33 Concurrency: 1 threads (target='dev')
18:06:33
18:06:33 1 of 6 START sql view model transformed_data.stg_products ...................... [RUN]
18:06:34 1 of 6 ERROR creating sql view model transformed_data.stg_products ............. [ERROR in 0.68s]
18:06:34 2 of 6 START sql view model transformed_data.stg_purchases ..................... [RUN]
18:06:35 2 of 6 ERROR creating sql view model transformed_data.stg_purchases ............ [ERROR in 0.66s]
18:06:35 3 of 6 START sql view model transformed_data.stg_users ......................... [RUN]
18:06:35 3 of 6 ERROR creating sql view model transformed_data.stg_users ................ [ERROR in 0.65s]
18:06:35 4 of 6 SKIP relation transformed_data.product_popularity ....................... [SKIP]
18:06:35 5 of 6 SKIP relation transformed_data.purchase_patterns ........................ [SKIP]
18:06:35 6 of 6 SKIP relation transformed_data.user_demographics ........................ [SKIP]
18:06:35
18:06:35 Finished running 6 view models in 0 hours 0 minutes and 3.06 seconds (3.06s).
18:06:35
18:06:35 Completed with 3 errors and 0 warnings:
18:06:35
18:06:35 Runtime Error in model stg_products (models/staging/stg_products.sql)
  404 Not found: Table bigquery-403520:raw_data.products was not found in location US

Thanks!

ThaliaBarrera commented 1 year ago

@CodeResolver does the raw_data.products table actually exist? If so, is the location of the raw_data dataset "US"? Airbyte should have created raw_data.products in BigQuery. You may need to run an Airbyte sync before attempting to run the dbt models.
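To make those two checks concrete, here is a minimal Python sketch that pulls the project, dataset, table, and location out of the "Not found" message, so each can be compared against what actually exists in BigQuery. The `parse_missing_table` helper and its regex are assumptions written against the one error message quoted above, not part of dbt or BigQuery.

```python
import re
from typing import NamedTuple, Optional


class MissingTable(NamedTuple):
    project: str
    dataset: str
    table: str
    location: str


# Pattern written against the single message quoted in this thread (an assumption).
_PATTERN = re.compile(
    r"Table (?P<project>[\w-]+):(?P<dataset>\w+)\.(?P<table>\w+) "
    r"was not found in location (?P<location>[\w-]+)"
)


def parse_missing_table(message: str) -> Optional[MissingTable]:
    """Return the parts of a 'Table ... was not found' message, or None if no match."""
    match = _PATTERN.search(message)
    if match is None:
        return None
    return MissingTable(**match.groupdict())


error = (
    "404 Not found: Table bigquery-403520:raw_data.products "
    "was not found in location US"
)
missing = parse_missing_table(error)
print(missing)
```

Here `missing.dataset` and `missing.table` are the names the Airbyte sync should have created, and `missing.location` is the region to compare against the dataset's location.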

Note to self: Add that as a note in all quickstarts.

CodeResolver commented 1 year ago

Hi @ThaliaBarrera, I just did a sync but I'm still getting some issues. I also don't see the tables created in BigQuery, but I do see the connections in Airbyte. Please check this gist with my current setup:

I put them all together for quicker reference: https://gist.github.com/CodeResolver/3c76736ef66b1c9bd74e8c894fcf575b

Also, workspace_id was a bit tricky to find. Please check whether it is correct (I got it from my local Airbyte instance URL), and maybe consider adding a comment about it to the quickstarts.

Thanks!

bishalbera commented 1 year ago

@CodeResolver Hello, from your code I can see that you are using the Faker source, so I assume you are first experimenting with the Ecommerce analytics quickstart and haven't yet started on your actual source, which is the PokeAPI. Also, I couldn't find faker_source.yml, stg_products, or the other model files in the code you shared. Those are the key files for creating stg_products and the other tables under transformed_data in BigQuery.

As for your workspace ID, yes, you can find it in the local instance's URL; it looks like workspaces/************/connections. There is also a communication gap, as you are almost 10 hours behind Indian Standard Time; otherwise I could have helped more. Ty :)
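For reference, a minimal Python sketch of reading the workspace ID out of such a URL. The `workspace_id_from_url` helper, the `localhost:8000` host, and the all-zero ID are placeholders for illustration, not values from this thread.

```python
import re
from typing import Optional
from urllib.parse import urlparse


def workspace_id_from_url(url: str) -> Optional[str]:
    """Return the path segment that follows /workspaces/, or None if absent."""
    path = urlparse(url).path
    match = re.search(r"/workspaces/([^/]+)", path)
    return match.group(1) if match else None


# Placeholder URL: host, port, and ID are made up for this example.
example_url = (
    "http://localhost:8000/workspaces/"
    "00000000-0000-0000-0000-000000000000/connections"
)
workspace_id = workspace_id_from_url(example_url)
print(workspace_id)
```

The extracted value is what the Terraform workspace_id variable expects.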

CodeResolver commented 1 year ago

Hi @bishalbera, thanks. Yes, I am trying to make this one work before I tackle the PokeAPI. I do see the 2 files you mentioned; I just added mine to the end of the gist, please take a look.

bishalbera commented 1 year ago

@CodeResolver ok. It would be nice if we could move to DMs, like email or Twitter (if you are ok with it), as the conversation here will get very long and I don't know whether that is ok.

bishalbera commented 1 year ago

@CodeResolver ok, from your updated code I can see that you have now set up faker_source and stg_products. If you run dbt run now, it should succeed, provided the connection is set up correctly.

CodeResolver commented 1 year ago

Hey @bishalbera, sure, we can do Discord. Here is a link to a server I just made for this: https://discord.gg/SMukNdGM

bishalbera commented 1 year ago

@CodeResolver ok