heehehe / job-trend

[DE4E] A project for building and applying a data extraction pipeline for developer job postings
https://job-trend.streamlit.app

Use dbt #26

Closed heehehe closed 5 months ago

heehehe commented 7 months ago
python3 -m pip install dbt-postgres
dbt init [project_name]
@heehehe Troubleshooting notes: after finishing the setup and running `dbt run`..

```
14:57:17 Running with dbt=1.7.3
14:57:17 Registered adapter: postgres=1.7.3
14:57:18 Found 2 models, 4 tests, 0 sources, 0 exposures, 0 metrics, 401 macros, 0 groups, 0 semantic models
14:57:18
14:57:18
14:57:18 Finished running in 0 hours 0 minutes and 0.02 seconds (0.02s).
14:57:18 Encountered an error:
Database Error
  could not translate host name "host" to address: nodename nor servname provided, or not known
```

Looks like the configuration needs to be changed.. 🤔

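The error message suggests the connection profile's host is still set to the literal string `host`, which cannot be resolved as a hostname. Below is a minimal sketch for checking whether a configured host resolves before running `dbt run`. The function name `can_resolve`, the default port `5432`, and the injectable `resolver` parameter are assumptions made for illustration and are not part of dbt.

```python
import socket
from typing import Callable


def can_resolve(host: str, port: int = 5432,
                resolver: Callable = socket.getaddrinfo) -> bool:
    """Return True if `host` resolves to at least one address.

    `resolver` defaults to socket.getaddrinfo; it is injectable only so the
    check can be exercised without touching the network.
    """
    try:
        return len(resolver(host, port)) > 0
    except socket.gaierror:
        # Raised when the name cannot be translated to an address,
        # which is the situation reported in the log above.
        return False
```

Calling `can_resolve` with the host value from the profile would be expected to return `False` for a placeholder such as `host` on a machine where no such name exists, which points at the profile rather than at the database itself.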
heehehe commented 5 months ago

dbt DAG structure

flowchart LR
jumpit --> job_jumpit
wanted --> job_wanted
jobplanet --> job_jobplanet
job_jumpit --> job
job_wanted --> job
job_jobplanet --> job
job --> content_jumpit
job --> content_wanted
job --> content_jobplanet

jumpit --> content_jumpit
wanted --> content_wanted
jobplanet --> content_jobplanet
content_jumpit --> content
content_wanted --> content
content_jobplanet --> content

jumpit --> company_jumpit
wanted --> company_wanted
jobplanet --> company_jobplanet
company_jumpit --> company
company_wanted --> company
company_jobplanet --> company
company --> content_jumpit
company --> content_wanted
company --> content_jobplanet
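
The dependency graph above can be sketched in plain Python to check that it is acyclic and to see one valid build order. This is only an illustration using the standard library's `graphlib`; dbt derives its own order from the models, and the helper names `build_dag` and `run_order` are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

SOURCES = ("jumpit", "wanted", "jobplanet")


def build_dag() -> dict[str, set[str]]:
    """Map each node to the set of nodes it depends on, mirroring the flowchart."""
    deps: dict[str, set[str]] = {src: set() for src in SOURCES}
    for table in ("job", "company", "content"):
        deps[table] = set()
    for src in SOURCES:
        for table in ("job", "company"):
            # e.g. jumpit --> job_jumpit --> job
            deps[f"{table}_{src}"] = {src}
            deps[table].add(f"{table}_{src}")
        # content_<source> reads the raw source plus the merged job and company tables
        deps[f"content_{src}"] = {src, "job", "company"}
        deps["content"].add(f"content_{src}")
    return deps


def run_order() -> list[str]:
    """Return one valid build order; raises graphlib.CycleError if the graph has a cycle."""
    return list(TopologicalSorter(build_dag()).static_order())
```

In any order this returns, `job` and `company` come before every `content_<source>` node, and `content` comes last because every other node is one of its ancestors.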

Final data

Auxiliary tables

For now, group by `name` first
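
As a rough illustration of grouping by `name`, here is a minimal sketch in plain Python. The thread does not say which table or columns are involved, so the row shape (dicts with a `name` key) and the function name `group_by_name` are assumptions.

```python
from collections import defaultdict


def group_by_name(rows: list[dict]) -> dict[str, list[dict]]:
    """Collect rows that share the same `name` value (exact match, no normalisation)."""
    grouped: dict[str, list[dict]] = defaultdict(list)
    for row in rows:
        grouped[row["name"]].append(row)
    return dict(grouped)
```

Exact matching is the simplest choice here; whether names from different sources need normalising before grouping is left open.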

ryuni-dev commented 5 months ago

@heehehe Looks good! Is there a branch you're working on separately?

heehehe commented 5 months ago

@ryuni-dev Ah, I sketched this out as a draft yesterday and never got around to sharing it, haha. I've pushed the work I was testing to https://github.com/heehehe/job-trend/pull/34!

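So far I've confirmed that `dbt run` generates the data and that it gets loaded into BigQuery, so I'll try to build it out as laid out above before today's meeting if possible :) If you've been working on something separately or have other ideas, please let me know!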