radchenkoam / OTUS-de-2020-11

Project work #10

Closed radchenkoam closed 3 years ago

radchenkoam commented 3 years ago

Session 29
Project defense
Project work

Exploring the dataset in JupyterLab (PySpark),
then loading it and building data marts in Vertica
with Data Build Tool (dbt), and visualizing the results in Redash.


radchenkoam commented 3 years ago

1. Exploring the dataset in JupyterLab
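
A first pass over a dataset like this usually starts with simple counts. The sketch below is only an illustration of that idea: it uses Python's standard library rather than PySpark so that it runs without a Spark session, the two sample rows are copied from the `dbt.crime` query output further down in this issue (with only a few columns kept), and the helper name `count_by_year` is hypothetical.

```python
import csv
import io
from collections import Counter

# Two sample rows copied from the dbt.crime output in this issue;
# only a subset of the columns is kept for brevity.
SAMPLE_CSV = """INCIDENT_NUMBER,OFFENSE_CODE,DISTRICT,YEAR,MONTH
142052550,3125,D4,2015,6
I010370257-00,3125,E13,2016,5
"""


def count_by_year(csv_text: str) -> Counter:
    """Count incidents per YEAR value (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(int(row["YEAR"]) for row in reader)


counts = count_by_year(SAMPLE_CSV)
for year, n in sorted(counts.items()):
    print(year, n)
```

On the full dataset the same grouping would more naturally be done with a Spark DataFrame; that variant is not shown here because the notebook itself is not part of this issue.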


radchenkoam commented 3 years ago

2. Loading the data and building data marts in Vertica with Data Build Tool


Found 6 models, 12 tests, 0 snapshots, 0 analyses, 141 macros, 0 operations, 2 seed files, 0 sources

15:59:31 | Concurrency: 1 threads (target='dev')
15:59:31 |
15:59:31 | 1 of 2 START seed file dbt.crime..................................... [RUN]
16:00:08 | 1 of 2 OK loaded seed file dbt.crime................................. [-1 in 37.02s]
16:00:08 | 2 of 2 START seed file dbt.offense_codes............................. [RUN]
16:00:08 | 2 of 2 OK loaded seed file dbt.offense_codes......................... [-1 in 0.06s]
16:00:08 |
16:00:08 | Finished running 2 seeds in 37.20s.

Completed successfully

Done. PASS=2 WARN=0 ERROR=0 SKIP=0 TOTAL=2

radchenkoam commented 3 years ago

Found 6 models, 12 tests, 0 snapshots, 0 analyses, 141 macros, 0 operations, 2 seed files, 0 sources

16:07:08 | Concurrency: 1 threads (target='dev')
16:07:08 |
16:07:08 | 1 of 6 START view model dbt.stg_crime................................ [RUN]
16:07:08 | 1 of 6 OK created view model dbt.stg_crime........................... [-1 in 0.08s]
16:07:08 | 2 of 6 START view model dbt.stg_offense_codes........................ [RUN]
16:07:08 | 2 of 6 OK created view model dbt.stg_offense_codes................... [-1 in 0.03s]
16:07:08 | 3 of 6 START table model dbt.crimes.................................. [RUN]
16:07:09 | 3 of 6 OK created table model dbt.crimes............................. [-1 in 1.21s]
16:07:09 | 4 of 6 START table model dbt.mrt_offense_all_count................... [RUN]
16:07:10 | 4 of 6 OK created table model dbt.mrt_offense_all_count.............. [-1 in 0.17s]
16:07:10 | 5 of 6 START table model dbt.mrt_offense_by_year_count............... [RUN]
16:07:10 | 5 of 6 OK created table model dbt.mrt_offense_by_year_count.......... [-1 in 0.17s]
16:07:10 | 6 of 6 START table model dbt.mrt_offense_by_year_month_count......... [RUN]
16:07:10 | 6 of 6 OK created table model dbt.mrt_offense_by_year_month_count.... [-1 in 0.18s]
16:07:10 |
16:07:10 | Finished running 2 view models, 4 table models in 1.99s.

Completed successfully

Done. PASS=6 WARN=0 ERROR=0 SKIP=0 TOTAL=6
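
If these runs were scripted rather than started by hand, the closing summary line could be checked programmatically. A minimal sketch, assuming the summary keeps exactly the `Done. PASS=… WARN=… ERROR=… SKIP=… TOTAL=…` shape shown in the logs above; the function name `parse_dbt_summary` is hypothetical and not part of dbt.

```python
import re

# Matches the summary line format seen in the logs in this issue.
SUMMARY_RE = re.compile(
    r"Done\.\s+PASS=(\d+)\s+WARN=(\d+)\s+ERROR=(\d+)\s+SKIP=(\d+)\s+TOTAL=(\d+)"
)


def parse_dbt_summary(line: str) -> dict:
    """Extract the counters from a dbt summary line (hypothetical helper)."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"not a dbt summary line: {line!r}")
    keys = ("PASS", "WARN", "ERROR", "SKIP", "TOTAL")
    return dict(zip(keys, map(int, match.groups())))


summary = parse_dbt_summary("Done. PASS=6 WARN=0 ERROR=0 SKIP=0 TOTAL=6")
run_ok = summary["ERROR"] == 0  # treat any reported error as a failed run
print(summary, run_ok)
```

Checking the process exit code of the dbt command is usually the simpler signal; parsing the line is only useful when the counters themselves are needed.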

- Checked the result:
```bash
dbadmin=> \c boston_crimes
You are now connected to database "boston_crimes" as user "dbadmin".
boston_crimes=> \dtv+
                            List of tables
 Schema |              Name               | Kind  |  Owner  | Comment 
--------+---------------------------------+-------+---------+---------
 dbt    | crime                           | table | dbadmin | 
 dbt    | crimes                          | table | dbadmin | 
 dbt    | mrt_offense_all_count           | table | dbadmin | 
 dbt    | mrt_offense_by_year_count       | table | dbadmin | 
 dbt    | mrt_offense_by_year_month_count | table | dbadmin | 
 dbt    | offense_codes                   | table | dbadmin | 
 dbt    | seed_rejects                    | table | dbadmin | 
 dbt    | stg_crime                       | view  | dbadmin | 
 dbt    | stg_offense_codes               | view  | dbadmin | 

boston_crimes=> select * from dbt.crime limit 5;
-[ RECORD 1 ]-------+------------------------------------
INCIDENT_NUMBER     | 142052550
OFFENSE_CODE        | 3125
OFFENSE_CODE_GROUP  | Warrant Arrests
OFFENSE_DESCRIPTION | WARRANT ARREST
DISTRICT            | D4
REPORTING_AREA      | 903
SHOOTING            | 
OCCURRED_ON_DATE    | 2015-06-22 00:12:00
YEAR                | 2015
MONTH               | 6
DAY_OF_WEEK         | Monday
HOUR                | 0
UCR_PART            | Part Three
STREET              | WASHINGTON ST
Lat                 | 42.33383935
Long                | -71.08029038
Location            | (42.33383935, -71.08029038)
-[ RECORD 2 ]-------+------------------------------------
INCIDENT_NUMBER     | I010370257-00
OFFENSE_CODE        | 3125
OFFENSE_CODE_GROUP  | Warrant Arrests
OFFENSE_DESCRIPTION | WARRANT ARREST
DISTRICT            | E13
REPORTING_AREA      | 569
SHOOTING            | 
OCCURRED_ON_DATE    | 2016-05-31 19:35:00
YEAR                | 2016
MONTH               | 5
DAY_OF_WEEK         | Tuesday
HOUR                | 19
UCR_PART            | Part Three
STREET              | NEW WASHINGTON ST
Lat                 | 42.30233307
Long                | -71.11156487
Location            | (42.30233307, -71.11156487)
...
```

👍🏻 all OK

radchenkoam commented 3 years ago

3. Visualization in Redash