iangow / se_features

Linguistic features derived from StreetEvents

Run LIWC 2015 #12

Closed iangow closed 4 years ago

iangow commented 5 years ago

Use table se_features.liwc_2015.

iangow commented 5 years ago

Waiting on iangow/honours_yvonne#2

iangow commented 4 years ago

@Yvonne-Han Multiplying counters seems feasible.

iangow commented 4 years ago

I'm not sure we need this as a separate issue. Perhaps it can be the last issue, covering the actual run of the code once everything else is done.

Yvonne-Han commented 4 years ago

@iangow Started running liwc_run.py now (2020-07-04 21:14:00 AEST).

n_files: 477054

Yvonne-Han commented 4 years ago

Done.

library(dplyr, warn.conflicts = FALSE)
library(DBI)
library(reprex)

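# Connect to PostgreSQL; with no arguments, connection details come from
# libpq defaults such as the PGHOST, PGDATABASE, and PGUSER environment variables.
pg <- dbConnect(RPostgres::Postgres())
rs <- dbExecute(pg, "SET search_path TO se_features")

# Lazy reference to the LIWC output table (no rows are fetched yet).
liwc_2015_output <- tbl(pg, "liwc_2015_output")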

liwc_2015_output
#> # Source:   table<liwc_2015_output> [?? x 79]
#> # Database: postgres [yanzih1@10.101.13.99:5432/crsp]
#>    file_name last_update         speaker_name context section
#>    <chr>     <dttm>              <chr>        <chr>   <int64>
#>  1 12140978… 2018-11-30 03:52:38 Operator     qa      1      
#>  2 12140978… 2018-11-30 03:52:38 Gregory T. … qa      1      
#>  3 12140978… 2018-11-30 03:52:38 Operator     qa      1      
#>  4 12140978… 2018-11-30 03:52:38 Nathan M. T… qa      1      
#>  5 12140978… 2018-11-30 03:52:38 Richard Whi… qa      1      
#>  6 12140978… 2018-11-30 03:52:38 Nathan M. T… qa      1      
#>  7 12140978… 2018-11-30 03:52:38 Richard Whi… qa      1      
#>  8 12140978… 2018-11-30 03:52:38 Gregory T. … qa      1      
#>  9 12140978… 2018-11-30 03:52:38 Richard Whi… qa      1      
#> 10 12140978… 2018-11-30 03:52:38 Operator     qa      1      
#> # … with more rows, and 74 more variables: speaker_number <int64>,
#> #   Function <int64>, Pronoun <int64>, Ppron <int64>, I <int64>,
#> #   We <int64>, You <int64>, SheHe <int64>, They <int64>, Ipron <int64>,
#> #   Article <int64>, Prep <int64>, Auxverb <int64>, Power <int64>,
#> #   Adverb <int64>, Conj <int64>, Negate <int64>, Verb <int64>,
#> #   Adj <int64>, Compare <int64>, Interrog <int64>, Number <int64>,
#> #   Quant <int64>, Affect <int64>, Posemo <int64>, Negemo <int64>,
#> #   Anx <int64>, Anger <int64>, Sad <int64>, Social <int64>,
#> #   Family <int64>, Friend <int64>, Female <int64>, Male <int64>,
#> #   CogProc <int64>, Insight <int64>, Cause <int64>, Discrep <int64>,
#> #   Tentat <int64>, Certain <int64>, Differ <int64>, Percept <int64>,
#> #   See <int64>, Hear <int64>, Feel <int64>, Bio <int64>, Body <int64>,
#> #   Health <int64>, Sexual <int64>, Ingest <int64>, Drives <int64>,
#> #   Affiliation <int64>, Achieve <int64>, Reward <int64>, Risk <int64>,
#> #   FocusPast <int64>, FocusPresent <int64>, FocusFuture <int64>,
#> #   Relativ <int64>, Motion <int64>, Space <int64>, Time <int64>,
#> #   Work <int64>, Leisure <int64>, Home <int64>, Money <int64>,
#> #   Relig <int64>, Death <int64>, Informal <int64>, Swear <int64>,
#> #   Netspeak <int64>, Assent <int64>, Nonflu <int64>, Filler <int64>

# Count the distinct file_name values present in the output table.
liwc_2015_output %>% select(file_name) %>% distinct() %>% count()
#> # Source:   lazy query [?? x 1]
#> # Database: postgres [yanzih1@10.101.13.99:5432/crsp]
#>   n      
#>   <int64>
#> 1 477061

Created on 2020-07-07 by the reprex package (v0.3.0)