iangow / wrds_pg

Code to get data from WRDS to PostgreSQL
46 stars 28 forks source link

Add RavenPack data #39

Closed iangow closed 6 years ago

iangow commented 6 years ago

@stewartc3 (@jamespkav) request:

The data I am working with is located here:

RavenPack News Analytics >> Equities >> PR Edition

Much appreciated if you could put this data on your server.

iangow commented 6 years ago

Looks like this is one table that you'd want. You can see that the file is pr_equities and the library is rpna. Let me try that one first.

Which others?

stewartc3 commented 6 years ago

I don't require any others. James may have some suggestions.

iangow commented 6 years ago

Here is code to get data:

library(dplyr, warn.conflicts = FALSE)
library(DBI)

Sys.setenv(PGHOST="10.101.13.99", PGDATABASE="crsp")
# Sys.setenv(PGUSER=, PGPASSWORD=)

pg <- dbConnect(RPostgreSQL::PostgreSQL())

pr_equities <- tbl(pg, sql("SELECT * FROM rpna.pr_equities"))

pr_equities
#> # Source:   SQL [?? x 48]
#> # Database: postgres 9.6.9 [igow@10.101.13.99:5432/crsp]
#>    rp_entity_id entity_type entity_name       position_name rp_position_id
#>    <chr>        <chr>       <chr>             <chr>         <chr>         
#>  1 0B4D07       COMP        Safeway Inc.      <NA>          <NA>          
#>  2 0CC745       COMP        Pfizer India Ltd. <NA>          <NA>          
#>  3 1A4321       COMP        Chordiant Softwa… <NA>          <NA>          
#>  4 1EBF8D       COMP        Merck & Co. Inc.  <NA>          <NA>          
#>  5 1EBF8D       COMP        Merck & Co. Inc.  <NA>          <NA>          
#>  6 228924       COMP        Albertsons Inc.   <NA>          <NA>          
#>  7 25102A       COMP        NASDAQ OMX Group… <NA>          <NA>          
#>  8 267718       COMP        Pfizer Inc.       <NA>          <NA>          
#>  9 2BC05E       COMP        Six Flags Entert… <NA>          <NA>          
#> 10 326E80       COMP        Verso Technologi… <NA>          <NA>          
#> # ... with more rows, and 43 more variables: country_code <chr>,
#> #   topic <chr>, group_ <chr>, type <chr>, sub_type <chr>, property <chr>,
#> #   evaluation_method <chr>, maturity <chr>, category <chr>,
#> #   ens_key <chr>, g_ens_key <chr>, event_similarity_key <chr>,
#> #   news_type <chr>, source <chr>, rp_story_id <chr>, product_key <chr>,
#> #   company <chr>, isin <chr>, relevance <dbl>, ess <dbl>, aes <dbl>,
#> #   aev <dbl>, ens <dbl>, ens_similarity_gap <dbl>, ens_elapsed <dbl>,
#> #   g_ens <dbl>, g_ens_similarity_gap <dbl>, g_ens_elapsed <dbl>,
#> #   rp_story_event_index <dbl>, rp_story_event_count <dbl>, css <dbl>,
#> #   nip <dbl>, peq <dbl>, bee <dbl>, bmq <dbl>, bam <dbl>, bca <dbl>,
#> #   ber <dbl>, anl_chg <dbl>, mcq <dbl>, rpna_date_utc <date>,
#> #   rpna_time_utc <chr>, timestamp_utc <chr>

Created on 2018-08-13 by the reprex package (v0.2.0).

iangow commented 6 years ago

Note that original field group is renamed group_ as the former is a keyword in SQL. Also table is actually multiple tables that are combined to form a "view" (unlikely to matter for practical purposes).