mccgr / edgar

Code to manage data related to SEC EDGAR

Add code for creating accession numbers #1

iangow closed this issue 7 years ago

iangow commented 7 years ago

From @iangow on April 9, 2017 16:28

Here is some code that extracts accession numbers from the file_name column of filings.filings:

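library(dplyr)
library(RPostgreSQL)

# Connect using the default PostgreSQL connection settings
pg <- src_postgres()

# Give the session more memory for the large query below
dbGetQuery(pg$con, "SET work_mem='10GB'")
filings <- tbl(pg, sql("SELECT * FROM filings.filings"))

# Capture the part of file_name between "edgar/data/<digits>/" and ".txt"
acc_no_regex <- "edgar/data/\\d+/(.*)\\.txt$"

# Build a (cik, accession_no) table in the database, indexed on both columns
acc_nos <- 
    filings %>%
    mutate(accession_no = regexp_replace(file_name, acc_no_regex, "\\1")) %>%
    mutate(cik = as.integer(cik)) %>%
    select(cik, accession_no) %>%
    compute(indexes = c("accession_no", "cik"))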
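
To check what the regex captures without touching the database, the same pattern can be tried locally in base R. This is only a sketch: the file_name value is a made-up path in the edgar/data/<cik>/<accession>.txt form, not a real filing, and base R's sub() stands in for PostgreSQL's regexp_replace (both replace only the first match by default).

```r
# Same pattern as in the query above.
acc_no_regex <- "edgar/data/\\d+/(.*)\\.txt$"

# Hypothetical file_name, used only for illustration.
file_name <- "edgar/data/1234567/0001234567-17-000001.txt"

# perl = TRUE so that \d is handled by the PCRE engine.
accession_no <- sub(acc_no_regex, "\\1", file_name, perl = TRUE)
print(accession_no)
```

Because the pattern matches the whole path, only the captured group (the accession number with its dashes) is left after replacement.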

Copied from original issue: iangow/filings#4

iangow commented 7 years ago

This code belongs in the filings repository (for now) or in a new edgar repository.