Closed iangow closed 5 years ago
Addressing #30 will help here.
@iangow The Thomson Reuters Insiders Data Feed Manual has a pretty good summary of tables 1 and 2. Table 1 contains information on transactions of non-derivative stock, and Table 2 contains information on derivative stock. Having had a look at some of the xml files, it seems many of the form 4's have either a non-trivial <DerivativeTable>
(eg. this), a non-trivial <nonDerivativeTable>
(eg. this), or both (eg. this). In fact, in the latter example, there is a table of one of these types for each transaction reported under the filing (see the html for example).
Also, there seem to be some numbers which Thomson Reuters have scraped which map to numbers on the pages of the filing. For instance, the dcn
field of table 2 for this filing, which is 03817287
, is exactly equal the Film No.
listed on the html page, in the last section of details on the owner ('AMOROSO ALFRED J').
@iangow I've had a look at both tables for the example filing I gave above, with multiple transactions of both derivative and non-derivative stock. The Film No.
for this is 111142681
, and doing
SELECT * FROM tfn.table1 WHERE owner = 'AMOROSO ALFRED J' AND dcn = '111142681'
and
SELECT * FROM tfn.table2 WHERE owner = 'AMOROSO ALFRED J' AND dcn = '111142681'
one can see that there are 10 entries for this filing in table 1, and 5 entries in table 2 (sorry, not displaying the output since there are too many columns), corresponding to all the transactions in the filing. So each row of these tables corresponds to a single transaction from some filing, and thus there are generally multiple rows in each table for the one filing.
@iangow Also, curiously, in tables 1 and 2, there seem to be no columns with the accession numbers and cik's, although there are some fields cusip6
, cusip2
and cusipx
which correspond with CUSIP numbers.
@iangow I just had a go at trying to connect the CIK number from filings
to the CUSIP numbers in tfn.table2
for a particular filing, that I picked at random from filings
. If one looks at this filing, which I generated from the file_name
from filings
, if one takes the Film no.
which is 17537136
, and does
SELECT * FROM tfn.table2 WHERE dcn = '17537136'
one finds that the cusip6
is 108035
, the cusip2
is 10
, and the cusipx
is 6
. Putting this together yields a full CUSIP number 108035106
.
If one then takes the CIK number associated with the filing, which is 846617
(for BRIDGE BANCORP INC), and does
SELECT DISTINCT cusip, cik FROM edgar.cusip_cik WHERE cik = 846617
one sees that 108035106
is indeed one of the CUSIPS (the others were 10835106
(perhaps a mistake) and, curiously, a null).
Perhaps there is scope to do joins partially between these datasets through edgar.cusip_cik
, though I am aware we obtained that table through the SC 13D
forms, which are only filed when someone owns more than 5% of a firm. Also, there was a relatively recent filing which was a sale of derivative stock, but when I used the filming number for dcn, it didn't show up in tfn.table2
as it should have. I assume that could be because we haven't updated table2
on our side, though I'm curious to know how often tables 1 and 2 are updated.
@bdcallen
Don't worry about matching CUSIPs with CIKs for now. It's enough to know that the tfn
tables do not include CIKs, as scraping data on CIKs is something of primary interest with these forms. I don't believe anything we have in the edgar
tables maps to "film number" (and I believe we don't have any ID fields in common with the tfn
data … this is a pain). I think we'd have to scrape film numbers if we wanted a solid link between tfn
and edgar
tables.
Data in
tfn.table1
andtfn.table2
.