Closed cezary4 closed 11 years ago
This is awesome, man. Thanks for looking into all of this. I've filed some of these issues already on the GitHub repo (check here: https://github.com/csvsoundsystem/fms_parser/issues?labels=parser&milestone=1&page=1&state=open). It would help if you could put any new ones in there so we can make sure to address all of them.
i = 'brianabelson' @i http://www.twitter.com/brianabelson i.org http://brianabelson.org
On Sun, Jun 30, 2013 at 4:10 PM, cezary4 notifications@github.com wrote:
Hey guys,
While documenting the three tables that make up Table III, I noticed a few issues we'll need to tackle:
1) Tables III A and III B each contain two fields, "surtype" and "subtype," that sound alike but are very different. We should rename subtype to subitem to avoid confusion and rename surtype to subtype. Surtype tells you whether the transaction is an issue or redemption of U.S. debt, so I would call that "subtype." The current subtype is just an additional description given to some of the items where there's indenting: for example, "Bills" or "United States Savings Securities". That column would work better if we named it "subitem." And because these names appear on the same indent as the actual items, we should parse them into the item column and then parse the indented items below them ("Regular Series" and "Cash Management Series" under "Bills") as subitems, since that's really what they are. Have a look at Table III A and you'll see what I mean. I think it will make working with that table much clearer for API users.
2) The field "surtype" is obsolete in Table III B; there are no "surtypes" similar to what exists in Table III A so we can just get rid of it. I am going to leave it in the documentation for now but if we're not using a field then there's no reason to have it in the table (unless it's a nightmare getting rid of it, @bdewilde https://github.com/bdewilde?)
3) The field "subtype" in Table III B is truncated. The field refers to the line item "Discount on New Issues" but only the words "Discount on" show up in the column. As with the naming confusion in #1 https://github.com/csvsoundsystem/fms_parser/issues/1 above, I would recommend renaming this field to "subitem" and parsing the fields indented underneath "Discount on New Issues" ("Bills" and "Bonds and Notes") as subitems and parsing "Discount on New Issues" as an item since it appears on the same indent as all the other items in the table.
4) The field "type" in Table III C is obsolete since there are no transaction types parsed from that table, so we don't need this column.
5) The "item" field in Table III C still has a lot of stray footnotes and other text that populates as items; we need to get rid of these items. Run this query on the table:
SELECT DISTINCT item FROM t3c ORDER BY item DESC;
You will get the following list. The ones I've marked with an asterisk are stray fields we need to get rid of; we need to keep all the others:
limit to billion after Unamortized Discount represents the discount limit to billion after bonds ( amortization is calculated daily ) DAILY TREASURY STATEMENT PAGE: billion after adjustment on Treasury bills and zero coupon bonds ( amortization is calculated daily ) The Closing Balance Today for Debt Held by the Public decreased by million and the Unamortized Discount Total Public Debt Outstanding Subject to Limit Statutory Debt Limit Repurchase Agreements Other Debt Intragovernmental Holdings Intragovenmental Holdings increased by million Hope Bonds Guaranteed Debt of Government Agencies Federal Financing Bank Debt Held by the Public Act of operated to permanently increase the statutory debt limit to Act of operated to permanently increase the statutory debt
6) Table III C also needs an additional field such as a subitem or a flag that tells you whether the debt is subject to the debt limit or not. The whole point of that table is to separate out the debt into debt subject to the limit and exempt from the limit so if we don't flag that somehow we're missing an important piece of information. Maybe we could just create a field called "subject_to_limit" with a binary 1 or 0 and then make everything in that "Less" indent = 0 and everything else equal 1?
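A rough sketch of the reshaping proposed in items 1 and 3, in Python. This is not the parser's actual code. It assumes each parsed row is a dict in which "item" currently holds the indented line (e.g. "Regular Series") and "subtype" holds the heading it sits under (e.g. "Bills"); the function name reshape_row and the sample values are made up for illustration.

```python
def reshape_row(row):
    """Return a copy of a Table III A/B row using the proposed field names.

    Assumed current layout: "surtype" = issue/redemption, "subtype" = parent
    heading such as "Bills", "item" = the (possibly indented) line itself.
    """
    # Carry over every field that is not being renamed or moved.
    new_row = {k: v for k, v in row.items()
               if k not in ("surtype", "subtype", "item")}

    # Old surtype (issue vs. redemption) becomes the new subtype.
    new_row["subtype"] = row.get("surtype")

    parent = row.get("subtype")
    if parent:
        # The heading becomes the item; the indented line becomes the subitem.
        new_row["item"] = parent
        new_row["subitem"] = row.get("item")
    else:
        # Rows with no heading keep their item and get an empty subitem.
        new_row["item"] = row.get("item")
        new_row["subitem"] = None
    return new_row


# Hypothetical row as the parser might emit it today.
old_row = {"table": "t3a", "surtype": "issue",
           "subtype": "Bills", "item": "Regular Series"}
new_row = reshape_row(old_row)
```

Building a new dict rather than renaming keys in place sidesteps the collision between the old and new meaning of "subtype".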
Reply to this email directly or view it on GitHub: https://github.com/csvsoundsystem/fms_parser/issues/114
Yeah, issues 100 and 96 flag similar parser issues with Table III, so I added a note up top saying they should be addressed together.
Errant footnotes more or less parsed out. Closed, optimistically!
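For reference, one way the stray-footnote cleanup from item 5 could be approached is a whitelist of known line items. This is only a sketch and not necessarily what the fix here does. The asterisk markers from the original list are not visible in this thread, so the KNOWN_ITEMS set below is an assumption about which entries are real line items, and the function name is hypothetical.

```python
# Assumed set of genuine Table III C line items (not confirmed in this thread).
KNOWN_ITEMS = {
    "Debt Held by the Public",
    "Intragovernmental Holdings",
    "Total Public Debt Outstanding",
    "Other Debt",
    "Unamortized Discount",
    "Federal Financing Bank",
    "Hope Bonds",
    "Guaranteed Debt of Government Agencies",
    "Statutory Debt Limit",
    "Repurchase Agreements",
}


def drop_stray_items(rows, known_items=KNOWN_ITEMS):
    """Keep only rows whose "item" exactly matches a known line item."""
    return [row for row in rows
            if (row.get("item") or "").strip() in known_items]


# Hypothetical parsed rows: one real item, two footnote fragments.
rows = [
    {"item": "Hope Bonds"},
    {"item": "bonds ( amortization is calculated daily )"},
    {"item": "Intragovenmental Holdings increased by million"},
]
clean = drop_stray_items(rows)
```

A whitelist is stricter than pattern-matching on footnote text: anything new that shows up in the PDF gets dropped until someone adds it, which may or may not be the behavior we want.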
6) should probably be filed as a separate issue for future improvement. Table III C also needs an additional field such as a subitem or a flag that tells you whether the debt is subject to the debt limit or not. The whole point of that table is to separate out the debt into debt subject to the limit and exempt from the limit so if we don't flag that somehow we're missing an important piece of information. Maybe we could just create a field called "subject_to_limit" with a binary 1 or 0 and then make everything in that "Less" indent = 0 and everything else equal 1?
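A minimal sketch of that flag, assuming the parser can record which heading each row is indented under. The "section" field, the function names, and the sample heading text are all hypothetical; only the rule itself (everything under the "Less" indent is 0, everything else is 1) comes from the suggestion above.

```python
def subject_to_limit(section):
    """Return 0 for rows under a "Less" heading, 1 for everything else."""
    heading = (section or "").strip().lower()
    return 0 if heading.startswith("less") else 1


def add_limit_flag(rows):
    """Return copies of the rows with a binary subject_to_limit field added."""
    return [dict(row, subject_to_limit=subject_to_limit(row.get("section")))
            for row in rows]


# Hypothetical rows; "section" is the heading the row sits under, if any.
rows = [
    {"item": "Debt Held by the Public", "section": None},
    {"item": "Hope Bonds", "section": "Less: Debt Not Subject to Limit:"},
]
flagged = add_limit_flag(rows)
```

This only works if the indent heading survives parsing, so it probably depends on the same indent handling discussed for Tables III A and III B.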
CLOSE WHEN ALL ITEMS ARE CHECKED: