csingley / ofxtools

Python OFX Library
Other
301 stars 68 forks source link

Error Parsing INV401KBAL #95

Closed aclindsa closed 3 years ago

aclindsa commented 3 years ago

I am getting an error while parsing OFX from Fidelity:

ofxtools.models.base.OFXSpecError: INVSTMTRS SubElements out of order: ['TOTAL', 'BALLIST']

I initially assumed it was a Fidelity issue and went digging around in the OFX file to see if I could fix it up with a little sed and be on my merry way. Interestingly, the error message remained unchanged when I removed the BALLIST section from INVSTMTRS. However, when I removed the INV401KBAL section (which contains TOTAL and BALLIST elements), the message disappeared. I believe the elements inside INV401KBAL are in the correct order according to the SPEC (looking at version 2.0.3, section "13.9.2.9 401(k) Balances "). So, I think that means this is a parser bug.

I'm willing to help debug further; see below for more information on how to reproduce what I'm seeing.

Here is some simplified/anonymized OFX which demonstrates the issue (referred to as example.ofx in code below):

<?xml version="1.0" encoding="UTF-8" standalone="no"?><?OFX OFXHEADER="200" VERSION="203" SECURITY="NONE" OLDFILEUID="NONE" NEWFILEUID="NONE"?><OFX><SIGNONMSGSRSV1><SONRS><STATUS><CODE>0</CODE><SEVERITY>INFO</SEVERITY><MESSAGE>SUCCESS</MESSAGE></STATUS><DTSERVER>20201109095221.979[-5:EST]</DTSERVER><LANGUAGE>ENG</LANGUAGE><FI><ORG>fidelity.com</ORG><FID>7776</FID></FI></SONRS></SIGNONMSGSRSV1><INVSTMTMSGSRSV1><INVSTMTTRNRS><TRNUID>00000000-0000-0000-0000-000000000000</TRNUID><STATUS><CODE>0</CODE><SEVERITY>INFO</SEVERITY><MESSAGE>SUCCESS</MESSAGE></STATUS><INVSTMTRS><DTASOF>20201107033007.000[-5:EST]</DTASOF><CURDEF>USD</CURDEF><INVACCTFROM><BROKERID>fidelity.com</BROKERID><ACCTID>123456789</ACCTID></INVACCTFROM><INVTRANLIST><DTSTART>20200702010000.000[-5:EDT]</DTSTART><DTEND>20200902010000.000[-5:EDT]</DTEND><INCOME>    <INVTRAN><FITID>00000000000000000000000</FITID><DTTRADE>20200831000000.000[-5:EDT]</DTTRADE><MEMO>DIVIDEND RECEIVED</MEMO></INVTRAN><SECID><UNIQUEID>316146356</UNIQUEID><UNIQUEIDTYPE>CUSIP</UNIQUEIDTYPE></SECID><INCOMETYPE>DIV</INCOMETYPE><TOTAL>+00000000000001.0000</TOTAL><SUBACCTSEC>CASH</SUBACCTSEC><SUBACCTFUND>CASH</SUBACCTFUND><CURRENCY><CURRATE>1.00</CURRATE><CURSYM>USD</CURSYM></CURRENCY>    </INCOME></INVTRANLIST><INVPOSLIST><POSMF><INVPOS><SECID><UNIQUEID>316146356</UNIQUEID><UNIQUEIDTYPE>CUSIP</UNIQUEIDTYPE></SECID><HELDINACCT>CASH</HELDINACCT><POSTYPE>LONG</POSTYPE><UNITS>1.00000</UNITS><UNITPRICE>1.0000000</UNITPRICE><MKTVAL>+00000000001.00</MKTVAL><DTPRICEASOF>20201107033007.000[-5:EST]</DTPRICEASOF><CURRENCY><CURRATE>1.0</CURRATE><CURSYM>USD</CURSYM></CURRENCY></INVPOS></POSMF></INVPOSLIST><INV401K><EMPLOYERNAME>Unavailable</EMPLOYERNAME></INV401K><INV401KBAL><TOTAL></TOTAL><BALLIST><BAL><NAME>Networth</NAME><DESC>The net market value of all long and short positions in the account</DESC><BALTYPE>DOLLAR</BALTYPE><VALUE>1.00</VALUE><CURRENCY><CURRATE>1.000</CURRATE><CURSYM>USD</CURSYM></CURRENCY></BAL></BALLIST></INV401KBAL></INVSTMTRS></INVSTMTTRNRS></INVSTMTMSGSRSV1><SECLISTMSGSRSV1><SECLIST><MFINFO><SECINFO><SECID><UNIQUEID>316146356</UNIQUEID><UNIQUEIDTYPE>CUSIP</UNIQUEIDTYPE></SECID><SECNAME>FIDELITY U.S. BOND INDEX FUND</SECNAME><TICKER>FXNAX</TICKER><UNITPRICE>12.4300000</UNITPRICE><DTASOF>20201107033007.000[-5:EST]</DTASOF><CURRENCY><CURRATE>1.000</CURRATE><CURSYM>USD</CURSYM>  </CURRENCY></SECINFO><MFTYPE>OTHER</MFTYPE><DTYIELDASOF>20201107033007.000[-5:EST]</DTYIELDASOF></MFINFO></SECLIST></SECLISTMSGSRSV1></OFX>

Code to reproduce:

#!/usr/bin/env python

from ofxtools.Parser import OFXTree
parser = OFXTree()
parser.parse("example.ofx")
ofx = parser.convert()

Full error message:

Traceback (most recent call last):
  File "./example.py", line 6, in <module>
    ofx = parser.convert()
  File "/home/aclindsa/.local/lib/python3.8/site-packages/ofxtools/Parser.py", line 132, in convert
    instance = Aggregate.from_etree(self._root)
  File "/home/aclindsa/.local/lib/python3.8/site-packages/ofxtools/models/base.py", line 200, in from_etree
    instance = SubClass._convert(elem)
  File "/home/aclindsa/.local/lib/python3.8/site-packages/ofxtools/models/base.py", line 284, in _convert
    args, kwargs = functools.reduce(update_args, elem, initial)[:2]
  File "/home/aclindsa/.local/lib/python3.8/site-packages/ofxtools/models/base.py", line 268, in update_args
    value = Aggregate.from_etree(elem)
  File "/home/aclindsa/.local/lib/python3.8/site-packages/ofxtools/models/base.py", line 200, in from_etree
    instance = SubClass._convert(elem)
  File "/home/aclindsa/.local/lib/python3.8/site-packages/ofxtools/models/base.py", line 284, in _convert
    args, kwargs = functools.reduce(update_args, elem, initial)[:2]
  File "/home/aclindsa/.local/lib/python3.8/site-packages/ofxtools/models/base.py", line 268, in update_args
    value = Aggregate.from_etree(elem)
  File "/home/aclindsa/.local/lib/python3.8/site-packages/ofxtools/models/base.py", line 200, in from_etree
    instance = SubClass._convert(elem)
  File "/home/aclindsa/.local/lib/python3.8/site-packages/ofxtools/models/base.py", line 284, in _convert
    args, kwargs = functools.reduce(update_args, elem, initial)[:2]
  File "/home/aclindsa/.local/lib/python3.8/site-packages/ofxtools/models/base.py", line 268, in update_args
    value = Aggregate.from_etree(elem)
  File "/home/aclindsa/.local/lib/python3.8/site-packages/ofxtools/models/base.py", line 200, in from_etree
    instance = SubClass._convert(elem)
  File "/home/aclindsa/.local/lib/python3.8/site-packages/ofxtools/models/base.py", line 284, in _convert
    args, kwargs = functools.reduce(update_args, elem, initial)[:2]
  File "/home/aclindsa/.local/lib/python3.8/site-packages/ofxtools/models/base.py", line 257, in update_args
    raise OFXSpecError(f"{clsnm} SubElements out of order: {subels}")
ofxtools.models.base.OFXSpecError: INVSTMTRS SubElements out of order: ['TOTAL', 'BALLIST']

I'm using version 0.8.22 from pip.

aclindsa commented 3 years ago

I forgot to note my hunch - I see in the definition of class INV401KBAL(Aggregate): in invest/stmt.py that it contains the total and ballist elements in the same order as the spec. That combined with the fact that 'INVSTMTRS' was mentioned in the error message instead of 'INV401KBAL' makes me wonder if this is a case of the parser incorrectly finding elements that are nested more deeply than it should be currently considering for the element being parsed.

@csingley What's the best way to confirm/deny this theory (assuming it's even reasonable)?

csingley commented 3 years ago

Hi, thanks for the nice bug report.

Actually there are two problems here:

  1. The class definition for ofxtools.models.invest.INVSTMTRS has the positions of inv401k and inv401kbal out of order w/r/t the spec (which Fidelity is properly following), which is what's throwing the error.
  2. The error message itself is messed up; it (incorrectly) refers to the subelements rather than the actual element which is detected out of order.

Sorry about that. This is easily fixed; I'll see to it.

aclindsa commented 3 years ago

Awesome, glad it's not a difficult bug to fix - thanks!

And I see I should've stopped while I was ahead instead of trying to guess at the cause! :stuck_out_tongue:

csingley commented 3 years ago

Well your hunch was right on the money at least for the bogus error message, which seems to have been blindly copy&pasted from an older version of the Aggregate.from_etree() logic that used recursion instead of functools.reduce().

I just want to get unit tests for this, then I'll check in a fix.

aclindsa commented 3 years ago

I just want to get unit tests for this, then I'll check in a fix.

In case it's useful for testing: You're welcome to check in the example OFX I posted.

csingley commented 3 years ago

I've got more fundamental issues at the moment... like, say, the sample server response from section 13.12 of the OFX v203 spec that puts INV401KBAL before INV401K under INVSTMTRS... which is apparently the sequence I copied, to make that test pass.

Anyway, the schema document (OFX_Investment_Messages.xsd) clearly specifies that INV401K comes before INV401KBAL; the person who wrote that example seems to have screwed it up. I'll pass it up to the OFX Consortium and ask them to fix this in the next version of the spec.

aclindsa commented 3 years ago

I've got more fundamental issues at the moment... like, say, the sample server response from section 13.12 of the OFX v203 spec that puts INV401KBAL before INV401K under INVSTMTRS... which is apparently the sequence I copied, to make that test pass.

Anyway, the schema document (OFX_Investment_Messages.xsd) clearly specifies that INV401K comes before INV401KBAL; the person who wrote that example seems to have screwed it up.

Yikes, I see the discrepancy now. So you still believe Fidelity's OFX is well-formed, correct?

I'll pass it up to the OFX Consortium and ask them to fix this in the next version of the spec.

I can't tell if you're serious or joking over the internet. Have you seen this: https://financialdataexchange.org/FDX/News/Press-Releases/OFX_Joins_FDX.aspx? I'm wondering if there's going to be a next version. (I'd love to discuss this further outside this issue if you're interested. My email can be found on the website in my github profile).

csingley commented 3 years ago

So you still believe Fidelity's OFX is well-formed, correct?

Correct. The OFX spec XML schema definition for INVSTMTRS makes this clear. Also note that the example in OFX section 13.12 contains an additional error... <INV401KSUMMARY> contains an opening <YEARTODATE> tag without a corresponding closing </YEARTODATE> tag. Needless to say this is a gross violation of XML well-formedness; nothing about this example should be taken as authoritative in any way.

Thanks for the heads-up on FDX, I hadn't been aware of it. Seems like bad news for the free software community; they won't even let me register as an individual to view API documentation, because they're trying to "protect their intellectual property". So much for open standards.

You can always drop me a line; working email address for me is in the git log as well as ofxtools.__author_email__.

Thanks again for your help with this bug.