csingley / ofxtools

Python OFX Library
Other
301 stars 68 forks source link

STMTRS SubElements out of order #78

Closed dlernstrom closed 4 years ago

dlernstrom commented 4 years ago

Hello,

Thank you for a fantastic package! It's been working great for me, but suddenly started failing with a STMTRS SubElements out of order error. I don't know whether my bank (AFCU) changed something with the content they are sending down, but the content does not parse successfully anymore.

After some analysis of the OFX parser code, I see there is a sequence check and validation - though I couldn't quite understand how the operands were being evaluated. I don't know whether one of these fields (['AVAILBAL', 'BANKACCTFROM', 'BANKTRANLIST', 'CURDEF', 'LEDGERBAL']) shouldn't have been included or whether it appears in the wrong order.

Ultimately I'm looking to understand whether this is: A) a configuration error on my end in how I set up OFXTools, B) A problem with my bank, or C) a problem with OFXTools. If you think it's a problem with OFXTools, I'd love the opportunity to work up a PR and submit it. I saw the notes on the unit testing, coverage, etc. I did upgrade my install already from the current PyPI release to the current tip of the master branch and the issue still remained.

Cheers, David

2019-12-31 18:55:06,834 base.py [257 ] DEBUG Args to instantiate STMTRS: (('availbal', <AVAILBAL(balamt=Decimal('99999.99'), dtasof=datetime.datetime(2019, 12, 31, 18, 55, 5, tzinfo=))>), ('bankacctfrom', <BANKACCTFROM(bankid='123456789', acctid='12345678', accttype='CHECKING')>), ('banktranlist', ), ('curdef', 'USD'), ('ledgerbal', <LEDGERBAL(balamt=Decimal('99999.99'), dtasof=datetime.datetime(2019, 12, 31, 18, 55, 5, tzinfo=))>)) 2019-12-31 18:55:06,856 log.py [228 ] ERROR Internal Server Error: /sync/

Traceback (most recent call last):
  File "/home/dlernstrom/.virtualenvs/money/lib/python3.7/site-packages/django/core/handlers/exception.py", line 34, in inner
    response = get_response(request)
  File "/home/dlernstrom/.virtualenvs/money/lib/python3.7/site-packages/django/core/handlers/base.py", line 115, in _get_response
    response = self.process_exception_by_middleware(e, request)
  File "/home/dlernstrom/.virtualenvs/money/lib/python3.7/site-packages/django/core/handlers/base.py", line 113, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/home/dlernstrom/.virtualenvs/money/lib/python3.7/site-packages/django/contrib/auth/decorators.py", line 21, in _wrapped_view
    return view_func(request, *args, **kwargs)
  File "/home/dlernstrom/money/money/views.py", line 39, in sync
    ofx = parser.convert()
  File "/home/dlernstrom/.virtualenvs/money/src/ofxtools/ofxtools/Parser.py", line 132, in convert
    instance = Aggregate.from_etree(self._root)
  File "/home/dlernstrom/.virtualenvs/money/src/ofxtools/ofxtools/models/base.py", line 198, in from_etree
    instance = SubClass._convert(elem)
  File "/home/dlernstrom/.virtualenvs/money/src/ofxtools/ofxtools/models/base.py", line 255, in _convert
    args_, specIndices = zip(*[extractArgs(subelem) for subelem in elem])
  File "/home/dlernstrom/.virtualenvs/money/src/ofxtools/ofxtools/models/base.py", line 255, in <listcomp>
    args_, specIndices = zip(*[extractArgs(subelem) for subelem in elem])
  File "/home/dlernstrom/.virtualenvs/money/src/ofxtools/ofxtools/models/base.py", line 239, in extractArgs
    value = Aggregate.from_etree(elem)
  File "/home/dlernstrom/.virtualenvs/money/src/ofxtools/ofxtools/models/base.py", line 198, in from_etree
    instance = SubClass._convert(elem)
  File "/home/dlernstrom/.virtualenvs/money/src/ofxtools/ofxtools/models/base.py", line 255, in _convert
    args_, specIndices = zip(*[extractArgs(subelem) for subelem in elem])
  File "/home/dlernstrom/.virtualenvs/money/src/ofxtools/ofxtools/models/base.py", line 255, in <listcomp>
    args_, specIndices = zip(*[extractArgs(subelem) for subelem in elem])
  File "/home/dlernstrom/.virtualenvs/money/src/ofxtools/ofxtools/models/base.py", line 239, in extractArgs
    value = Aggregate.from_etree(elem)
  File "/home/dlernstrom/.virtualenvs/money/src/ofxtools/ofxtools/models/base.py", line 198, in from_etree
    instance = SubClass._convert(elem)
  File "/home/dlernstrom/.virtualenvs/money/src/ofxtools/ofxtools/models/base.py", line 255, in _convert
    args_, specIndices = zip(*[extractArgs(subelem) for subelem in elem])
  File "/home/dlernstrom/.virtualenvs/money/src/ofxtools/ofxtools/models/base.py", line 255, in <listcomp>
    args_, specIndices = zip(*[extractArgs(subelem) for subelem in elem])
  File "/home/dlernstrom/.virtualenvs/money/src/ofxtools/ofxtools/models/base.py", line 239, in extractArgs
    value = Aggregate.from_etree(elem)
  File "/home/dlernstrom/.virtualenvs/money/src/ofxtools/ofxtools/models/base.py", line 198, in from_etree
    instance = SubClass._convert(elem)
  File "/home/dlernstrom/.virtualenvs/money/src/ofxtools/ofxtools/models/base.py", line 262, in _convert
    raise OFXSpecError(f"{clsnm} SubElements out of order: {subels}")
ofxtools.models.base.OFXSpecError: STMTRS SubElements out of order: ['AVAILBAL', 'BANKACCTFROM', 'BANKTRANLIST', 'CURDEF', 'LEDGERBAL']
csingley commented 4 years ago

It's a little spaghetti-ish because of the model class inheritance.

So at this stage, the parser has constructed a hierarchical tree of xml.etree.ElementTree.Element instances, and passed it as the argument to ofxtools.models.base.Aggregate.from_etree(). That function recursively walks the tree, and looks up each tag in the ofxtools.models namespace.

Your data is blowing up during instantiation of a ofxtools.models.bank.stmt.STMTRS instance. Part of the data validation (inherited from ofxtools.models.base.Aggregate._convert()) checks that child Elements occur in the same order as the class attribute definition (i.e. lines 242-251).

The error is telling you that your STMTRS has children in this order: ['AVAILBAL', 'BANKACCTFROM', 'BANKTRANLIST', 'CURDEF', 'LEDGERBAL']. According to the class definition, these should actually occur in the following order: ['CURDEF', 'BANKACCTFROM', 'BANKTRANLIST', 'LEDGERBAL', 'AVAILBAL']. The sequence causes the parser to reject the conversion, because it violates the OFX specification.

You should check your source data to make sure t.hat the XML elements actually appear in the claimed order.

From the ofxtools side... the reference in the class docstring to the OFX spec is erroneous; we're actually looking for section 11.4.1.2, not section 11.4.2.2. I should fix that (along with all the adjacent class docstrings, which seem to have been incorrectly copy/pasted).

Anyway, looking at the correct section of the OFX spec (i.e. 11.4.1.2), it looks like ofxtools.models.bank.stmt.STMTRS has the sequence correct. So either your FI is sending illegal OFX data, or I've got a serious bug in the functions under oftools.models.base.Aggregate.from_etree().

Let me know which one it is!

dlernstrom commented 4 years ago

@csingley Thank you very much for your timely and thoughtful response. I really appreciate you taking the time to outline in extensive detail what is occurring here. A few thoughts:

I immediately dove into the actual OFX document that the FI provided and you were certainly correct - the elements were provided in what appears to be an alphabetical ordering as opposed to the specification order you explained. A condensed and slightly redacted illustration is below (for anybody else that may be also learning from this exchange):

<STMTRS>
    <AVAILBAL>
        <BALAMT>999999.99
        <DTASOF>20191231185503.000[0:GMT]
    </AVAILBAL>
    <BANKACCTFROM>
        <BANKID>123456789
        <ACCTID>123456789
        <ACCTTYPE>CHECKING
    </BANKACCTFROM>
    <BANKTRANLIST>
        <DTSTART>20191203120000.000[0:GMT]
        <DTEND>20191231185500.202[0:GMT]
    </BANKTRANLIST>
    <CURDEF>USD
    <LEDGERBAL>
        <BALAMT>999999.99
        <DTASOF>20191231185503.000[0:GMT]
    </LEDGERBAL>
</STMTRS>

In an effort to also exhaust my learning of how the OFX specification is written, I also hit the spec to validate for myself that the ordering is explicitly stated. I called up the version 2.2 specification, which was released Nov 26, 2017. Inside version 2.2, the section on STMTRS was listed as 11.4.2.2, which was how your notes were originally written. In your message, you advised that the notes were incorrect and the STMTRS section was 11.4.1.2, so there may be an inconsistency from one version to the next on where that section is defined.

Inside the 2.2 specification, I could not find any reference that indicated that the nodes needed to be in order. This was frustrating to me... until I saw they had a downloadable XSD/DTD. Inside the XSD, it explicitly defines the sequence, which is incontestably ordered:

  <xsd:complexType name="StatementResponse">
    <xsd:annotation>
      <xsd:documentation>
        The OFX element "STMTRS" is of type "StatementResponse"
      </xsd:documentation>
    </xsd:annotation>

    <xsd:sequence>
      <xsd:element name="CURDEF" type="ofx:CurrencyEnum"/>
      <xsd:element name="BANKACCTFROM" type="ofx:BankAccount"/>
      <xsd:element name="BANKTRANLIST" type="ofx:BankTransactionList" minOccurs="0"/>
      <xsd:element name="BANKTRANLISTP" type="ofx:PendingTransactionList" minOccurs="0" maxOccurs="1"/>
      <xsd:element name="LEDGERBAL" type="ofx:LedgerBalance"/>
      <xsd:element name="AVAILBAL" type="ofx:AvailableBalance" minOccurs="0"/>
      <xsd:element name="CASHADVBALAMT" type="ofx:AmountType" minOccurs="0"/>
      <xsd:element name="INTRATE" type="ofx:RateType" minOccurs="0" maxOccurs="1"/>
      <xsd:element name="BALLIST" type="ofx:BalanceList" minOccurs="0"/>
      <xsd:element name="MKTGINFO" type="ofx:InfoType" minOccurs="0"/>
    </xsd:sequence>
  </xsd:complexType>

That being said, I really have 2 pathforwards: 1) I will revert a message to the FI to bring it to their attention that the OFX payload is invalid as per the spec, 2) I will also mitigate the impact by disabling via fork or local modification the ordering validation until (hopefully) a more appropriate longer term solution is implemented with the FI.

Lastly, my mind is still blown by the notion of using instance method definition order to provide a mechanism for validating order of the elements. Although the code is brilliant in its cleverness, I do find it unintuitive. I am a fairly seasoned Python dev and had never even considered defining the order that methods are defined within code as being anything important. (I had to really tug at the cls.spec and the definition of spec and _filter_attrs in the base class definition for Aggregate in models/base.py to really understand what was going on).

Again, brilliant piece of code here and thank you so much for your efforts and investment into the community.

Cheers, David

csingley commented 4 years ago

Yeah, if you look at the DTD it is unambiguous.

It should be pretty easy to disable this particular sanity check. I suppose I could provide a kwarg to make that behavior a user option, but for reasons you can probably imagine, I am reluctant to do so. Your FI's server implementation is really fairly egregious, and you now should have fairly good documentation for a bug report (good luck with that, ha ha!)

You're right about the section numbering... I happened to have the OFXv2.0.3 spec lying at hand when responding to your bug, while obviously I used the OFXv2.2 spec when writing the class docstrings. The 2.2 spec should be considered canonical at this point; I'll leave the docstrings as they are.

Footnote on the implementation: given the huge number of models I needed to write and check, it's always been a basic requirement of the ofxtools API that the ergonomics of mapping from the OFX spec to the model classes be as simple, natural, and straightforward as possible. Given the ordered sequencing in OFX messages, it's always been the case that the order of ofxtools model class attributes has been significant... we originally had to write custom logic to make this happen in Python 2, which used unordered dicts for class instances.

So I've followed the evolution of OrderedDicts in Python 3 (a somewhat obscure topic) perhaps more closely than the average bear. I believe it was Python 3.6 that started using OrderedDicts for classes, and in Python 3.7 the behavior switched to all dicts remembering insertion order. I cheered lustily, and removed perhaps a hundred lines of code & associated tests.

I understand if the result now seems rather arcane... the ordering logic used to be very explicit, but now is almost totally implicit. However, since I'm relying on a hard guarantee of the language (i.e. dicts retain insertion order) I feel comfortable with it. Probably this could be better documented; I'm very open to PRs to improve code comments. Probably also the contortions involving the Aggregate.spec() could be simplified somewhat; to a certain extent they still reflect legacy structure of ordering logic that has since been deleted.

But at heart, I am actually quite fond of the unintuitive modeling here. I guess it's just that it prioritizes the desires of model-writers over those of code-readers, and my own bias here is unsurprising. Sorry for the trouble.

csingley commented 4 years ago

P.S. if you want to see an even more extreme example of how much I love this particular brand of unintuitive cleverness, you can check out the parser and corresponding model classes I wrote for Interactive Brokers XML downloads. The model classes look like this:

@dataclass(frozen=True)
class ConversionRate(FlexElement):
    """ Wrapped in <ConversionRates> """

    reportDate: Optional[datetime.date] = None
    fromCurrency: Optional[str] = None
    toCurrency: Optional[str] = None
    rate: Optional[Decimal] = None

The parser actually instrospecfs type hints out of the model class in order to dispatch the XML data to corresponding type conversion functions. To me, this seems really rather elegant... certainly much better than ofxtools.Types (which is really pretty nasty & janky, no matter how desperately I pretend to sophistication by shoehorning in functools.singledispatch)

I'm afraid my disease may be incurable.