Here's a patch that "fixes" the issue, though I don't fully
understand why it works. The added print also shows that
Beautiful Soup is being fed the whole document.
--- a/ofxparse/ofxparse.py
+++ b/ofxparse/ofxparse.py
@@ -191,8 +191,12 @@ class OfxPreprocessedFile(OfxFile):
tag_name = re.findall(r'(?i)<([a-z0-9_\.]+)>', token)[0]
if tag_name.upper() not in closing_tags:
last_open_tag = tag_name
- new_fh.write(token)
+
+ if not is_processing_tag:
+ new_fh.write(token)
+
new_fh.seek(0)
+ print new_fh.getvalue()
self.fh = new_fh
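
For anyone reading along, here is a rough standalone sketch of what the change appears to do: copy every token to the output except processing instructions. It assumes `is_processing_tag` is true for tokens that start with `<?`, which the diff itself does not show. The regex, the function name `drop_processing_instructions`, and the sample string are made up for illustration and are not ofxparse code. It is written for Python 3, unlike the patch above.

```python
import io
import re

# Assumption: a token is either a processing instruction (<?...?>),
# a plain open/close tag, or the text between them.
TOKEN_RE = re.compile(r'(<\?.*?\?>|</?[A-Za-z0-9_.]+>)', re.DOTALL)


def drop_processing_instructions(text):
    """Hypothetical helper: rewrite text, skipping '<?...?>' tokens."""
    out = io.StringIO()
    for token in TOKEN_RE.split(text):
        # Assumed meaning of the flag used in the patch.
        is_processing_tag = token.startswith('<?')
        if not is_processing_tag:
            out.write(token)
    return out.getvalue()


sample = '<?xml version="1.0"?><?OFX OFXHEADER="200"?><OFX><CODE>0</CODE></OFX>'
cleaned = drop_processing_instructions(sample)
print(cleaned)
```

If the assumption holds, the parser would only ever see the `<OFX>...</OFX>` body, which might be why the patch makes the problem go away.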
Here is a sanitized document that exhibits the behaviour.
I suspect this is Beautiful Soup being rubbish.
Here is my version of Beautiful Soup.