The code I provided does not account for text inside <blockquote> blocks, as in
the case of RFC 4463. Here's a cleaner fix (worked for me):
Create a sanitizing function containing the code I suggested earlier, and call
it when needed. E.g., append this to html.py:
def sanitizeSpecChars(line):
    # Sanitize &, <, and > (escape & first so the entities below are not re-escaped)
    line=re.sub('&','&amp;',line)
    line=re.sub('<','&lt;',line)
    line=re.sub('>','&gt;',line)
    return line
Then call it inside the "for i in outputlines" loop in outputTextBlock(self),
right before the call to self.output.write(), e.g.:
for i in outputlines:
    i=sanitizeSpecChars(i) # here
    self.output.write(...
And call it right before the regex matching logic in writeContent(self, line),
e.g.:
if isRFCPageBreaker(line):
    return getattr(self, "writeContent")
line=sanitizeSpecChars(line) # here
re.match(r'^d+\.?\s.*',...
It would probably make sense to sanitize the ToC and Abstract too, but I haven't
done that yet.
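
For anyone who wants to check the escaping on its own before patching html.py, here is a minimal standalone sketch. It assumes nothing about the rest of html.py, and it uses str.replace instead of re.sub (my substitution, since the patterns are plain characters). The sample line is made up for illustration and is not taken from any RFC.

```python
def sanitizeSpecChars(line):
    # Escape & first: the replacements for < and > introduce new
    # ampersands, which must not be escaped a second time.
    line = line.replace('&', '&amp;')
    line = line.replace('<', '&lt;')
    line = line.replace('>', '&gt;')
    return line


# Hypothetical input line, only for illustration.
sample = 'if a < b && b > c'
print(sanitizeSpecChars(sample))  # prints: if a &lt; b &amp;&amp; b &gt; c
```

The order matters: escaping & last would turn the freshly written &lt; into &amp;lt;.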
Original comment by proverbs...@gmail.com
on 14 Feb 2012 at 7:26
Original issue reported on code.google.com by
proverbs...@gmail.com
on 13 Feb 2012 at 5:48