flycutter-zfz / rfc2kindle

Automatically exported from code.google.com/p/rfc2kindle

Some RFC text incorrectly converted to figures #2

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. Run rfc2kindle/rfc2mobi to convert RFC 4462 (and presumably other RFCs).
2. Run ls ./rfc4462. The listing includes two JPEGs besides auths.jpg even 
though the RFC contains no figures. The images contain text, which does not 
flow or resize with the rest of the text in the document.

What is the expected output? What do you see instead?
Expected: Only one HTML file (rfc4462.html) and one authors page image 
(auths.jpg).
Seen instead: One HTML file (rfc4462.html), two images containing text rather 
than figures (img1.jpg, img2.jpg), and one authors page image (auths.jpg).

What version of the product are you using? On what operating system?
OS: Fedora Linux 16 x86_64
rfc2kindle: Latest available via SVN as of 12 February 2012

Please provide any additional information below.
I don't have a patch for this one, but I think the culprit may be the logic in 
html.py, line 218 (isFigureLine = lambda i:....), which is confused in one spot 
in RFC 4462 by the page break formatting, and in another spot because of the 
text:
'1.  C sends "min || n || max" to S, indicating the minimal acceptable'
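
To illustrate the second case, here is a minimal sketch in Python. The actual lambda at line 218 is not reproduced in this report, so looks_like_figure is a hypothetical stand-in, and its thresholds are assumed to match the character counts used in the replacement function posted in the follow-up comment.

```python
# Hypothetical stand-in for the figure heuristic in html.py.
# Assumption: the thresholds mirror the character counts in the
# replacement function from the follow-up comment.
def looks_like_figure(line):
    return (
        line.count('---') > 0
        or line.count('|') > 1
        or line.count('+') > 1
        or line.count('>') > 3
        or line.count('<') > 3
    )

# The list item from RFC 4462 quoted above.
line = '1.  C sends "min || n || max" to S, indicating the minimal acceptable'

# The two "||" operators contribute four pipe characters, which is
# enough to cross the "more than one pipe" threshold.
print(line.count('|'))
print(looks_like_figure(line))
```

Under those assumptions the pipe check alone is enough to classify this ordinary list item as part of a figure.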

Original issue reported on code.google.com by proverbs...@gmail.com on 13 Feb 2012 at 6:09

GoogleCodeExporter commented 8 years ago
Here's some code to prevent unordered and ordered lists from being treated as 
figures. This resolves this issue (at least in the case of RFC 4462). I 
replaced isFigureLine's declaration as a lambda expression in html.py with the 
following code:

import re

def isFigureLine(line):
    # Ordered ("1. ") and unordered ("o ") list items are never figures
    if re.match(r'^\s*\d\.\s+', line) or re.match(r'^\s*o\s+', line):
        return False
    # A run of dashes suggests a box edge or an arrow
    if line.count('---') > 0:
        return True
    # Repeated box or arrow characters suggest ASCII art
    if line.count('|') > 1:
        return True
    if line.count('+') > 1:
        return True
    if line.count('>') > 3:
        return True
    if line.count('<') > 3:
        return True

    return False
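
A quick way to check the behaviour is the sketch below. It is a compact rewrite for illustration, not the code committed to html.py: the name is_figure_line and the LIST_ITEM pattern are hypothetical, every sample line except the RFC 4462 one is made up, and widening \d to \d+ (so that items numbered 10 and above are also skipped) is an assumption that goes beyond the patch above. It uses the str.count method, which behaves the same in Python 2 and Python 3.

```python
import re

# Hypothetical pattern: an ordered item ("1. ", "12. ") or an
# unordered item ("o "), optionally indented.
# Assumption: \d+ instead of \d, so multi-digit item numbers match too.
LIST_ITEM = re.compile(r'\s*(\d+\.|o)\s+')

def is_figure_line(line):
    # List items are never treated as figure lines.
    if LIST_ITEM.match(line):
        return False
    # Otherwise fall back to the same character-count thresholds.
    return (
        line.count('---') > 0
        or line.count('|') > 1
        or line.count('+') > 1
        or line.count('>') > 3
        or line.count('<') > 3
    )

samples = [
    '1.  C sends "min || n || max" to S, indicating the minimal acceptable',
    '   o  a bullet that mentions a || b',   # made-up unordered item
    '   +--------+       +--------+',        # made-up ASCII-art box edge
    '10. S replies with "p || g"',           # made-up multi-digit item
]

for s in samples:
    print(is_figure_line(s), s)
```

Only the box-edge line should be reported as a figure line. With the single-digit pattern from the patch, the last sample would still be misclassified.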

Original comment by proverbs...@gmail.com on 14 Feb 2012 at 6:22