Open shreevatsa opened 6 months ago
Ouch, notice this giant "line" (because of the X):
(Google OCR result for the same page as previous comment; the screenshot in previous comment was from Tesseract)
Now that we have individual words in `line.attrs`, we could in principle re-form lines. (What to do with changed text though?)
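A rough sketch of what re-forming lines from words might look like, assuming each word carries a bounding box. The `Word` fields and the `regroup_words` helper are hypothetical names for illustration, not what `line.attrs` actually stores, and the sketch does not address the changed-text question:

```python
from dataclasses import dataclass


@dataclass
class Word:
    # Hypothetical shape; the real word attrs may differ.
    text: str
    x: float       # left edge of the bounding box
    y: float       # top edge of the bounding box
    height: float  # height of the bounding box


def regroup_words(words, tolerance=0.5):
    """Group words into lines by how close their vertical centres are.

    A word joins the current line if its centre is within
    `tolerance` * (average word height of that line) of the line's
    average centre; otherwise it starts a new line.
    """
    lines = []
    for word in sorted(words, key=lambda w: w.y + w.height / 2):
        centre = word.y + word.height / 2
        if lines:
            last = lines[-1]
            last_centre = sum(w.y + w.height / 2 for w in last) / len(last)
            last_height = sum(w.height for w in last) / len(last)
            if abs(centre - last_centre) <= tolerance * last_height:
                last.append(word)
                continue
        lines.append([word])
    # Within each line, order words left to right.
    return [sorted(line, key=lambda w: w.x) for line in lines]
```

A stray tall box (like the X above) would still be assigned to a single line by its centre, so a heuristic like this would probably need a manual override in any case.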
This line is because of the various dots I think:
So instead of something operating on `line.attrs.words`, it may be better to just implement splitting manually (#8). (Also because we may get rid of `line.attrs.words`: #21.)
Sometimes a `line` can span multiple lines, and right now there's no way to insert a line break within a line. We could either:
- Let `line` be a `<p>`, so that it can contain line breaks internally.