traut / feeds2fb2

Convert RSS/Atom feeds into fb2 book

Bad HTML #5

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Create an OPML file with one feed: http://hegtor.livejournal.com/data/rss
2. Run the application with -t opml --enable-pics --book-per-feed --disable-zip
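For anyone reproducing this, a minimal sketch of building such a one-feed OPML file in Python 3 follows. The `build_opml` helper and the `title` value are hypothetical, and the `type`/`xmlUrl` attributes are the conventional OPML ones; whether feeds2fb2 needs any further attributes is an assumption not confirmed by this report.

```python
import xml.etree.ElementTree as ET

FEED_URL = "http://hegtor.livejournal.com/data/rss"  # feed from the report


def build_opml(feed_urls):
    """Return an OPML document (bytes) with one <outline> per feed URL."""
    opml = ET.Element("opml", version="1.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = "feeds"  # hypothetical title
    body = ET.SubElement(opml, "body")
    for url in feed_urls:
        # type/xmlUrl are the conventional attributes for a feed subscription
        ET.SubElement(body, "outline", type="rss", xmlUrl=url)
    return ET.tostring(opml, encoding="utf-8")


opml_bytes = build_opml([FEED_URL])
print(opml_bytes.decode("utf-8"))
```

Saving the printed document to a file gives the input for step 2.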

What is the expected output? 
Running xmllint --format on the generated file should print well-formed, indented XML.
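The same expectation can be checked without xmllint. The sketch below is a rough Python 3 stand-in using only the standard library; `pretty_if_well_formed` is a hypothetical helper name, and expat's messages differ in wording from the libxml2 messages quoted below.

```python
import xml.dom.minidom
from xml.parsers.expat import ExpatError


def pretty_if_well_formed(xml_text):
    """Return (True, indented XML) or (False, parser message)."""
    try:
        dom = xml.dom.minidom.parseString(xml_text)
    except ExpatError as err:
        # str(err) already carries the line and column of the first problem
        return False, str(err)
    return True, dom.toprettyxml(indent="  ")


ok, result = pretty_if_well_formed("<section><p>text</p></section>")
print(ok)
print(result)
```

A file that passes this check should also pass xmllint's well-formedness check, although neither says anything about validity against the FB2 schema.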

What do you see instead?
feeds/Unlabeled%20-%20hegtor.fb2:41: parser error : EntityRef: expecting ';'
rong></p><p><p href="http://www.cdep.ru/statistics.asp?search_frm_auto=1&dept_id
^
feeds/Unlabeled%20-%20hegtor.fb2:44: parser error : Opening and ending tag mismatch: section line 43 and p
офессионализма, а не высокого искусства</p></p>
^
feeds/Unlabeled%20-%20hegtor.fb2:45: parser error : Opening and ending tag mismatch: body line 25 and section
</section>
          ^
feeds/Unlabeled%20-%20hegtor.fb2:50: parser error : Opening and ending tag mismatch: section line 49 and p
дио слайд шоу. На этот раз голос не мой.</p></p></p>
^
feeds/Unlabeled%20-%20hegtor.fb2:51: parser error : Opening and ending tag mismatch: FictionBook line 2 and section
</section>
          ^
feeds/Unlabeled%20-%20hegtor.fb2:52: parser error : Extra content at the end of the document
<section>
^
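The first error (line 41) looks like a raw `&` from the feed's URL being copied into an attribute value, which XML reads as the start of an entity reference. A minimal Python 3 sketch of escaping such values with the standard library follows. The `<p href=...>` shape only mirrors the log line, and the URL is reproduced only as far as the log shows it; how feeds2fb2 actually builds its output is not known from this report.

```python
from xml.sax.saxutils import escape, quoteattr

# URL as it appears (truncated) in the parser error above
url = "http://www.cdep.ru/statistics.asp?search_frm_auto=1&dept_id"

# quoteattr() escapes &, < and > and adds the surrounding quotes
fragment = "<p href=%s>link</p>" % quoteattr(url)
print(fragment)

# escape() does the same job for text between tags
text = escape("a & b < c")
print(text)
```

Escaping fixes only the EntityRef error; the tag mismatches reported on the later lines are a separate problem of unbalanced markup in the feed content.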

What version of the product are you using? On what operating system?
4.1.3 on Linux, Python 2.6.1

Please provide any additional information below.

Original issue reported on code.google.com by ivan.kov...@gmail.com on 31 Mar 2009 at 6:25

GoogleCodeExporter commented 9 years ago
Maybe the application could use the Tidy library (with the Python wrapper at
http://utidylib.berlios.de/)?

Original comment by ivan.kov...@gmail.com on 31 Mar 2009 at 6:26
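As a rough illustration of the kind of cleanup being suggested, here is a standard-library-only Python 3 sketch. It is not Tidy and does not use uTidylib; it is a much simpler stand-in that re-emits feed HTML with escaped text, quoted attributes, and balanced tags. The names `Balancer`, `balance`, and `VOID_TAGS` are hypothetical, and the report itself targets Python 2.6, where the module layout differs.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

VOID_TAGS = {"br", "hr", "img"}  # elements that never take a closing tag


class Balancer(HTMLParser):
    """Re-emit HTML as well-formed XML markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []    # output fragments
        self.stack = []  # names of currently open elements

    def _emit_open(self, tag, attrs, self_close):
        parts, seen = [tag], set()
        for name, value in attrs:
            if name in seen:  # XML forbids duplicate attributes
                continue
            seen.add(name)
            value = name if value is None else value
            parts.append("%s=%s" % (name, quoteattr(value)))
        self.out.append("<%s%s>" % (" ".join(parts), "/" if self_close else ""))

    def handle_starttag(self, tag, attrs):
        void = tag in VOID_TAGS
        self._emit_open(tag, attrs, void)
        if not void:
            self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        self._emit_open(tag, attrs, True)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray closing tag: drop it
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append("</%s>" % open_tag)
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data))


def balance(html):
    """Return html with every opened element closed and all text escaped."""
    parser = Balancer()
    parser.feed(html)
    parser.close()
    while parser.stack:  # close whatever is still open at the end
        parser.out.append("</%s>" % parser.stack.pop())
    return "".join(parser.out)


broken = ('<p><strong>bold</p><p href="http://www.cdep.ru/statistics.asp'
          '?search_frm_auto=1&dept_id">text<br></section>')
print(balance(broken))
```

This guarantees only well-formedness of the fragment, not that the result is valid FB2 (nested `<p>` elements, for instance, pass through unchanged). A real Tidy pass would handle many more cases.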