Closed kelson42 closed 9 years ago
Thx, I have restarted an export, feedback probably not before tomorrow.
Still crashes at the same book, but the error is slightly different.
I am reopening this ticket with a new, similar crash:
Exporting Book #13103.
Exporting to static/Great Britain and Her Queen.13103.html
Copying companion file to 13103_062 William Whewell, DD.jpg
Copying /media/data/gutenberg/static/13103_062 William Whewell, DD.jpg
Copying companion file to 13103_037 Lord Palmerston.jpg
Copying /media/data/gutenberg/static/13103_037 Lord Palmerston.jpg
Copying companion file to 13103_004 Kensington Palace.jpg
Copying /media/data/gutenberg/static/13103_004 Kensington Palace.jpg
Copying companion file to 13103_001 Queen Victoria.jpg
Copying /media/data/gutenberg/static/13103_001 Queen Victoria.jpg
Copying companion file to 13103_061 Thomas Carlyle.jpg
Copying /media/data/gutenberg/static/13103_061 Thomas Carlyle.jpg
Copying companion file to 13103_032 Sir John Lawrence.jpg
Copying /media/data/gutenberg/static/13103_032 Sir John Lawrence.jpg
Copying companion file to 13103_053 Robert Southey.jpg
Copying /media/data/gutenberg/static/13103_053 Robert Southey.jpg
Copying companion file to 13103_040 The Mausoleum.jpg
Copying /media/data/gutenberg/static/13103_040 The Mausoleum.jpg
Copying companion file to 13103_092 Wesley preaching on his father's tomb.jpg
Copying /media/data/gutenberg/static/13103_092 Wesley preaching on his father's tomb.jpg
Traceback (most recent call last):
File "./dump-gutenberg.py", line 150, in <module>
Thanks, new test run started.
It seems your last fix has introduced a regression (it now crashes earlier, at #2810):
Exporting Book #2809.
Exporting to static/Main-Travelled Roads.2809.html
Copying format file to Main-Travelled Roads.2809.epub
Creating ePUB at /tmp/tmpHetUAn.epub
Exporting to static/Main-Travelled Roads_cover.2809.html
Exporting Book #2810.
Exporting to static/Plunkitt of Tammany Hall: a series of very plain talks on very practical politics, delivered by ex-Senator George Washington Plunkitt, the Tammany philosopher, from his rostrum—the New York County court house bootblack stand; Reco.2810.html
Copying format file to Plunkitt of Tammany Hall: a series of very plain talks on very practical politics, delivered by ex-Senator George Washington Plunkitt, the Tammany philosopher, from his rostrum—the New York County court house bootblack stand; Reco.2810.epub
Creating ePUB at /tmp/tmpT0QPPl.epub
Traceback (most recent call last):
File "./dump-gutenberg.py", line 150, in <module>
It seems that truncating fname to 210 characters instead of 230 fixes the bug. But I don't understand why your last commit causes this regression, so I'll let you have a look.
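For reference, a minimal sketch of truncating by encoded byte length rather than by character count. It assumes the crash comes from the common 255-byte limit on a single filename on Linux filesystems, which the thread does not confirm; `truncate_filename` and its parameters are hypothetical names, not code from the repository.

```python
def truncate_filename(stem, suffix, max_bytes=255, encoding="utf-8"):
    """Shorten stem so that stem + suffix fits in max_bytes once encoded.

    Hypothetical helper: the 255-byte default is an assumption about the
    target filesystem, not something taken from export.py.
    """
    suffix_bytes = suffix.encode(encoding)
    budget = max_bytes - len(suffix_bytes)
    if budget <= 0:
        raise ValueError("suffix alone exceeds the byte limit")
    # Cut the encoded bytes, then decode again. errors="ignore" drops a
    # multi-byte character that was split by the cut.
    shortened = stem.encode(encoding)[:budget].decode(encoding, errors="ignore")
    return shortened + suffix


title = "Plunkitt of Tammany Hall: a series of very plain talks " * 10
name = truncate_filename(title, ".2810.html")
print(len(name.encode("utf-8")))
```

Counting bytes matters for titles with non-ASCII characters, since those take more than one byte each in UTF-8, and any escaping applied afterwards would make the name longer again.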
The fix seems to be:
Command-line arguments seem to be quoted with single quotes, so only that character should be escaped. Otherwise you add many characters to the filename (and cause other trouble). Please confirm.
I have committed and am closing this; I'm pretty sure this is the right solution.
Still crashing with a quoting issue:
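One way to sidestep manual escaping, sketched under the assumption that the external command is launched from Python: pass the arguments as a list so that no shell parses them, and use `shlex.quote` only when a single shell string is really needed. `echo_argument` is a hypothetical helper for illustration; the thread does not show how export.py invokes its commands.

```python
import shlex
import subprocess
import sys


def echo_argument(value):
    """Run a child Python process that writes its first argument back."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; sys.stdout.write(sys.argv[1])", value],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


fname = "13103_092 Wesley preaching on his father's tomb.jpg"

# With a list of arguments there is no shell, so spaces and the apostrophe
# reach the child process untouched and nothing needs a backslash.
print(echo_argument(fname) == fname)

# If a shell command string is unavoidable, shlex.quote produces one token
# that a POSIX shell reads back as the original filename.
print(shlex.quote(fname))
```

With this approach the filename on disk never gains extra characters, because the quoting exists only in the command string and not in the name itself.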
Copying /media/data/gutenberg/static/13103_062 William Whewell, DD.jpg
Copying companion file to 13103_037 Lord Palmerston.jpg
Copying /media/data/gutenberg/static/13103_037 Lord Palmerston.jpg
Copying companion file to 13103_004 Kensington Palace.jpg
Copying /media/data/gutenberg/static/13103_004 Kensington Palace.jpg
Copying companion file to 13103_001 Queen Victoria.jpg
Copying /media/data/gutenberg/static/13103_001 Queen Victoria.jpg
Copying companion file to 13103_061 Thomas Carlyle.jpg
Copying /media/data/gutenberg/static/13103_061 Thomas Carlyle.jpg
Copying companion file to 13103_032 Sir John Lawrence.jpg
Copying /media/data/gutenberg/static/13103_032 Sir John Lawrence.jpg
Copying companion file to 13103_053 Robert Southey.jpg
Copying /media/data/gutenberg/static/13103_053 Robert Southey.jpg
Copying companion file to 13103_040 The Mausoleum.jpg
Copying /media/data/gutenberg/static/13103_040 The Mausoleum.jpg
Copying companion file to 13103_092 Wesley preaching on his father's tomb.jpg
Copying /media/data/gutenberg/static/13103_092 Wesley preaching on his father's tomb.jpg
Traceback (most recent call last):
File "./dump-gutenberg.py", line 154, in <module>
Ah, you introduced that one. It doesn't fail with my previous commit.
On Wed, Oct 8, 2014 at 1:43 PM, Kelson notifications@github.com wrote:
Reopened #23 https://github.com/kiwix/gutenberg/issues/23.
The problem with this patch of yours is that it was adding backslashes before spaces (and, I guess, also before double quotes) in all EPUB filenames, for example.
I don't know what the solution is, but could it be that the way the command-line arguments are quoted depends on the content (single quotes most of the time, but double quotes from time to time)?
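A possible explanation, offered as an assumption since the logged command itself is not shown in this thread: if the command is printed as a Python list, the quotes come from `repr()`, which picks single quotes by default and switches to double quotes when the string contains an apostrophe. The short sketch below shows that behaviour with two filenames from the log.

```python
plain = "13103_037 Lord Palmerston.jpg"
apostrophe = "13103_092 Wesley preaching on his father's tomb.jpg"

# repr() chooses the quote style per string, based on its content.
print(repr(plain))        # wrapped in single quotes
print(repr(apostrophe))   # wrapped in double quotes because of the '

# Printing a list of arguments shows each element through repr().
print([plain, apostrophe])
```

If that is what is happening, the quotes are only how the list is displayed and are not part of the arguments, so escaping characters in the filename to match them would not be needed.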
I think I have fixed that bug: https://github.com/kiwix/gutenberg/commit/3343ddef7d0e3091a69df63e5e85fa5826099b20
Traceback (most recent call last):
File "./dump-gutenberg.py", line 150, in <module>
main(docopt(help, version=0.1))
File "./dump-gutenberg.py", line 137, in main
only_books=BOOKS)
File "/media/data/gutenberg/gutenberg/export.py", line 155, in export_all_books
books=books)
File "/media/data/gutenberg/gutenberg/export.py", line 378, in export_book_to
new_html = update_html_for_static(book=book, html_content=html)
File "/media/data/gutenberg/gutenberg/export.py", line 275, in update_html_for_static
[1 for e in body.children
AttributeError: 'NoneType' object has no attribute 'children'
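The traceback shows that `body` was `None` when `update_html_for_static` iterated over `body.children`, presumably because no `<body>` element was found in that book's HTML (an assumption; the HTML is not shown). A minimal sketch of a guard follows; `Node` and `count_children` are hypothetical stand-ins, not code from export.py.

```python
class Node:
    """Hypothetical stand-in for a parsed HTML element exposing .children."""

    def __init__(self, children=()):
        self.children = list(children)


def count_children(body):
    """Count the children of body, treating a missing body as empty."""
    if body is None:
        # No <body> element was found: return 0 instead of raising
        # AttributeError on body.children.
        return 0
    return sum(1 for _ in body.children)


print(count_children(None))
print(count_children(Node(["p", "div"])))
```

Whether such a book should be exported with an empty body or skipped with a warning is a separate decision for the exporter.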
-rw-rw-r-- 1 kelson kelson 428404 Sep 29 12:20 static/authors.js
-rw-rw-r-- 1 kelson kelson 428404 Sep 29 12:15 static/authors_lang_en.js