aerkalov / ebooklib

Python E-book library for handling books in EPUB2/EPUB3 format -
https://ebooklib.readthedocs.io/
GNU Affero General Public License v3.0

How to add new chapters to existing EPUB files? #217

Open edmund-zhao opened 3 years ago

edmund-zhao commented 3 years ago
addc = epub.EpubHtml(title='第二章', file_name='chapter_add.xhtml', lang='zh-CN')
addc.content = '<h1>第二章</h1><p>这是测试的第二章</p>'
book.add_item(addc)
book.spine.extend((addc,))

Now, how do I add it to book.toc?

aerkalov commented 3 years ago

The TOC is really just a tuple or list of TOC elements. You can either add a chapter object to it (addc in your case) or add a custom link such as epub.Link('intro.xhtml', 'Introduction', 'intro'). In the first case the chapter's title in the TOC is taken from the chapter object; in the second case we define it ourselves.

You can see here how it is constructed - http://docs.sourcefabric.org/projects/ebooklib/en/latest/tutorial.html#creating-epub and you can find that example here - https://github.com/aerkalov/ebooklib/blob/master/samples/03_advanced_create/create.py

edmund-zhao commented 3 years ago

If I add chapters to an existing EPUB file, this overwrites the existing TOC. My code looks like this:

book = epub.read_epub(file_path)
addc = epub.EpubHtml(title='第二章', file_name='chapter_add.xhtml', lang='zh-CN')
addc.content = '<h1>第二章</h1><p>这是测试的第二章</p>'
# Add The Chapter To Object
book.add_item(addc)
book.spine.extend((addc,))
book.toc = (epub.Link('intro.xhtml', '简介', 'intro'),
                    (epub.Section('正文'),
                     (addc,))
                    )

Have a good day!

aerkalov commented 3 years ago

Correct, it will overwrite it. book.toc is really a list of items, so just insert the chapter wherever you want it. Even book.toc.append(addc) would do.

edmund-zhao commented 3 years ago

Yes, that is useful. But there is a bug: if I use book.toc.extend() twice (in two separate runs), the second overwrites the first.

edmund-zhao commented 3 years ago

First Run

from ebooklib import epub
book = epub.read_epub('./test.epub')
addc = epub.EpubHtml(title='The Second Chapter', file_name='chapter_add.xhtml', lang='zh-CN')
addc.content = '<h1>The Second Chapter</h1><p>This is The Second Chapter</p>'
book.add_item(addc)
book.spine.extend((addc,))
book.toc.extend((addc,))
epub.write_epub('./test.epub', book, {})

edmund-zhao commented 3 years ago

Second Run

from ebooklib import epub
book = epub.read_epub('./test.epub')
addc = epub.EpubHtml(title='The Third Chapter', file_name='chapter_add2.xhtml', lang='zh-CN')
addc.content = '<h1>The Third Chapter</h1><p>This is The Third Chapter</p>'
book.add_item(addc)
book.spine.extend((addc,))
book.toc.extend((addc,))
epub.write_epub('./test.epub', book, {})

After the second run, the second chapter (added in the first run) is missing.

edmund-zhao commented 3 years ago

The reason is that the EPUB's content.opf is not refreshed, so in the second run epub.read_epub does not pick up the content.opf from the first run.

edmund-zhao commented 3 years ago

If I re-create epub.EpubNcx() and epub.EpubNav(), it works!

from ebooklib import epub

book = epub.read_epub('./测试.epub')

print(book.get_metadata('DC','date'))
# nav_items = book.get_items_of_type(ebooklib.ITEM_IMAGE)
# # print(nav_items)
# # e = nav_items.get_name()
# # # print(e)
# # t = b'      <navPoint id="chapter_5">\n        <navLabel>\n          <text>2020\xe5\xb9\xb4\xe5\x9c\xa3\xe8\xaf\x9e\xe7\x95\xaa\xe5\xa4\x96</text>\n        </navLabel>\n        <content src="chapter3.xhtml"/>\n      </navPoint>\n'
# # a = e[:-35] + t + e[-35:]
# # nav_items.content = a
# # book.add_item(nav_items)

# all_items = book.get_items()
# u = []
# for item in book.get_items():
#     if item.get_type() == ebooklib.ITEM_NAVIGATION:
#         if 'chapter' in item.get_name():
#             print(item.file_name)
#             u.append(item)
#         e = item.get_content()
#         # print(e)
#         t = b'      <navPoint id="chapter_5">\n        <navLabel>\n          <text>2020\xe5\xb9\xb4\xe5\x9c\xa3\xe8\xaf\x9e\xe7\x95\xaa\xe5\xa4\x96</text>\n        </navLabel>\n        <content src="chapter3.xhtml"/>\n      </navPoint>\n'
#         a = e[:-35] + t + e[-35:]
#         print(a.decode('utf-8'))
#         item.content = a
#         book.add_item(item)
#         break

# index = book.get_item_with_href('chapter0.xhtml')
addc = epub.EpubHtml(title='第三章', file_name='chapter_add3.xhtml', lang='zh-CN',uid="chapter_add3")
addc.content = '<h1>第三章</h1><p>这是测试的第三章</p>'
addcLink = epub.Link('chapter_add3.xhtml','第三章',uid='chapter_add3')
book.add_item(addc)
# u.append(addc)
# book.toc = (epub.Link('intro.xhtml', '简介', 'intro'),
#                     (epub.Section('正文'),
#                      tuple(u))
#                     )
print(book.toc)
print(book.spine)
print("len of toc:", len(book.toc))
print("len of spine:", len(book.spine))
print("**************")
book.toc.append(addc)
book.spine.append(addc)
print(book.toc)
print(book.spine)
print("len of toc:", len(book.toc))
print("len of spine:", len(book.spine))
book.add_item(epub.EpubNcx())
book.add_item(epub.EpubNav())
epub.write_epub('./测试.epub', book, {})

edmund-zhao commented 3 years ago

But it produces duplicate-name warnings like this:

C:\Python\Python37\lib\zipfile.py:1506: UserWarning: Duplicate name: 'EPUB/toc.ncx'
  return self._open_to_write(zinfo, force_zip64=force_zip64)
C:\Python\Python37\lib\zipfile.py:1506: UserWarning: Duplicate name: 'EPUB/nav.xhtml'
  return self._open_to_write(zinfo, force_zip64=force_zip64)
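The warning itself comes from Python's zipfile module and can be reproduced without ebooklib. The sketch below writes the same archive name twice into an in-memory ZIP; the entry contents are placeholders.

```python
import io
import warnings
import zipfile

buffer = io.BytesIO()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    with zipfile.ZipFile(buffer, 'w') as archive:
        archive.writestr('EPUB/toc.ncx', 'old table of contents')
        # Same archive name again: zipfile warns but still stores a second entry.
        archive.writestr('EPUB/toc.ncx', 'new table of contents')

messages = [str(w.message) for w in caught]

# Reading the archive back shows that both entries were kept.
with zipfile.ZipFile(buffer) as archive:
    names = archive.namelist()
```

So the warning indicates that two items with the same file name were written into the EPUB, which matches adding a new EpubNcx and EpubNav while the ones loaded from the file are still in the book.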

aerkalov commented 3 years ago

OK, an off-the-top-of-my-head solution would be to do this before you create the new NCX and nav: remove the old ones from the items (they only hold the old table of contents anyway) and then add them again.

book.items.remove(book.get_item_with_id('ncx'))
book.items.remove(book.get_item_with_id('nav'))

book.add_item(epub.EpubNcx())
book.add_item(epub.EpubNav())

edmund-zhao commented 3 years ago

Thanks for your help. I will write a blog post introducing the EbookLib project.

—— Edmund Zhao

(In reply by email to Aleksandar Erkalović's comment of February 17, 2021, 7:31 AM.)
