drunkdream / weread-exporter

Export books from WeRead (微信读书) to epub, pdf, mobi, and other formats
1.25k stars 189 forks

Crawling a self-uploaded book fails with an error when fetching the title #93

Open wzj042 opened 3 months ago

wzj042 commented 3 months ago

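Could crawling of self-uploaded books be supported? I did not keep proper copies of some content I uploaded, and now I can only read it on the web page.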

Traceback (most recent call last):
  File "C:\Users\XQH\scoop\apps\python38\current\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\XQH\scoop\apps\python38\current\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "E:\WorkPlace\Python\Spider\weread-exporter\weread_exporter\__main__.py", line 158, in <module>
    main()
  File "E:\WorkPlace\Python\Spider\weread-exporter\weread_exporter\__main__.py", line 154, in main
    loop.run_until_complete(async_main())
  File "C:\Users\XQH\scoop\apps\python38\current\lib\asyncio\base_events.py", line 616, in run_until_complete
    return future.result()
  File "E:\WorkPlace\Python\Spider\weread-exporter\weread_exporter\__main__.py", line 92, in async_main
    await exporter.export_markdown(args.load_timeout, args.load_interval)
  File "E:\WorkPlace\Python\Spider\weread-exporter\weread_exporter\export.py", line 306, in export_markdown
    meta_data = await self._load_meta_data()
  File "E:\WorkPlace\Python\Spider\weread-exporter\weread_exporter\export.py", line 46, in _load_meta_data
    self._meta_data = await self._page.get_book_info()
  File "E:\WorkPlace\Python\Spider\weread-exporter\weread_exporter\webpage.py", line 47, in get_book_info
    book_info["title"] = data["reader"]["bookInfo"]["title"]
KeyError: 'title'

wzj042 commented 3 months ago

After substituting some parameters, I got the following output:

E:\WorkPlace\Python\Spider\weread-exporter>python -m weread_exporter -b {$book_id} -o epub -o pdf --force-login
Fontconfig error: Cannot load default config file
C:\Users\XQH\scoop\apps\python38\current\lib\site-packages\weasyprint\fonts.py:215: UserWarning: @font-face not supported: FontConfig cannot load default config file
  warnings.warn(
C:\Users\XQH\scoop\apps\python38\current\lib\site-packages\weasyprint\fonts.py:457: UserWarning: Expect ugly output with font-size: 0
  warnings.warn('Expect ugly output with font-size: 0')
C:\Users\XQH\scoop\apps\python38\current\lib\site-packages\weasyprint\document.py:35: UserWarning: There are known rendering problems and missing features with cairo < 1.15.4. WeasyPrint may work with older versions, but please read the note about the needed cairo version on the "Install" page of the documentation before reporting bugs. http://weasyprint.readthedocs.io/en/latest/install.html
  warnings.warn(
[2024-07-26 16:31:55,805][INFO]Exporting book {book_id}
[2024-07-26 16:31:56,048][INFO][WeReadWebPage] Launch url https://weread.qq.com/web/reader/{book_id}
[2024-07-26 16:31:56,594][INFO]Browser listening on: ws://127.0.0.1:3918/devtools/browser/f5f5bf2c-99d9-41f1-915f-d6b25ba058a5
{inject cookies}
[2024-07-26 16:31:58,860][INFO]terminate chrome process...
[2024-07-26 16:31:59,169][INFO]Save file output\测试本地书籍.epub complete
[2024-07-26 16:31:59,216][WARNING]Ignored `background: var(--alice-bg-color)` at 6:5, invalid value.
[2024-07-26 16:31:59,217][WARNING]Ignored `overflow-y: hidden` at 122:5, unknown property.
[2024-07-26 16:31:59,218][WARNING]Ignored `border-top: 1px solid var(--alice-font-color)` at 166:5, invalid value.
[2024-07-26 16:31:59,219][WARNING]Ignored `border-bottom: 2px solid var(--alice-font-color)` at 170:5, invalid value.
[2024-07-26 16:31:59,219][WARNING]Ignored `border-bottom: 2px solid var(--alice-font-color)` at 175:5, invalid value.
[2024-07-26 16:31:59,220][WARNING]Ignored `border-top: 2px solid var(--alice-font-color)` at 188:5, invalid value.
Fontconfig error: Cannot load default config file

(python.exe:6616): Pango-WARNING **: couldn't load font "Helvetica Not-Rotated 14px", falling back to "Sans Not-Rotated 14px", expect ugly output.
[2024-07-26 16:31:59,320][INFO]Save file output\测试本地书籍.pdf complete

The pdf contains only the test cover image, and the epub contains only the substituted test information.

hikari-2024 commented 2 months ago

I have this problem too.

drunkdream commented 2 months ago

This tool does not support self-uploaded books.
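
For readers who hit the same traceback: the `KeyError: 'title'` comes from the lookup `data["reader"]["bookInfo"]["title"]`, which suggests that the page data for a self-uploaded book has no `title` field under `bookInfo`. Below is a minimal sketch of how such a lookup could fail with a clearer message instead. It is not the project's code: `extract_title`, `UnsupportedBookError`, and the sample dictionaries are hypothetical names made up for illustration, and the assumed shape of `data` is inferred only from the traceback.

```python
class UnsupportedBookError(Exception):
    """Raised when the page data lacks the fields needed for export."""


def extract_title(data):
    """Return the book title from page data, or raise a descriptive error.

    Assumption: `data` is shaped like the dict indexed in the traceback,
    i.e. data["reader"]["bookInfo"]["title"].
    """
    # Use .get() at each level so a missing key does not raise KeyError.
    book_info = data.get("reader", {}).get("bookInfo", {})
    title = book_info.get("title")
    if not title:
        raise UnsupportedBookError(
            "bookInfo has no 'title'; this may be a self-uploaded book, "
            "which the tool does not support"
        )
    return title


# Hypothetical sample data, for illustration only.
store_book = {"reader": {"bookInfo": {"title": "Example"}}}
uploaded_book = {"reader": {"bookInfo": {}}}

print(extract_title(store_book))
try:
    extract_title(uploaded_book)
except UnsupportedBookError as exc:
    print(f"Cannot export: {exc}")
```

This only changes how the failure is reported; it does not make uploaded books exportable.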