lightnovel-center / linovelib2epub

Crawl light novel from some websites and convert it to epub.
GNU Affero General Public License v3.0

[BUG] The content of [url] is Empty and content_id =acontentz. #45

Closed Kuan-Lun closed 4 months ago

Kuan-Lun commented 7 months ago

Describe the bug

The browser window that pops up displays the page normally, but the error message says the content is empty.

To Reproduce

Code and steps to reproduce (e.g. branch selection, volume selection):

Python file

from linovelib2epub import Linovelib2Epub

if __name__ == "__main__":
    linovelib_epub = Linovelib2Epub(
        book_id=2264,
        select_volume_mode=True,
        log_level="DEBUG",
    )
    linovelib_epub.run()

Run

(.venv) PS D:\tmp\linovelib2epub> python .\main.py
2024-04-24,01:52:06 INFO     LinovelibMobileSpider _html_content_id=acontentz              linovelib_mobile_spider.py:33
                    INFO     LinovelibMobileSpider len(_mapping_dict)=104                  linovelib_mobile_spider.py:35
2024-04-24,01:52:07 INFO     LinovelibMobileSpider Succeed to get the novel of book_id:    linovelib_mobile_spider.py:88
                             2264
                    INFO     LinovelibMobileSpider book name:《月光下的异世界之旅》        linovelib_mobile_spider.py:98
                    INFO     LinovelibMobileSpider Succeed to get the catalog of book_id: linovelib_mobile_spider.py:148
                             2264
[?] Which volumes you want to download?(use SPACE to select one or multiple volumes):
 > [X] 第一卷
   [ ] 第二卷
   [ ] 第三卷
   [ ] 第四卷
   [ ] 第五卷
   [ ] 第六卷
   [ ] 第七卷
   [ ] 第八卷
   [ ] 第8.5卷
   [ ] 第九卷
   [ ] 第十卷
   [ ] 第十一卷
   [ ] 第十二卷

2024-04-24,01:52:10 INFO     LinovelibMobileSpider volume: 第一卷                         linovelib_mobile_spider.py:164
                    INFO     LinovelibMobileSpider chapter : 插图                         linovelib_mobile_spider.py:178

DevTools listening on ws://127.0.0.1:64783/devtools/browser/87ee5b70-09c3-4015-a5bf-bc617950e392
2024-04-24,01:52:12 INFO     LinovelibMobileSpider navigator.language.toLowerCase()=zh-tw linovelib_mobile_spider.py:368
2024-04-24,01:52:14 INFO     LinovelibMobileSpider  初始化 Driver 完毕...                 linovelib_mobile_spider.py:379
2024-04-24,01:52:25 WARNING  LinovelibMobileSpider                                        linovelib_mobile_spider.py:306
                             https://www.bilinovel.com/novel/2264/122121.html encountered
                             TimeoutException.
                    WARNING  LinovelibMobileSpider Retrying                               linovelib_mobile_spider.py:327
                             https://www.bilinovel.com/novel/2264/122121.html(1/10)...;
                             retry_interval: 1.76(s)
2024-04-24,01:52:28 DEBUG    LinovelibMobileSpider                                        linovelib_mobile_spider.py:202
                             page(https://www.bilinovel.com/novel/2264/122121.html)
                             size=17875
                    INFO     LinovelibMobileSpider chapter : [插图] New Title= [插圖]     linovelib_mobile_spider.py:210
                    CRITICAL LinovelibMobileSpider The content of                         linovelib_mobile_spider.py:224
                             https://www.bilinovel.com/novel/2264/122121.html is Empty
                             and content_id =acontentz.Please report this bug to [github
                             issue](https://github.com/lightnovel-center/linovelib2epub/i
                             ssues).

Log file

2024-04-24,01:57:13 INFO     LinovelibMobileSpider  linovelib_mobile_spider.py:33    _html_content_id=acontentz
2024-04-24,01:57:13 INFO     LinovelibMobileSpider  linovelib_mobile_spider.py:35    len(_mapping_dict)=104
2024-04-24,01:57:14 INFO     LinovelibMobileSpider  linovelib_mobile_spider.py:88    Succeed to get the novel of book_id: 2264
2024-04-24,01:57:14 INFO     LinovelibMobileSpider  linovelib_mobile_spider.py:98    book name:《月光下的异世界之旅》
2024-04-24,01:57:14 INFO     LinovelibMobileSpider  linovelib_mobile_spider.py:148   Succeed to get the catalog of book_id: 2264
2024-04-24,01:57:17 INFO     LinovelibMobileSpider  linovelib_mobile_spider.py:164   volume: 第一卷
2024-04-24,01:57:17 INFO     LinovelibMobileSpider  linovelib_mobile_spider.py:178   chapter : 插图
2024-04-24,01:57:19 INFO     LinovelibMobileSpider  linovelib_mobile_spider.py:368   navigator.language.toLowerCase()=zh-tw
2024-04-24,01:57:23 INFO     LinovelibMobileSpider  linovelib_mobile_spider.py:379    初始化 Driver 完毕...
2024-04-24,01:57:25 DEBUG    LinovelibMobileSpider  linovelib_mobile_spider.py:202   page(https://www.bilinovel.com/novel/2264/122121.html) size=16861
2024-04-24,01:57:25 INFO     LinovelibMobileSpider  linovelib_mobile_spider.py:210   chapter : [插图] New Title= [插圖]
2024-04-24,01:57:25 CRITICAL LinovelibMobileSpider  linovelib_mobile_spider.py:224   The content of https://www.bilinovel.com/novel/2264/122121.html is Empty and content_id =acontentz.Please report this bug to [github issue](https://github.com/lightnovel-center/linovelib2epub/issues).

Expected behavior

The content should be fetched normally.

Screenshots or Video

The web page was displayed normally before the error occurred.

https://github.com/lightnovel-center/linovelib2epub/assets/33048725/5930743f-1d6e-4a20-8c59-5fc127a9c11b

Environment

wdpm commented 7 months ago

Are you still online on Discord?

wdpm commented 7 months ago

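Please check the updated documentation: https://github.com/lightnovel-center/linovelib2epub?tab=readme-ov-file#linovelibmobile

The Traditional Chinese version of the site now requires you to specify the target explicitly when making requests.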

from linovelib2epub import Linovelib2Epub, TargetSite

if __name__ == "__main__":
    linovelib_epub = Linovelib2Epub(
        book_id=2264,
        select_volume_mode=True,
        log_level="DEBUG",
        target_site=TargetSite.LINOVELIB_MOBILE_TRADITIONAL,
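        # Whether you need a proxy depends on your network environment. I need one
        # to reach the Traditional Chinese version of the bilinovel site.
        # You probably do not, so the line below can stay commented out.
        # disable_proxy=False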
    )
    linovelib_epub.run()

Screenshot: image

Kuan-Lun commented 6 months ago

My mistake. Adding target_site makes it work.

Would you consider making target_site a required input? That might help make the tool foolproof.

wdpm commented 6 months ago

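You cannot have both foolproofing and convenience.

The default request target is the Simplified Chinese site, so it does not need to be specified, while users who need the Traditional Chinese version must specify it explicitly. If this parameter became mandatory, Simplified Chinese users (me, for example) would have to type one more parameter every time.

There is only one default target but two versions of the site, so the default cannot suit users of both versions at once.

I do not agree with the foolproof design. Take the options table in the usage documentation: if every option there were a mandatory, foolproofed parameter, I would not use this software myself, because it would be too much hassle. I kept the API very simple precisely because I want to just run it and adjust settings only when needed.

This also follows a general software design principle: when the defaults do not meet your needs, it is up to you to change them.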
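The trade-off described above (one default target, with an explicit opt-in for the other) can be illustrated with a minimal sketch. `Site` and `DownloaderOptions` are hypothetical names made up for this example; they are not the library's real `TargetSite` enum or `Linovelib2Epub` signature.

```python
from dataclasses import dataclass
from enum import Enum


class Site(Enum):
    """Hypothetical stand-in for the library's TargetSite enum."""

    SIMPLIFIED = "simplified"
    TRADITIONAL = "traditional"


@dataclass
class DownloaderOptions:
    """Hypothetical options object: only book_id is mandatory."""

    book_id: int
    # The default serves Simplified Chinese users without extra typing.
    target_site: Site = Site.SIMPLIFIED
    select_volume_mode: bool = False


# Simplified Chinese users pass nothing extra.
default_opts = DownloaderOptions(book_id=2264)

# Traditional Chinese users opt in explicitly.
traditional_opts = DownloaderOptions(book_id=2264, target_site=Site.TRADITIONAL)
```

Making `target_site` mandatory would amount to removing the default value, which forces every caller, including those served by the default, to spell it out.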

Kuan-Lun commented 6 months ago

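Perhaps the error message could mention the target_site setting? At the moment it asks the user to report an issue rather than to change target_site. That would also be a form of foolproofing, just applied at run time.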

 CRITICAL LinovelibMobileSpider The content of                         linovelib_mobile_spider.py:224
                             https://www.bilinovel.com/novel/2264/122121.html is Empty
                             and content_id =acontentz.Please report this bug to [github
                             issue](https://github.com/lightnovel-center/linovelib2epub/i
                             ssues).

wdpm commented 6 months ago

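Sure, I will revise the message text shown in this situation later. There are several possible causes for this issue:

  1. The linovelib site updated its source and the page structure changed. The message exists so that such changes get reported, after which I need to update the code to adapt.
  2. The user may be on a network IP outside mainland China and want to request the Traditional Chinese site, but did not specify the target site.

Both cases prevent the body content from being fetched. I will try to give a hint as close to the failure as possible in the log, because the best documentation is not the README but a log message right at hand.

Thank you for your feedback and suggestions.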
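As a rough illustration of putting the hint next to the failure, the sketch below builds an error message that lists both possible causes. The function name `build_empty_content_message` and the wording are assumptions made for this example. The maintainer's actual updated text appears only in a screenshot and is not reproduced here.

```python
def build_empty_content_message(url: str, content_id: str) -> str:
    """Return an error message that lists likely causes next to the failure.

    Hypothetical helper, not part of linovelib2epub.
    """
    hints = [
        "The site's page structure may have changed. If so, please report it at "
        "https://github.com/lightnovel-center/linovelib2epub/issues.",
        "If you want the Traditional Chinese site, set target_site explicitly, "
        "e.g. target_site=TargetSite.LINOVELIB_MOBILE_TRADITIONAL.",
    ]
    lines = [f"The content of {url} is empty and content_id={content_id}."]
    lines += [f"  {i}. {hint}" for i, hint in enumerate(hints, start=1)]
    return "\n".join(lines)


message = build_empty_content_message(
    "https://www.bilinovel.com/novel/2264/122121.html", "acontentz"
)
print(message)
```

Keeping the causes in one list means a new cause can be added in a single place without touching the code that logs the message.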

wdpm commented 6 months ago

image: The hint text has been updated.

github-actions[bot] commented 4 months ago

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 0 days.

github-actions[bot] commented 4 months ago

This issue was closed because it has been stalled for 0 days with no activity.