dipu-bd / lightnovel-crawler

Generate and download e-books from online sources.
https://pypi.org/project/lightnovel-crawler/
GNU General Public License v3.0
1.43k stars, 279 forks

wuxiaworld.com unable to detect chapter for v3.2.7 and using subscription champion account #1995

Closed: freaking1990 closed this issue 6 months ago

freaking1990 commented 1 year ago

Novel URL: https://www.wuxiaworld.com/novel/overgeared
App Version: 3.2.6 - 3.2.7

Describe this issue

Hello, this is my first time opening an issue. I tried to create an EPUB for Overgeared, but the app does not detect any chapters after I log in to my account, with or without the bearer token. I have tried both 3.2.6 and 3.2.7, and neither detects the chapters.

On 3.2.3, chapter detection succeeded after login, but the app did not recognize my "subscription champion" status, which unlocks all chapters except the early-access ones. When it compiled the book, it included only the teaser chapters.

FYI: my account has the "champion subscription", which unlocks all chapters for this specific novel.

Best regards

===============issues=================

C:\Users\fxxxx>lncrawl --login Bearer 3xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx-1 -s https://www.wuxiaworld.com/novel/overgeared

                      [#] Lightnovel Crawler v3.2.7
              https://github.com/dipu-bd/lightnovel-crawler

-> Press Ctrl + C to exit

Retrieving novel info...
Exception in thread Thread-1 (read_novel_info):
cloudscraper.exceptions.CloudflareChallengeError: Detected a Cloudflare version 2 Captcha challenge, This feature is not available in the opensource (free) version.

During handling of the above exception, another exception occurred:

selenium.common.exceptions.InvalidArgumentException: Message: invalid argument: unrecognized capability: quietExceptions
Stacktrace:
Backtrace:
    GetHandleVerifier [0x00F1A813+48355]
    (No symbol) [0x00EAC4B1]
    (No symbol) [0x00DB5358]
    (No symbol) [0x00DCAC31]
    (No symbol) [0x00DFFCEE]
    (No symbol) [0x00DFF9A8]
    (No symbol) [0x00E00AD7]
    (No symbol) [0x00E0093C]
    (No symbol) [0x00DFA536]
    (No symbol) [0x00DD82DC]
    (No symbol) [0x00DD93DD]
    GetHandleVerifier [0x0117AABD+2539405]
    GetHandleVerifier [0x011BA78F+2800735]
    GetHandleVerifier [0x011B456C+2775612]
    GetHandleVerifier [0x00FA51E0+616112]
    (No symbol) [0x00EB5F8C]
    (No symbol) [0x00EB2328]
    (No symbol) [0x00EB240B]
    (No symbol) [0x00EA4FF7]
    BaseThreadInitThunk [0x75CF00C9+25]
    RtlGetAppContainerNamedObjectPath [0x77D77B4E+286]
    RtlGetAppContainerNamedObjectPath [0x77D77B1E+238]

! Error: No chapters found


================================================================================
                      [#] Lightnovel Crawler v3.2.7
              https://github.com/dipu-bd/lightnovel-crawler

-> Press Ctrl + C to exit

? Enter novel page url or query novel: https://www.wuxiaworld.com/novel/overgeared
? Do you want to log in? Yes
? User/Email: fxxxxxxxxx0
? Password: ****
Retrieving novel info...
Exception in thread Thread-1 (read_novel_info):
cloudscraper.exceptions.CloudflareChallengeError: Detected a Cloudflare version 2 Captcha challenge, This feature is not available in the opensource (free) version.

During handling of the above exception, another exception occurred:

selenium.common.exceptions.InvalidArgumentException: Message: invalid argument: unrecognized capability: quietExceptions
Stacktrace:
Backtrace:
    GetHandleVerifier [0x00F1A813+48355]
    (No symbol) [0x00EAC4B1]
    (No symbol) [0x00DB5358]
    (No symbol) [0x00DCAC31]
    (No symbol) [0x00DFFCEE]
    (No symbol) [0x00DFF9A8]
    (No symbol) [0x00E00AD7]
    (No symbol) [0x00E0093C]
    (No symbol) [0x00DFA536]
    (No symbol) [0x00DD82DC]
    (No symbol) [0x00DD93DD]
    GetHandleVerifier [0x0117AABD+2539405]
    GetHandleVerifier [0x011BA78F+2800735]
    GetHandleVerifier [0x011B456C+2775612]
    GetHandleVerifier [0x00FA51E0+616112]
    (No symbol) [0x00EB5F8C]
    (No symbol) [0x00EB2328]
    (No symbol) [0x00EB240B]
    (No symbol) [0x00EA4FF7]
    BaseThreadInitThunk [0x75CF00C9+25]
    RtlGetAppContainerNamedObjectPath [0x77D77B4E+286]
    RtlGetAppContainerNamedObjectPath [0x77D77B1E+238]

! Error: No chapters found
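
A note on reading the tracebacks above: the line "During handling of the above exception, another exception occurred" is Python's implicit exception chaining. It suggests the Selenium error was raised while the cloudscraper failure was still being handled, so there are two separate problems and not one. The sketch below reproduces that pattern with made-up names (`ApiBlockedError`, `BrowserStartError`, `fetch_via_api`, `fetch_via_browser`, and `describe_chain` are all hypothetical); it is not the crawler's actual code.

```python
class ApiBlockedError(Exception):
    """Hypothetical stand-in for the HTTP attempt being blocked."""


class BrowserStartError(Exception):
    """Hypothetical stand-in for the browser failing to start."""


def fetch_via_api():
    # Assumption: the first attempt fails, like the Cloudflare challenge in the log.
    raise ApiBlockedError("blocked by a Cloudflare challenge")


def fetch_via_browser():
    # Assumption: the second attempt fails too, like the rejected capability in the log.
    raise BrowserStartError("unrecognized capability")


def read_novel_info():
    try:
        return fetch_via_api()
    except ApiBlockedError:
        # An exception raised inside this except block gets the first
        # error attached as its __context__ (implicit chaining).
        return fetch_via_browser()


def describe_chain(exc):
    """List exception type names from the newest error back to the oldest."""
    names = []
    while exc is not None:
        names.append(type(exc).__name__)
        exc = exc.__context__
    return names


if __name__ == "__main__":
    try:
        read_novel_info()
    except BrowserStartError as err:
        print(describe_chain(err))
```

Read this way, the last error in the log (the Selenium one) is the one that actually stopped the run, while the first (the Cloudflare challenge) explains why the second attempt happened at all.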


alzamer2 commented 1 year ago

Hello. After reading #1991, I uninstalled the latest Selenium and reinstalled Selenium 4.9.0. The crawler then worked for free chapters, but it was unable to log in for paid chapters, so only the teasers were scraped. I don't know whether that is because something changed on the site or in Selenium.

@freaking1990, try running these commands in the command line:

pip uninstall selenium
pip install selenium==4.9.0

Then run the script and tell me what happened.
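
To confirm that the pin took effect before re-running the crawler, a quick check along these lines may help. It is a minimal sketch that uses only the standard library; `installed_version` and `is_pinned` are hypothetical helper names, not part of lncrawl.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def is_pinned(package: str, expected: str) -> bool:
    """True only when `package` is installed at exactly the `expected` version."""
    return installed_version(package) == expected


if __name__ == "__main__":
    print("selenium version:", installed_version("selenium"))
    print("pinned to 4.9.0:", is_pinned("selenium", "4.9.0"))
```

Run it with the same Python interpreter that runs lncrawl; otherwise it may report a different environment's packages.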

freaking1990 commented 1 year ago

@alzamer2 Hello, I uninstalled Selenium and installed selenium==4.9.0 successfully.

This time the chapters were found, but the latest chapters could not be compiled; only the teaser chapters load. Sad. Thank you for helping, though.

freaking1990 commented 1 year ago

Update:

I tried version 3.2.8 and it manages to detect the chapters, but only the free ones. The app does not detect the subscription champion status and downloads only the teaser chapters.

alzamer2 commented 1 year ago

I checked the wuxiaworld code. There are two ways to scrape chapters:

  1. Using the API, but at the moment it is blocked by Cloudflare.
  2. Using Selenium, but it does not support logging in.
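
The two routes above amount to trying one strategy and falling back to the next. Here is a rough sketch of that shape, with stand-in functions in place of real network or browser calls. All names (`ChapterSourceError`, `fetch_with_fallback`, `via_api`, `via_browser`) are hypothetical, and this is not how lncrawl itself is written.

```python
from typing import Callable, List, Sequence


class ChapterSourceError(Exception):
    """Hypothetical error raised when a strategy cannot return chapters."""


def fetch_with_fallback(
    strategies: Sequence[Callable[[str], List[str]]], novel_url: str
) -> List[str]:
    """Try each strategy in order and return the first non-empty chapter list."""
    failures = []
    for strategy in strategies:
        try:
            chapters = strategy(novel_url)
        except ChapterSourceError as exc:
            failures.append(f"{strategy.__name__}: {exc}")
            continue
        if chapters:
            return chapters
        failures.append(f"{strategy.__name__}: returned no chapters")
    raise ChapterSourceError("No chapters found (" + "; ".join(failures) + ")")


def via_api(novel_url: str) -> List[str]:
    # Stand-in for route 1: assumed to be blocked by Cloudflare.
    raise ChapterSourceError("blocked by Cloudflare")


def via_browser(novel_url: str) -> List[str]:
    # Stand-in for route 2: works, but without login only teasers are visible.
    return ["teaser chapter 1", "teaser chapter 2"]


if __name__ == "__main__":
    url = "https://www.wuxiaworld.com/novel/overgeared"
    print(fetch_with_fallback([via_api, via_browser], url))
```

With this shape, a login-capable route would be just another strategy placed ahead of the browser fallback.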

alzamer2 commented 9 months ago

Hello @freaking1990, I submitted pull request #2204, a fix that allows scraping paid chapters (including champion chapters). Try it out and tell me how it works.

Update: a new update has been released for wuxiaworld.com. Update your sources and try scraping again.