dteviot / WebToEpub

A simple Chrome (and Firefox) Extension that converts Web Novels (and other web pages) into an EPUB.

Please add site https://old.ranobelib.me/ #1580

Open DEATHAN-SAA opened 1 day ago

DEATHAN-SAA commented 1 day ago

Hello. Apologies if my English is not perfect. I'm working with this: image

Hostname: https://old.ranobelib.me/
CSS Selector for the content: .reader-container
CSS Selector for Title of Chapter: div.reader-header-actions:nth-child(3) > div:nth-child(2) > div:nth-child(2)

image

Provide URL for web page that contains Table of Contents (list of chapters) of a typical story on the site:
https://old.ranobelib.me/old/manga/80001--this-marriage-is-bound-to-fail-anyway?ui=6032&section=content

1) I cannot understand why it does not find the first chapter. image image

2) Not all chapters are picked up at once. The maximum is 72 chapters, and only if you scroll down the list of chapters beforehand. image

In this case, I scrolled down a little and WebToEpub found chapters only up to chapter 43. When a story has only about 60 chapters this is not a problem, but for longer stories you have to scroll through the table of contents repeatedly. After stopping at chapter 43, you need to scroll down by about 24 more chapters and click WebToEpub again, then copy the new links and paste them into the previous WebToEpub window to get all the chapters of the story. If you scroll too far, for example to chapter 78, WebToEpub will start from chapter 50, so you have to go back to the table of contents and scroll again to pick up chapters 44 to 49. image image image image

If a story has over 300 chapters, this becomes very tiring, and WebToEpub can easily skip a chapter somewhere in the list while scrolling. As a result, you have to check the number of chapters using Excel every time. image image

Unfortunately, I still haven't figured out how to create a new parser. :(

Thank you very much for WebToEpub! I would be grateful for an answer.

dteviot commented 1 day ago

Looking at the HTML, the list of chapters is almost certainly in the <script> element which starts with

image

I think each chapter's URL is built from the "volume" and "number" members of its entry. E.g. for https://old.ranobelib.me/old/manga/80001--this-marriage-is-bound-to-fail-anyway?section=content

A recent chapter is

https://old.ranobelib.me/old/80001--this-marriage-is-bound-to-fail-anyway/read/v3/c478

This has a number of 478, a volume of 3 and a name of "Экстра 6" ("Extra 6"), although I'm not quite sure how to create the title in the list. I think it's something like Volume X, Chapter Y, Title Z, in Russian.
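As a rough illustration of that URL pattern, here is a minimal sketch that derives a chapter URL from the table of contents URL. The function name buildChapterUrl is hypothetical, and the "/old/<slug>/read/v<volume>/c<number>" layout is inferred only from the single example above, so it may not hold for every story.

```javascript
// Hypothetical helper, not part of WebToEpub.
// Assumes the last path segment of the table of contents URL is the story slug
// and that chapters live at /old/<slug>/read/v<volume>/c<number>.
function buildChapterUrl(tocUrl, volume, number) {
    const url = new URL(tocUrl);
    // e.g. "80001--this-marriage-is-bound-to-fail-anyway"
    const slug = url.pathname.split("/").filter(Boolean).pop();
    return `${url.origin}/old/${slug}/read/v${volume}/c${number}`;
}

const exampleUrl = buildChapterUrl(
    "https://old.ranobelib.me/old/manga/80001--this-marriage-is-bound-to-fail-anyway?section=content",
    3,
    478
);
```

The query string (?section=content) is dropped because only the origin and path are used.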

You can probably use something like this to extract the JSON:

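        // Find the <script> element whose text contains the embedded chapter data.
        let startString = "window.__CONTENT__ = ";
        let scriptElement = [...dom.querySelectorAll("script")]
            .find(s => s.textContent.includes(startString));
        // Extract the JSON that follows the marker string.
        return util.locateAndExtractJson(scriptElement.textContent, startString);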

And then construct the table of content entries.
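A minimal sketch of that last step might look like the following. Everything here is an assumption for illustration: the function name toChapterEntries is hypothetical, the chapter list is assumed to be an array of objects with "volume", "number" and "name" members, the { sourceUrl, title } entry shape should be checked against the existing parsers, and the Russian title format ("Том" for volume, "Глава" for chapter) is only a guess at what the site shows.

```javascript
// Hypothetical helper, not part of WebToEpub.
// chapters: assumed array of { volume, number, name } objects from the extracted JSON.
// readBaseUrl: assumed prefix such as "https://old.ranobelib.me/old/<slug>/read".
function toChapterEntries(chapters, readBaseUrl) {
    return chapters.map(c => ({
        sourceUrl: `${readBaseUrl}/v${c.volume}/c${c.number}`,
        // Guessed title format: "Том <volume> Глава <number> <name>".
        title: [`Том ${c.volume}`, `Глава ${c.number}`, c.name]
            .filter(Boolean) // drop the name when a chapter has none
            .join(" "),
    }));
}

const sampleEntries = toChapterEntries(
    [{ volume: 3, number: 478, name: "Экстра 6" }],
    "https://old.ranobelib.me/old/80001--this-marriage-is-bound-to-fail-anyway/read"
);
```

If the JSON lists the newest chapter first, the resulting array would need reversing before use; that ordering has not been verified here.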

Notes Time taken: 43 minutes

gamebeaker commented 1 day ago

@dteviot you could use "[placeholder]"