Closed typhoon71 closed 2 years ago
I found a page where the parser... doesn't grab anything, producing an empty epub. Link: "https://tachibanachinatsu.wixsite.com/tenshitranslations/copy-of-obc-toc"
When the empty epub is packed, webtoepub gives this error: 'Warning, content element for web page 'https://tachibanachinatsu.wixsite.com/tenshitranslations' has no visible content.'
I don't understand why this is happening, maybe a new parser is needed for wixsite.com?
@typhoon71
I found a page where the parser... doesn't grab anything, producing an empty epub. Link: "https://tachibanachinatsu.wixsite.com/tenshitranslations/copy-of-obc-toc"
It happens because the HTML pages listed in the table of contents don't have any content. Instead, the JavaScript in each HTML page makes REST calls to get the content text and adds it to the page body. A new parser will be needed to examine the HTML, make the required REST calls and compose the content text. The LNMTL and GravityTales sites do something similar for the chapters making up a novel.
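The three steps described above (examine the HTML, make the REST calls, compose the text) could be sketched roughly as follows. This is only an illustration, not the WebToEpub parser: the `.json` URL pattern, the `html` field name, and the function names `findContentUrls`, `composeContent` and `buildChapter` are all assumptions made for the example, and the real Wixsite responses may look quite different.

```javascript
// Hypothetical sketch: the URL pattern and JSON shape are assumptions,
// not Wixsite's actual format.

// Step 1: collect the distinct JSON resource URLs referenced in the page source.
function findContentUrls(html) {
  const pattern = /https?:\/\/[^"'\s]+\.json/g;
  return [...new Set(html.match(pattern) || [])];
}

// Step 3: join the non-empty HTML fragments from the fetched documents.
function composeContent(documents) {
  return documents
    .map(doc => doc.html) // assumed field holding an HTML fragment
    .filter(fragment => typeof fragment === "string" && fragment.trim() !== "")
    .join("\n");
}

// Step 2 ties the others together. fetchJson is injected (url -> Promise of
// parsed JSON) so the sketch can be exercised without network access.
async function buildChapter(html, fetchJson) {
  const urls = findContentUrls(html);
  const documents = await Promise.all(urls.map(fetchJson));
  return composeContent(documents);
}
```

In a browser extension, `fetchJson` could be something like `url => fetch(url).then(r => r.json())`; injecting it keeps the parsing logic testable on its own.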
Ok, I'll add it to the list above and wait. I see there's another unchecked entry on it, did I put it there? I don't even remember adding all those parsers to the list! XD Anyway, why do they make a page so complicated? Isn't it bad for the server?
I see there's another unchecked entry on it, did I put it there?
I assume you mean unlimitednovelfailures.mangamatters.com. I think it's one you requested. It's also unlikely to ever be done, due to the fact that:
I don't even remember adding all those parsers to the list!
You didn't. I've added many of them as people have requested additional parsers.
why do they make a page so complicated? Isn't it bad for the server?
Mmmm, I may have accidentally deleted an entry, "q...-something". I'll see if I can fix that, sorry.
About unlimitednovelfailures.mangamatters.com: yes, I was the one asking for it. Right now the default parser can grab it well enough. If you edit the chapters list you can have a perfect epub with a toc at the start; I'd say it's OK to not work on it, so you can remove the entry if you wish so (or add it to your won't do list).
Mmmm, I may have accidentally deleted an entry, "q...-something".
That was probably me, https://en.qidian.com/ (aka https://www.webnovel.com/) was done but I hadn't marked it off.
Yes, that's it, "qidian" was the one I accidentally deleted. Good to hear there's no damage done.
@typhoon71 I've put together a "proof of concept" parser for Wixsite and checked into Experimental Tab Mode. Please give it a try and let me know how it goes. Points to note:
Seems fine to me; I use it for https://tachibanachinatsu.wixsite.com/tenshitranslations/copy-of-obc-toc, the whole thing (58 chapters) and it works. As you said, some (most of) titles aren't fetched, but at least the chapter title is present in the TOC of the epub. If fetching the title from the page proves not feasible, you could just grab it from the TOC instead.
@typhoon71 Of course I could do it that way, but that's cheating. On the other hand, it's easy to do and there's only one story you're interested in. So, Done. Enjoy.
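The fallback discussed here (use the title from the chapter page when there is one, otherwise the link text from the TOC) amounts to something like the sketch below. The function name `pickChapterTitle` and its parameters are hypothetical and not taken from the WebToEpub source.

```javascript
// Hypothetical helper: prefer the title found on the chapter page,
// fall back to the link text from the table of contents.
function pickChapterTitle(pageTitle, tocLinkText) {
  const fromPage = (pageTitle || "").trim();
  return fromPage !== "" ? fromPage : (tocLinkText || "").trim();
}
```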
Great, thanks. And again, who said that cheating was bad? XD
who said that cheating was bad?
It doesn't work well for https://tachibanachinatsu.wixsite.com/tenshitranslations/tng-toc
Also, the image at https://tachibanachinatsu.wixsite.com/tenshitranslations/vol-1-ch-1-part-3 isn't being picked up either.
Well, that novel (The New Gate) is something I'll read later on, so not a "problem" right now. XD Btw, I didn't even test it. Also, one can read only a limited amount of novels, based on time mostly...
At least some of "The New Gate" is on Baka Tsuki: https://www.baka-tsuki.org/project/index.php?title=The_New_Gate
Looks like the original source is https://shintranslations.com/the-new-gate-toc/ and the plug-in has no trouble with that site. (Not really surprising, looks like it's Wordpress based.)
Well, in the end I did investigate for sources of "The New Gate" and I'm already reading the second volume from Baka-Tsuki (made with webtoepub). There is Shin TL as a source too. I think the stuff on wixsite may be different (WN/LN) or a duplication (sometime ppl "steal" TLs too).
I noticed that when you grab from "https://shintranslations.com/the-new-gate-toc/" the "Illustrations" page doesn't contain the full sized images, I think it's because there's yet another level/link to them. Can you please fix it? Thanks.
@typhoon71 Can you be more specific?
As far as I can tell, the full size images are obtained for Volume 1.
Oh, you're right, the images that are grabbed are indeed the full sized ones, my bad.
The issue is that webtoepub seems to try to "reconstruct" the page layout (the "grid" of images), scaling the images to a really small size at view time (calibre).
It happens with all of the links for the illustrations galleries from "https://shintranslations.com/the-new-gate-toc/", for example "https://shintranslations.com/vol-3-illustrations/".
Checking the epub I found out that there's code like this:
which I think is the cause.
I think it would be better to have them grabbed and formatted as usual, 1 image per page.
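One way to get that result would be to pull each image out of the gallery grid and give it its own block, dropping the sizing attributes inherited from the grid. The sketch below is only an illustration under assumptions: `flattenGallery` and the `image-page` class are hypothetical names, it expects double-quoted `src` attributes, and a real parser would walk the DOM rather than use a regular expression (which would, for instance, also match a `data-src` attribute).

```javascript
// Hypothetical sketch: extract every <img> from a gallery grid and wrap each
// one in its own block, discarding the inline sizing the grid imposed.
function flattenGallery(html) {
  const imgPattern = /<img\b[^>]*?\bsrc\s*=\s*"([^"]*)"[^>]*>/gi;
  const sources = [];
  for (const match of html.matchAll(imgPattern)) {
    sources.push(match[1]); // capture group 1 is the src value
  }
  return sources
    .map(src => `<div class="image-page"><img src="${src}" /></div>`)
    .join("\n");
}
```

Whether each block actually lands on its own page depends on the styling applied and on the e-book reader.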
Hi, can I ask for a parser to be created for this new site, http://novelfull.com/ I like it cause it seems to include a lot of chapter titles that other sites don't bother with. Element seems to be div id="chapter-content". Please remember to include all the chapters (over multiple pages).
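Collecting the chapters from a table of contents that is spread over multiple pages usually means following the "next page" link until it runs out. A minimal sketch, assuming a hypothetical injected `getPage(url)` function that returns `{ chapterUrls, nextPageUrl }` for one TOC page (how those values are extracted from novelfull.com is not shown, and in the extension the fetch would be asynchronous):

```javascript
// Hypothetical sketch: walk a paginated table of contents.
// getPage(url) must return { chapterUrls: string[], nextPageUrl: string | null }.
function collectChapterUrls(firstPageUrl, getPage, maxPages = 200) {
  const chapters = [];
  const visited = new Set(); // guards against "next" links that loop back
  let url = firstPageUrl;
  while (url && !visited.has(url) && visited.size < maxPages) {
    visited.add(url);
    const page = getPage(url);
    chapters.push(...page.chapterUrls);
    url = page.nextPageUrl;
  }
  return chapters;
}
```

The `visited` set and the `maxPages` cap are there so a malformed "next" link cannot keep the loop running forever.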
Also, for those interested, how about one for Google Translate? https://translate.google.com/translate?hl=en&sl=auto&tl=en&u=http%3A%2F%2Fwayback.archive.org%2Fweb%2F20130318180132%2Fhttp%3A%2F%2Fncode.syosetu.com%2Fn4353bc%2F
Thanks, all, especially dteviot!
@Maradar Have added to branch https://github.com/dteviot/WebToEpub/tree/ExperimentalTabMode, if you'd like to load from source and try it.
@dteviot Thanks it works :-D
Hello,
Can I ask for parser for this page: http://www.wuxiaworld.co/
The site looks similar to http://novelfull.com/
Thanks
@KiraYamatoSD I have added to branch https://github.com/dteviot/WebToEpub/tree/ExperimentalTabMode, if you'd like to load from source and try it. (See https://github.com/dteviot/WebToEpub/tree/ExperimentalTabMode for instructions for loading from source.) If you need additional parsers, please read this first: https://dteviot.github.io/Projects/webToEpub_CustomizingParserTemplate.html
Hi, if it's not too much of a bother can I ask for a parser for this site? https://xuxunette.wixsite.com/danmeitranslations/faraway-wanderers
Thanks in advance and sorry for bothering you.
@ccrowles, Sorry, I missed this request. It's been years since this issue has been updated. It's better to create a new Issue for each site. Makes traceability easier.
@ccrowles,
I had a quick look (30 minutes) at faraway-wanderers.
But I'm sorry to say, it's not an easy one to do.
As you might guess from the request for https://tachibanachinatsu.wixsite.com/tenshitranslations/copy-of-obc-toc at the top of this issue, which is from 6 years ago and which I still have not done.
I see, thanks anyways!
Since I can't do them myself, I'll shamelessly ask for some new parsers to be added.
moonbunnycafe.com -> This is a site where novel translations are hosted with the TLs' authorization, and there are a bunch of nice ones; it would make a good addition.
nanodesutranslations.wordpress.com, or better www.thetranslation.wordpress.com (note the "") -> TL, I think this was already talked about, but I'll put it here since there's a lot of content there.
krytykal.org -> TL, good stuff
unlimitednovelfailures.mangamatters.com -> TL, good stuff (but slow updates)
There are a lot of other sites I'd like to ask for, but I'll refrain since they are mostly one TL project per site and it would be a lot of work for little gain.
/me reloads shame concept /me blushes