Open SinD3825 opened 4 months ago
@SinD3825
Test versions for Firefox and Chrome have been uploaded to https://drive.google.com/drive/folders/1B_X2WcsaI_eg9yA-5bHJb8VeTZGKExl8?usp=sharing. Pick the one suitable for you, follow the "How to install from Source (for people who are not developers)" instructions at https://github.com/dteviot/WebToEpub/tree/ExperimentalTabMode#user-content-how-to-install-from-source-for-people-who-are-not-developers and let me know how it goes. Tested with:
Please note:
For my notes: 59 minutes work (Fixing formatting and removing garbage proved more difficult than usual.)
I tried out the Chrome version and tested it on a novel I bought chapters of. On jjwxc, the first ten or so chapters are free and the rest are "VIP" (there's a red [VIP] next to each chapter on the title page), meaning you have to pay to access them. In this novel, the paid chapters start at 20 and go until the end. It works really well until I hit the paid chapters, where there seems to be a similar issue of everything showing up as characters from the Specials Unicode block, but it also looks like the chapter itself isn't there, just the header and footer. Here's a screenshot of a portion of what it looks like; I've also attached the complete chapter as a zipped XHTML file that can be viewed in a browser. This happens for all paid chapters (chapter 20 to the end); I've only included the result of the first paid chapter.
I'm not sure if it's useful, but I've also attached the errors that popped up after I tried to epub the entire novel: webtoepub_errors.pdf
In terms of stripping the header and footer, the header is perfect and starts where it should, but it looks like the author notes get cut off from the footer. Is there any way that those could be included? Just a heads up that they aren't in every chapter, so if that's too much of a hassle I don't mind having components in the footer that aren't supposed to be there. They're the green text after the horizontal rule and before the white channel banner (example of bottom of chapter one):
Everything else looks great! Thank you so much for your hard work! Please let me know if you need anything; I can send my jjwxc login details if you need to access the full VIP chapters. Also, would you like me to start this as a new issue?
Sorry, I think I forgot to attach the file. Here's the chapter 20 screenshot and the actual zipped file.
@SinD3825
The author notes should now be included in the test build. Please try and let me know. Note, I'm not sure I've got the title correct. (If not, please provide correct text.)
The zipped xhtml file is of no use to me. What I need is the raw HTML of the chapter that the site sends. In other words, I need a HAR file. Basic steps:
Warning, it might be a couple of weeks before I have time to properly examine it.
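For anyone curious why a HAR file helps here: it is a JSON log of the browser's network traffic, so the response body the site actually sent can be pulled back out of it. A minimal Python sketch, assuming the standard HAR layout (`log.entries[].response.content`); the function name `response_bodies`, the sample URL, and the sample HAR are made up for illustration and are not part of WebToEpub:

```python
import base64
import json


def response_bodies(har_text, url_fragment):
    """Return (url, body_bytes) pairs for HAR entries whose URL contains url_fragment."""
    har = json.loads(har_text)
    results = []
    for entry in har["log"]["entries"]:
        url = entry["request"]["url"]
        if url_fragment not in url:
            continue
        content = entry["response"]["content"]
        text = content.get("text", "")
        if content.get("encoding") == "base64":
            # Bodies the browser could not store as text are base64-encoded.
            body = base64.b64decode(text)
        else:
            body = text.encode("utf-8")
        results.append((url, body))
    return results


# Hand-made, hypothetical HAR with one response whose body is GB18030 bytes.
sample_body = "第一章".encode("gb18030")
sample_har = json.dumps({
    "log": {
        "entries": [{
            "request": {"url": "https://example.invalid/chapter/20"},
            "response": {"content": {
                "mimeType": "text/html",
                "encoding": "base64",
                "text": base64.b64encode(sample_body).decode("ascii"),
            }},
        }]
    }
})

bodies = response_bodies(sample_har, "chapter/20")
```

Unlike the XHTML that the extension writes out, the bytes recovered this way have not been through any parsing or decoding yet.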
Tested with
For my notes: 55 minutes (extra)
Hi there! This is a continuation of this previous issue.
Provide URL for web page that contains Table of Contents (list of chapters) of a typical story on the site: https://www.jjwxc.net/onebook.php?novelid=5126430
Did you try using the Default Parser for the site? If not, why not?
Instructions for using the default parser can be found at https://dteviot.github.io/Projects/webToEpub_DefaultParser.html
I did try using the default parser, but what showed up in the test window was mostly characters from the Specials Unicode block, mixed with characters from a bunch of different scripts (Armenian, Korean, Arabic, random fractions, etc.).
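The mix of replacement characters and stray letters from unrelated scripts described above is what typically happens when bytes in a legacy Chinese encoding are decoded as UTF-8. A minimal Python sketch of the effect, assuming (the thread does not confirm this) that the site serves its pages in GB18030. The sample text is made up, and WebToEpub itself is JavaScript, so this only illustrates the symptom and is not the extension's code:

```python
# Hypothetical chapter title; any Chinese text shows the same effect.
text = "第一章"

# Bytes as a server using GB18030 would send them (assumed encoding).
raw = text.encode("gb18030")

# Decoding with the wrong charset: invalid byte sequences become U+FFFD,
# the replacement character from the Specials block, and byte pairs that
# happen to form valid UTF-8 turn into letters from unrelated scripts.
wrong = raw.decode("utf-8", errors="replace")

# Decoding with the charset the bytes were actually written in.
right = raw.decode("gb18030")

print(repr(wrong))
print(right)
```

If this is the cause, the fix is for the parser to decode the response with the site's declared charset instead of UTF-8.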
What settings did you use? What didn't work?
I just wanted to confirm that all chapters of the story are on a single table of contents page (ex: https://www.jjwxc.net/onebook.php?novelid=5126430). Regarding whether the site is using JSON for chapter content, I dug around in developer tools and there was a JSON script, but the area that was highlighted looked like it was only for user logins and other site functions, not for holding chapter content.
I also have no preference on extending the default parser to put in the encoding, please do whatever you think is best. Again, really appreciate your hard work and thanks for doing all this!
If the Default Parser did not work, if you have developer skills, did you try writing a new parser?
Instructions: https://dteviot.github.io/Projects/webToEpub_FAQ.html#write-parser
I tried looking through the FAQ, but it didn't mention anything about encoding, so I didn't really know how to cobble anything together. I don't have any coding experience in HTML/JavaScript, I'm sorry :(
If you don't have developer skills, can you ask a friend who does have them if they can do it for you?
N/A
If you tried writing a parser and it doesn't work, attach the parser here.
N/A