dteviot / WebToEpub

A simple Chrome (and Firefox) Extension that converts Web Novels (and other web pages) into an EPUB.

Site Changed Layout [https://lazygirltranslations.com/] #1440

Open plsbenice-immadeofjelly opened 2 weeks ago

plsbenice-immadeofjelly commented 2 weeks ago

Hello! Firstly, thank you for all your efforts to maintain this extension; as a daily user I very much appreciate it <3 I'd like to request an update to the parser for [https://lazygirltranslations.com/], because the site has recently been completely revamped and now uses the same theme as KnoxT.

When updating this parser, if it's not too much trouble, could this site use the exact same function as KnoxT, so that any sub-headings inserted directly under the main heading are still pulled into the output EPUB? The parser for KnoxT seems to have addressed that issue.

Also, some Table of Contents pages on this site still use the previous format. It's unclear whether this is an error or intentional, but for reference I've included two different TOC pages: the first link has the new layout and the second link still has the old format, so that the updated parser can maybe handle both, in case WebToEpub recognises one but not the other.

New Format: https://lazygirltranslations.com/series/rebirth-of-a-star-general/

Previous Format (this one still has the grabbable TOC tho webtoepub doesn't register it anymore): https://lazygirltranslations.com/the-whole-cultivation-world-wants-to-revive-the-venerable/

Thank you again for all your help :)

dteviot commented 2 weeks ago

@plsbenice-immadeofjelly

I don't think WebToEpub had a parser for https://lazygirltranslations.com. Were you using the default parser? That said, I just tried WebToEpub against the second link, and it was able to recognize the layout and parse it without me needing to do anything.

Trying the other link, WebToEpub fell back to the default parser. Of course, if you set the default parser, it won't do the auto search for matching layout.
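As a rough illustration of the behaviour described above, here is a minimal Python sketch of one possible parser-selection order. It is not WebToEpub's actual code: `choose_parser`, `user_default_hosts`, `host_parsers`, and `layout_detectors` are hypothetical names, and the host-specific lookup in step 2 is an assumption. The only points taken from this thread are that a user-set default parser skips the automatic layout search, and that the default parser is the fallback when no layout matches.

```python
from typing import Callable

# Hypothetical name; stands in for whatever the generic fallback parser is called.
DEFAULT_PARSER = "DefaultParser"


def choose_parser(
    hostname: str,
    user_default_hosts: set[str],
    host_parsers: dict[str, str],
    layout_detectors: list[tuple[str, Callable[[str], bool]]],
    html: str,
) -> str:
    """Return the name of the parser to use for a page (illustrative only)."""
    # 1. If the user has set the default parser for this site,
    #    skip the automatic layout search entirely.
    if hostname in user_default_hosts:
        return DEFAULT_PARSER
    # 2. Assumption: a parser registered for this exact hostname wins next.
    if hostname in host_parsers:
        return host_parsers[hostname]
    # 3. Automatic search: use the first detector that recognises the layout.
    for parser_name, matches_layout in layout_detectors:
        if matches_layout(html):
            return parser_name
    # 4. Nothing matched, so fall back to the default parser.
    return DEFAULT_PARSER
```

Under this sketch, a site whose layout changes stops matching in step 3 and lands on the default parser, while a site the user has pinned to the default parser never reaches step 3 at all.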

Tested with:

Time taken: 32 minutes

plsbenice-immadeofjelly commented 2 weeks ago

OH! Yes, sorry! I think WebToEpub automatically recognised the site's WordPress layout, so I don't think it had an individual parser then. If it's not too much trouble, would it be possible to add this site to the same NoblemtlParser as KnoxT, with the same output regarding sub-headings? The site now goes straight to the Default Parser, given the change in layout :'(

The site still seems to use two different formats at present, though. I'm not sure if it's feasible to account for both, so that WebToEpub can fall back to the other layout if parsing the first one fails?
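For illustration only, the "try one layout, then the other" idea could be sketched as below, in Python with the standard-library `html.parser`. This is not how WebToEpub implements it. The container class names used in the usage note (`eplister`, `entry-content`) are hypothetical placeholders, not values confirmed for this site, and `LinkCollector` and `extract_chapter_links` are made-up helper names.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values of <a> tags nested inside an element with a given class."""

    def __init__(self, container_class: str):
        super().__init__()
        self.container_class = container_class
        self.container_tag = None  # tag name of the matched container element
        self.depth = 0             # nesting depth of same-named tags inside it
        self.links = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if self.depth == 0:
            # Outside any container: check whether this element opens one.
            classes = (attr_map.get("class") or "").split()
            if self.container_class in classes:
                self.container_tag = tag
                self.depth = 1
            return
        if tag == self.container_tag:
            self.depth += 1
        elif tag == "a" and attr_map.get("href"):
            self.links.append(attr_map["href"])

    def handle_endtag(self, tag):
        # Only tags with the container's name change depth, so void
        # elements such as <br> cannot unbalance the count.
        if self.depth > 0 and tag == self.container_tag:
            self.depth -= 1


def extract_chapter_links(html: str, container_classes: list[str]) -> list[str]:
    """Try each candidate container class in order; return the first non-empty result."""
    for container_class in container_classes:
        collector = LinkCollector(container_class)
        collector.feed(html)
        if collector.links:
            return collector.links
    return []
```

A call such as `extract_chapter_links(page_html, ["eplister", "entry-content"])` would try the first (hypothetical) layout and only fall back to the second if the first yields no links.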

dteviot commented 2 weeks ago

@plsbenice-immadeofjelly

I'll ask you to do me a favor. Please

  1. Go to the Advanced Options in WebToEpub,
  2. Click on "Write Options to file"
  3. Email me the resulting file, to dteviot@gmail.com. Do NOT attach it to this issue. (The file will record the sites you've set the default parser for, which might be a bit personal to you. Note: if you're not comfortable with me seeing it, that's OK. I'll figure out another way.)

That will help me see what's going on.

plsbenice-immadeofjelly commented 2 weeks ago

@dteviot

Hello! Not a problem at all if it helps you out :) I've sent you the outputted file to your email~ Thank you xx

dteviot commented 2 weeks ago

@plsbenice-immadeofjelly

Test versions for Firefox and Chrome have been uploaded to https://github.com/dteviot/WebToEpub/releases/tag/developer-build. Pick the one suitable for you, follow the "How to install from Source (for people who are not developers)" instructions at https://github.com/dteviot/WebToEpub/tree/ExperimentalTabMode#user-content-how-to-install-from-source-for-people-who-are-not-developers and let me know how it goes. Tested with:

Time taken: 80 minutes.

plsbenice-immadeofjelly commented 2 weeks ago

Perfect fix that addressed all the issues listed! Even going beyond it to remove the excess elements I used the Default Parser for haha, thank you very much~ :)

plsbenice-immadeofjelly commented 2 weeks ago

OH, I'm sorry! There seems to be one more small issue, with generating the "Information" page. It only worked for the first TOC link (with the new format); for the TOC with the old format, it only generated an empty "Information" page, possibly because that table of contents page uses different content selectors. Is it possible to factor in both formats for the information page too, please? Apologies for all the inconvenience I've caused, and thank you for your assistance!

dteviot commented 2 weeks ago

@plsbenice-immadeofjelly

> Is it possible to factor in both formats for the information page

Yes, https://lazygirltranslations.com/the-whole-cultivation-world-wants-to-revive-the-venerable/ doesn't provide an Information page. However, I spent over an hour on this, just to get fetching the ToC links and the chapter content working for both formats. I'm not willing to spend more time on a single story. You can add the information yourself. That said, if you can find additional stories that are also broken, I'm willing to reconsider.

plsbenice-immadeofjelly commented 2 weeks ago

That's completely understandable. Apologies for not considering the extra inconvenience this second request would cause you. Regardless, I really do appreciate all the effort you've put in so far, thank you greatly~ :)