DreamCobbler / fiction-dl

A content downloader, capable of retrieving works of (fan)fiction from the web and saving them in a few common file formats.
GNU General Public License v3.0

Nifty problem #16

Closed 3gf9veEjgTe closed 3 years ago

3gf9veEjgTe commented 3 years ago

Most Nifty links do not work for me. It returns:

Scanning the story...
ERROR:root:Failed to read metadata from the first chapter of the story.
ERROR:root:Failed to scan the story.

OR

Scanning the story...
ERROR:root:List of chapters not found.
ERROR:root:Failed to scan the story.

In my experience, under the same circumstances, some links always work, but most never do.

DreamCobbler commented 3 years ago

I've partially fixed the issue.

The problem with Nifty is that it has no standard story format, i.e. a story's metadata has no guaranteed structure. Most stories start with three lines in the following format:

Date: Tue, 12 Jul 2011 08:35:17 +0200

From: Amy Redek adultreading@gmail.com

Subject: Guinea Pig II. Chapter Seven

(Example coming from the Guinea Pig story.)

The date is on the first line, the author's name and e-mail on the second, and the title on the third.

Some other stories use a similar format, but without the author's name, like this:

Date: Thu, 12 Dec 2002 18:48:18 -0500

From: boss1@optonline.net

Subject: Locker Room: Part 5

(Example from Locker Room.)
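To make the two header layouts above concrete, here is a minimal sketch of how such a header block could be read. It is not f-dl's actual code: the function name `read_header_block`, the `max_lines` limit, and the rule that the first non-header line ends the block are all assumptions made for illustration.

```python
import re

# Matches one "Key: value" header line of the kind shown above.
HEADER_RE = re.compile(r"^(Date|From|Subject):\s*(.*)$")


def read_header_block(text, max_lines=10):
    """Collect Date/From/Subject from the leading non-empty lines of a story.

    Hypothetical helper: returns a dict holding whichever of the three
    fields were found; it is empty when the story ignores the format.
    """
    headers = {}
    seen = 0
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines between header lines are tolerated
        seen += 1
        if seen > max_lines:
            break
        match = HEADER_RE.match(line)
        if match is None:
            break  # the first non-header line ends the header block
        # Keep the first occurrence of each field.
        headers.setdefault(match.group(1), match.group(2).strip())
    return headers


sample = (
    "Date: Thu, 12 Dec 2002 18:48:18 -0500\n"
    "From: boss1@optonline.net\n"
    "Subject: Locker Room: Part 5\n"
    "\n"
    "The story text starts here.\n"
)
headers = read_header_block(sample)
```

A caller can then treat a missing `From` or `Subject` key as "metadata not found", which is the situation behind the first error.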

Some other stories seem to ignore this format altogether.

There are also HTML-formatted stories, some of which don't provide any metadata (which f-dl requires), or don't provide it in a consistent way.

So, in short, the support for Nifty is and will be rather shaky. I've implemented support for a wider range of author fields (i.e. just a name or just an e-mail), so some stories that weren't downloadable before should be processed just fine in 1.8.3. This ought to fix some instances of the first error.
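As a rough illustration of what accepting "just a name or just an e-mail" can look like, the sketch below splits an author field into its two parts. The helper `split_author` and the simplified e-mail pattern are assumptions for this example, not the code shipped in 1.8.3.

```python
import re

# Deliberately simple e-mail pattern; good enough for the header examples,
# not a full address validator.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def split_author(field):
    """Return (name, email) from an author field; either part may be None."""
    match = EMAIL_RE.search(field)
    if match is None:
        # No address present: treat the whole field as the name.
        name = field.strip()
        return (name or None, None)
    email = match.group(0)
    # Whatever remains once the address is cut out is taken as the name.
    rest = field[:match.start()] + field[match.end():]
    name = rest.strip(" <>()\"'\t")
    return (name or None, email)


both = split_author("Amy Redek adultreading@gmail.com")
only_email = split_author("boss1@optonline.net")
only_name = split_author("Amy Redek")
```

With this shape, a story is only rejected when both parts come back as `None`.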

The second error should be fixed in some cases as well. The tables of contents of some stories didn't match the expected layout; I've expanded the range of acceptable formats, i.e. f-dl now looks in a few different places in the tag soup for a list of chapters. Still, I can't guarantee there will be no issues like that in the future.
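The "look in a few different places" idea can be sketched as an ordered list of locators, where the first one that yields links wins. Which places f-dl really checks isn't stated here, so the three locators below (first table, first list, whole page) are purely illustrative, and the regex-based matching is a crude stand-in for a proper HTML parser.

```python
import re

# Captures the href of each <a ...> tag in a fragment.
LINK_RE = re.compile(r'<a\s+[^>]*href="([^"]+)"', re.IGNORECASE)


def links_in(fragment):
    """Return every href found in the given HTML fragment."""
    return LINK_RE.findall(fragment)


def links_in_table(html):
    """Hypothetical locator: links inside the first <table>."""
    match = re.search(r"<table\b.*?</table>", html, re.IGNORECASE | re.DOTALL)
    return links_in(match.group(0)) if match else []


def links_in_list(html):
    """Hypothetical locator: links inside the first <ul> or <ol>."""
    match = re.search(r"<(ul|ol)\b.*?</\1>", html, re.IGNORECASE | re.DOTALL)
    return links_in(match.group(0)) if match else []


def links_anywhere(html):
    """Last resort: every link on the page."""
    return links_in(html)


def find_chapter_links(html, locators=(links_in_table, links_in_list, links_anywhere)):
    """Try each locator in order and return the first non-empty result."""
    for locate in locators:
        links = locate(html)
        if links:
            return links
    return []  # corresponds to "List of chapters not found"


page = '<p>Index</p><ul><li><a href="ch1.txt">1</a></li><li><a href="ch2.txt">2</a></li></ul>'
chapters = find_chapter_links(page)
```

Adding support for a new layout then means adding one more locator to the tuple, which is also why a page with an unforeseen layout can still fall through to an empty result.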