JimmXinu / FanFicFare

FanFicFare is a tool for making eBooks from stories on fanfiction and other web sites.

Deviantart Grab from webpage support #895

Closed kolbdog3333 closed 1 year ago

kolbdog3333 commented 1 year ago

@jcotton42 Can you please make it possible to grab all the URLs from a webpage instead of adding each one individually? Doing it one at a time is very annoying. Instead of only being able to make an anthology from individual URLs, could you add a way to make an anthology from the webpage for gallery pages?

jcotton42 commented 1 year ago

Does the existing URLs from webpage function not work? My understanding is that it was generic and should work for any webpage, but I'll confess that I didn't test it.

The reason I didn't test/implement gallery support is because the order is often a mess, and whether or not a gallery is ordered by newest first or oldest first tends to vary a lot.

If it doesn't work at all though then give me a test gallery, and the behavior you'd expect from supplying it to the "URLs from webpage" function, and I can look into it this weekend.

kolbdog3333 commented 1 year ago

@jcotton42 The individual pages work, but I can't grab the pages from the gallery. I can only grab them one at a time. Here is a link to the gallery I am trying to download: https://www.deviantart.com/jtom09/gallery/79840819/reacting-to-the-loud-house Also, I can change the order myself; I just don't like copying and pasting each link and downloading them all individually.

jcotton42 commented 1 year ago

It breaks because we're getting a 403 from CloudFront:

> curl 'https://www.deviantart.com/jtom09/gallery/79840819/reacting-to-the-loud-house'
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML><HEAD><META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<TITLE>ERROR: The request could not be satisfied</TITLE>
</HEAD><BODY>
<H1>403 ERROR</H1>
<H2>The request could not be satisfied.</H2>
<HR noshade size="1px">
Request blocked.
We can't connect to the server for this app or website at this time. There might be too much traffic or a configuration error. Try again later, or contact the app or website owner.
<BR clear="all">
If you provide content to customers through CloudFront, you can find steps to troubleshoot and help prevent this error by reviewing the CloudFront documentation.
<BR clear="all">
<HR noshade size="1px">
<PRE>
Generated by cloudfront (CloudFront)
Request ID: 18vR6DRie_K7Z6J2pbDIgRq_rCwm8ERaunT8luqk0DRsoDrdO01J7A==
</PRE>
<ADDRESS>
</ADDRESS>
</BODY></HTML>
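
For anyone who wants to detect this condition in a script, here is a minimal Python sketch. The function name `looks_like_cloudfront_block` is hypothetical and not part of FanFicFare; the marker strings are simply copied from the error page above, and other CloudFront error pages may be worded differently.

```python
def looks_like_cloudfront_block(status: int, body: str) -> bool:
    """Return True if a response resembles the CloudFront 403 page above.

    Hypothetical helper: the marker strings come from the error page
    quoted in this thread and are an assumption, not a documented format.
    """
    if status != 403:
        return False
    lowered = body.lower()
    return "generated by cloudfront" in lowered and "request blocked" in lowered


# Trimmed copy of the error page from the curl output.
sample = """<H1>403 ERROR</H1>
Request blocked.
Generated by cloudfront (CloudFront)"""

print(looks_like_cloudfront_block(403, sample))
```

The check is deliberately strict (status plus both markers) so that an ordinary 403 from the site itself is not mistaken for a CDN-level block.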

This might be fixable by enabling the browser cache feature for deviantart. However (and @JimmXinu will have to comment on this, as I'm not entirely sure), my understanding is that the "Get Story URLs from Web Page" feature doesn't have any way of "following" additional pages. So even if this were fixed by enabling browser cache, or by some other means, it would still only pull the URLs from one of the five gallery pages. And you would still have to manually rearrange the chapters into the correct order, as this gallery is sorted most recent first.

@kolbdog3333 What I would do, at least in the interim, is enable "Keep 'Add New from URL(s)' dialog on top?" in FanFicFare's settings > Basic tab > GUI Options section. Then you can simply drag and drop the links from the webpage into the dialog box. No need to manually copy/paste, or insert blank lines yourself. This works in all the download dialogs, including "Make Anthology Epub from URLs" under Anthology Options. That won't be as seamless as you want, but it should at least be less tedious.

JimmXinu commented 1 year ago

The Browser cache feature doesn't work with "Get Story URLs from Web Page". It also only works with very few sites, mainly ffnet. I don't really have any desire to expand the Browser cache feature to other sites.

"Get Story URLs from Web Page" only works on one page. This is a deliberate design decision so it a) works fairly universally, you can point it at pages on non-supported sites; and b) isn't abused to download lengthy lists / whole sites.

Ideally, the deviantart adapter would be able to look at a chapter list somewhere instead of kludging it with the anthology feature. But as far as I know, deviantart doesn't have a true multi-chapter story feature and people are basically just faking it with individual posts.
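
To make the single-page behavior described above concrete, here is a rough Python sketch of collecting links from one already-fetched page using only the standard library. This is not FanFicFare's actual implementation; `LinkCollector` and `links_on_page` are hypothetical names, and the sketch deliberately never follows pagination.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute href targets of <a> tags, in document order."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                absolute = urljoin(self.base_url, value)
                # Keep the first occurrence only, preserving page order.
                if absolute not in self.links:
                    self.links.append(absolute)


def links_on_page(html, base_url):
    """Return the unique links found in one page's HTML (no page following)."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


# Hypothetical page content for illustration.
page = '<a href="/u/art/One-1">1</a><a href="https://example.com/x">x</a><a href="/u/art/One-1">dup</a>'
print(links_on_page(page, "https://www.deviantart.com/u/gallery"))
```

Because it only reads the HTML it is handed, a sketch like this works on any page but sees just one page of a paginated gallery.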

jcotton42 commented 1 year ago

> But as far as I know, deviantart doesn't have a true multi-chapter story feature and people are basically just faking it with individual posts.

Correct, dA's support for text posts is barely an afterthought. There's not even a standardized way to link to the next chapter in a post; posters just add links to the description or text body.

Support for traversing a gallery automatically would be possible, but I'd imagine quite brittle. At the very least you'd need to account for the sorting direction somehow. It's also not unusual (in my experience) for authors to drop things like character info or refs in the middle of the chapter gallery, so you'd need the ability to weed out "chapters" anyhow.
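
As a rough illustration of those two concerns (sort direction and non-chapter entries), here is a small Python sketch. Everything in it is hypothetical: `order_chapters`, its parameters, and the example URLs are made up for illustration, and nothing like this exists in FanFicFare as far as this thread shows.

```python
def order_chapters(urls, newest_first=True, skip=()):
    """Return gallery URLs in reading order, minus entries to weed out.

    urls         -- links in the order the gallery page lists them
    newest_first -- True if the gallery shows the latest post first
    skip         -- URLs that are not chapters (character info, refs, ...)
    """
    skipped = set(skip)
    kept = [url for url in urls if url not in skipped]
    return list(reversed(kept)) if newest_first else kept


# Hypothetical gallery listing, newest post first.
gallery = [
    "https://www.deviantart.com/someuser/art/Chapter-3-103",
    "https://www.deviantart.com/someuser/art/Character-Refs-900",
    "https://www.deviantart.com/someuser/art/Chapter-2-102",
    "https://www.deviantart.com/someuser/art/Chapter-1-101",
]
refs = ["https://www.deviantart.com/someuser/art/Character-Refs-900"]

for url in order_chapters(gallery, newest_first=True, skip=refs):
    print(url)
```

Both the sort direction and the skip list have to be supplied by the user here, which is exactly the part that would be hard to automate reliably.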

JimmXinu commented 1 year ago

That's what I thought, but I'd hoped @jcotton42 might have more ideas based on knowing more about the site. Thanks for looking into it.

So I'm going to close this issue as infeasible at this time.