p0ody / ff2ebook

WIP.
http://www.ff2ebook.com

Can't use ff2ebook.com to download new chapters #37

Closed dublinshane closed 2 years ago

dublinshane commented 3 years ago

Just tried to download 10 new chapters of "Death's Little Brother", but ff2ebook.com just downloaded the original story (75 chapters). Anyone know why this is happening and when it might be resolved?

bastien8060 commented 3 years ago

are you trying to use FanFiction.net? Or FictionPress?

bastien8060 commented 3 years ago

Since FanFiction.net stopped working with ff2ebook, you only got an old version of the story. Do you want to try this? https://ff2.theyoungappy.com

Be mindful that >40 chapters is a lot for this workaround. It will take a while.

StarWolf3000 commented 3 years ago

Your server is currently responding with a 503 error.

StarWolf3000 commented 3 years ago

And when the server works, the script doesn't. It throws the same errors as ff2ebook (all kinds of missing metadata and then stuck at retrieving fiction chapters) or gets stuck at retrieving that metadata. FFN btw.

bastien8060 commented 3 years ago

Yes, my server is currently down, as I'm changing my host. It stopped working 3 days ago btw.

bastien8060 commented 3 years ago

Also, on the old server: read-only disk -> corrupted ext4 partition.

Edit: It seems that for some reason it works again. I'll fix the script.

StarWolf3000 commented 3 years ago

In the meantime, I've been successfully using https://fichub.net

bastien8060 commented 3 years ago

> In the meantime, I've been successfully using https://fichub.net

Would you give it a try in an hour? I changed something, I need to git blame 😅

StarWolf3000 commented 3 years ago

> Would you give it a try in an hour? I changed something, i need to git blame

Sure.

bastien8060 commented 3 years ago

The error is because I fixed a security issue

bastien8060 commented 3 years ago

There was a vuln where you could pass arbitrary commands to bash as you gave the url to the python script

bastien8060 commented 3 years ago

If the url contained a command. It's now fixed, but the url doesn't get through?
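For readers following along, the class of bug described above (a URL ending up inside a shell command) can be sketched in Python. This is only an illustration under assumptions, not the fix that was applied to the server: `ALLOWED_HOSTS`, `is_story_url`, and `run_fetcher` are hypothetical names, and the child process here just echoes its argument back in place of the real script.

```python
import subprocess
import sys
from urllib.parse import urlparse

# Assumption: only these hosts are accepted; adjust for the sites you support.
ALLOWED_HOSTS = {"www.fanfiction.net", "fanfiction.net"}


def is_story_url(url: str) -> bool:
    """Accept only http(s) URLs whose host is on the allow-list."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and parsed.hostname in ALLOWED_HOSTS


def run_fetcher(url: str) -> str:
    """Pass the URL to a child process as one argv entry, never via a shell."""
    if not is_story_url(url):
        raise ValueError(f"refusing URL: {url!r}")
    # Stand-in for the real script: the child prints the argument it received.
    # An argument list (no shell=True) means ';', '$()' and backticks in the
    # URL are plain characters, not shell syntax.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", url],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")
```

The allow-list alone does not stop shell metacharacters in the path; it is the argument list that keeps the URL from being interpreted as a command.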

StarWolf3000 commented 3 years ago

Should I now wait an hour or is it already live? If it's already live, then it doesn't work.

bastien8060 commented 3 years ago

https://ff2.theyoungappy.com/

bastien8060 commented 3 years ago

It works now!

bastien8060 commented 3 years ago

The error was stupid: I forgot I was passing the full URL into the function getPageSource(), like this: $this->getPageSource($fullurl=$this->getURL());

It was a test and I forgot to remove it.

The content the script was retrieving was an error response saying the chapter did not exist:

https://www. https://www.fanfiction.net/s/7564597/1/Lifes-what-you-make-it /1 <
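A minimal Python sketch of how a doubled URL like the one above can arise, and one way to guard against it. The URL template, `STORY_ID_RE`, `chapter_url_naive`, and `chapter_url` are assumptions made for illustration; the thread does not show what getPageSource() or getURL() actually do internally.

```python
import re

# Assumption: the story id is the number after /s/ in an FFN story URL.
STORY_ID_RE = re.compile(r"fanfiction\.net/s/(\d+)")


def chapter_url_naive(story: str, chapter: int) -> str:
    """Mirrors the bug: assumes `story` is a bare host + path."""
    return f"https://www.{story}/{chapter}"


def chapter_url(story: str, chapter: int) -> str:
    """Accepts either a bare path or a full URL by extracting the story id."""
    match = STORY_ID_RE.search(story)
    if match is None:
        raise ValueError(f"no story id found in {story!r}")
    return f"https://www.fanfiction.net/s/{match.group(1)}/{chapter}"


full = "https://www.fanfiction.net/s/7564597/1/Lifes-what-you-make-it"
print(chapter_url_naive(full, 1))  # scheme and host end up duplicated
print(chapter_url(full, 1))
```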

Screenshot_2021-02-15_11-11-02

As you can see, the API is not really fast.

StarWolf3000 commented 3 years ago

Yes, when it connected, the returned error was that no chapter text was found. Also, what is your current timeout setting for a connection to get a chapter, and what is your total script timeout setting? As stories usually get longer over time, the script also has to retrieve more chapters, increasing the total execution time.

The largest one I have in my library has 144 chapters at almost 1 million words.
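To make the two timeouts in that question concrete, here is a small Python sketch of a per-chapter timeout combined with a total time budget. The 30 s and 600 s defaults are arbitrary placeholders, not the server's real settings, and `fetch_all` and its `fetch` callback are hypothetical names.

```python
import time
from typing import Callable, List


def fetch_all(
    chapters: int,
    fetch: Callable[[int, float], str],
    per_chapter_timeout: float = 30.0,   # placeholder value
    total_budget: float = 600.0,         # placeholder value
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Fetch chapters 1..chapters, enforcing both timeouts.

    `fetch(n, timeout)` is expected to return chapter n or raise if it
    takes longer than `timeout` seconds.
    """
    deadline = clock() + total_budget
    results = []
    for n in range(1, chapters + 1):
        remaining = deadline - clock()
        if remaining <= 0:
            raise TimeoutError(f"total budget exhausted before chapter {n}")
        # Never let one chapter wait longer than what is left of the budget.
        results.append(fetch(n, min(per_chapter_timeout, remaining)))
    return results
```

With a fixed total budget, a 144-chapter story gets the same overall limit as a 10-chapter one, so scaling the budget with the chapter count is one option to consider.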

bastien8060 commented 3 years ago

Did it work?

bastien8060 commented 3 years ago

For the timeout, yes and no. I could make use of HTTP/2.

bastien8060 commented 3 years ago

HTTP/2 brings asynchronous (multiplexed) calls, meaning one call doesn't block another, so it's much faster. They would all end at about the same time.
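The "one call doesn't block the other" idea can be sketched in Python with asyncio. Note the assumptions: this does not speak HTTP/2 at all, `fetch_chapter` is a hypothetical stand-in that sleeps instead of making a network request, and the 0.1 s delay is made up. It only shows why overlapping calls finish sooner than sequential ones.

```python
import asyncio
import time


async def fetch_chapter(n: int, delay: float = 0.1) -> str:
    """Stand-in for a network call; sleeps instead of fetching."""
    await asyncio.sleep(delay)
    return f"chapter {n}"


async def fetch_sequential(count: int) -> list:
    # Each call waits for the previous one to finish.
    return [await fetch_chapter(n) for n in range(1, count + 1)]


async def fetch_concurrent(count: int) -> list:
    # All calls are in flight at once; gather() keeps the results in order.
    return await asyncio.gather(*(fetch_chapter(n) for n in range(1, count + 1)))


def timed(coro_fn, count: int):
    """Run one of the coroutines above and return (results, seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro_fn(count))
    return list(result), time.perf_counter() - start


if __name__ == "__main__":
    _, t_seq = timed(fetch_sequential, 5)
    _, t_con = timed(fetch_concurrent, 5)
    print(f"sequential: {t_seq:.2f}s, concurrent: {t_con:.2f}s")
```

Against a real site, the number of requests in flight would probably need a cap to avoid hammering the server.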

StarWolf3000 commented 3 years ago

Took a few minutes, but it worked. Some chapters were retrieved really slowly, so the progress didn't react.

bastien8060 commented 3 years ago

I have an HTTP/3 backend on a test server. Can you give me a URL with a looot of chapters?

I can record a HAR file of the API calls, e.g. the time each one took, the response, etc.

bastien8060 commented 3 years ago

You can then import it into Chrome so you can see how fast it is and how much of a difference it makes.
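A HAR file is plain JSON, so it can also be summarized without a browser. The Python sketch below relies on the standard HAR 1.2 layout (`log.entries[]`, each with a `time` in milliseconds and a `request.url`). `summarize_har` is a hypothetical helper, and the sample data is made up rather than taken from the logs attached in this thread.

```python
import json


def summarize_har(har_text: str) -> dict:
    """Return request count, summed time, and the slowest request of a HAR log."""
    entries = json.loads(har_text)["log"]["entries"]
    if not entries:
        return {"requests": 0, "total_ms": 0.0, "slowest_url": None}
    slowest = max(entries, key=lambda e: e["time"])
    return {
        "requests": len(entries),
        # Sum of per-request times; this overstates wall-clock time when
        # requests overlap.
        "total_ms": sum(e["time"] for e in entries),
        "slowest_url": slowest["request"]["url"],
    }


# Made-up sample data for illustration only.
SAMPLE = json.dumps({
    "log": {
        "entries": [
            {"time": 1200.0, "request": {"url": "https://example.invalid/chapter/1"}},
            {"time": 2300.0, "request": {"url": "https://example.invalid/chapter/2"}},
        ]
    }
})

print(summarize_har(SAMPLE))
```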

StarWolf3000 commented 3 years ago

https://www.fanfiction.net/s/8798317/1/The-Console-Wars (the one with 144 chapters)

I also attached my own log for this url: 8798317.ff2.har.zip

bastien8060 commented 3 years ago

Hey! Just finished trying to fetch your long story.

bastien8060 commented 3 years ago

I'm gonna attach the HAR anyway: ff2.har.log

bastien8060 commented 3 years ago

Here is the HAR for a smaller fanfic (18 chapters / 350k words): smaller-ff2.har.log

On the smaller fanfic:

There is a small error rate however, and I cannot find the source of the issue. It is as if preg_match() randomly failed.

When that happens, it :

I tried to output $source and $url when it fails but everything is correct. If you have any idea, please do share :)

bastien8060 commented 3 years ago

> https://www.fanfiction.net/s/8798317/1/The-Console-Wars (the one with 144 chapters)
>
> I also attached my own log for this url: 8798317.ff2.har.zip

@StarWolf3000, based on your log, HTTP/3 beat HTTP/1 by 2 min, i.e. yours took 3.4 min and HTTP/3 took 1.4 min.

jmsundar commented 3 years ago

“An error has occurred” when trying to download MOBI file. ePub worked fine.

iridescent-beacon commented 3 years ago

Hi @bastien8060, @p0ody et al; your friendly neighbor over at fichub.net here! Noticed this particular github issue showing up in my referrers log and thought I'd drop by and offer a hand. There's a discord of ff related programmers that I can send you an invite to, and you're more than welcome to join fichub's discord (link on the homepage) or contact me directly.

Most of us that have worked around the current FFN/CF issue use https://github.com/FlareSolverr/FlareSolverr -- and I've set up a proxy based on that which I'm more than happy to share if you want something more drop-in or to pool resources.
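For anyone unfamiliar with FlareSolverr, here is a rough Python sketch of calling a locally running instance. The endpoint (`http://localhost:8191/v1`) and the `cmd`, `url`, `maxTimeout`, and `solution.response` fields follow FlareSolverr's v1 HTTP API as documented in its README, so check them against the version you run. `build_payload` and `fetch_via_flaresolverr` are hypothetical helper names, and this is not the proxy mentioned above.

```python
import json
import urllib.request

# Assumption: a FlareSolverr instance is listening on its default local port.
FLARESOLVERR_URL = "http://localhost:8191/v1"


def build_payload(url: str, max_timeout_ms: int = 60000) -> bytes:
    """Build the JSON body asking FlareSolverr to GET `url`."""
    return json.dumps(
        {"cmd": "request.get", "url": url, "maxTimeout": max_timeout_ms}
    ).encode("utf-8")


def fetch_via_flaresolverr(url: str, endpoint: str = FLARESOLVERR_URL) -> str:
    """Return the page HTML as fetched by FlareSolverr."""
    request = urllib.request.Request(
        endpoint,
        data=build_payload(url),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Allow a bit more than maxTimeout for the round trip.
    with urllib.request.urlopen(request, timeout=90) as response:
        body = json.load(response)
    if body.get("status") != "ok":
        raise RuntimeError(body.get("message", "FlareSolverr request failed"))
    return body["solution"]["response"]
```

Usage would be along the lines of `html = fetch_via_flaresolverr("https://www.fanfiction.net/s/8798317/1/The-Console-Wars")`, with the existing chapter parsing applied to `html` afterwards.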

Please let me know if there's anything I can help with or if you have any questions! We're in a niche but friendly community :)

bastien8060 commented 3 years ago

@iridescent-beacon Sure, I'd be interested to check. I've been quite busy recently, but I'm still looking every so often at reverse engineering the fanfiction.net app. It is very fast, and they have 10 different endpoints. The communication is end-to-end encrypted with AES-128 with a 16-byte hardcoded IV + key. It takes their mobile app 1 sec or less to download a fanfiction. The HTTP authentication key to their API hasn't changed for the last 3 years.

bastien8060 commented 3 years ago

If anyone is interested in looking at the results I have gotten, or wants to try writing a custom HTTP client to use this end-to-end encryption, I'd be more than happy to share what I have. The API communication is encrypted on the client side and decrypted with the same key on the backend.

bastien8060 commented 3 years ago

I've been experimenting too with the Python modules cfscrape + cloudscraper, but these need proxies to be efficient in terms of bandwidth. The issue is that it makes the script slow, I don't have paid proxies, and you need to keep a proxy list updated. Your proxy implementation with FlareSolverr is a good idea though.

iridescent-beacon commented 3 years ago

Great! I realized as I was typing the original message that github doesn't have direct messaging, or at least not that I could find, so I've just sent you an email with the invite to the email listed on your profile. It's been a while since I've fiddled with getting my emails accepted by gmail though so if you don't see it right away please check your spam. Or feel free to join fichub's and I can send you the invite there.

As I mentioned in the email, I believe there may be some other people there poking at the ffn app.

cloudscrape worked for me briefly, and apparently still works for some people. At one point it said only the paid version would work, but none of us could find the paid version so we've largely moved on. What sort of request rate to FFN does ff2ebook need to do, if you don't mind sharing?

bastien8060 commented 3 years ago

That would be a question for @p0ody, though ff2ebook got a substantial number of downloads at some point (~90 GB gzip archive of EPUBs for 2019 alone).

bastien8060 commented 3 years ago

Thanks too for the email, I will be looking at this in about an hour or so. I'm finishing a quick bit of work setting up a Docker image (a project of my own), because Alpine Linux doesn't support Python's prebuilt wheels. This means I have to compile all my dependencies from source :)

bastien8060 commented 3 years ago

@iridescent-beacon Hey, sorry, I double-checked my email, but then realized the email on my GitHub is outdated. I updated it, would you mind forwarding your email to my new one? Many thanks :)

iridescent-beacon commented 3 years ago

@bastien8060 no problem; forwarded to your updated email :)