fmd-project-team / FMD

The new FMD fork! Join our community on Discord!
https://discord.gg/cXKKgw3
GNU General Public License v2.0

Can't Download Mangas [TumangaOnline] #71

Closed megumixd closed 5 years ago

megumixd commented 5 years ago

Hey, I'm having problems downloading manga from TMO. I don't know whether it's my computer or the site, but I hope you can check the Lua module, please. I'd appreciate it, thanks. Oh, and I'm using the new FMD.

Tmp341 commented 5 years ago

https://github.com/fmd-project-team/FMD/wiki/CF-Workaround:-LHTranslation

I did something similar to the workaround above:

__cfduid=xxxxxxxxxxxxxxxx; tmoR=xxxxxxxxxxxxxxxxxxxxx; tumangaonline_session=xxxxxxxxxxxxxxxxxxxxxxxxxxxxx; XSRF-TOKEN=xxxxxxxxxxxxxxxxxxxx

Remove = at the end of each line.

Don't forget to copy your browser's User-Agent string and use it in FMD.
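
To illustrate how the copied cookie string and the browser's User-Agent belong together, here is a minimal Python sketch. It is not FMD code (FMD is written in Pascal); the function names `parse_cookie_header` and `build_headers` and all values are hypothetical placeholders.

```python
def parse_cookie_header(raw):
    """Turn 'name=value; name2=value2' into a dict."""
    cookies = {}
    for part in raw.split(";"):
        part = part.strip()
        if not part:
            continue
        # Split on the first '=' only, so the rest stays part of the value.
        name, _, value = part.partition("=")
        cookies[name.strip()] = value.strip()
    return cookies


def build_headers(raw_cookies, user_agent):
    """Build request headers that reuse the browser's cookies and User-Agent."""
    cookies = parse_cookie_header(raw_cookies)
    return {
        "Cookie": "; ".join(f"{k}={v}" for k, v in cookies.items()),
        "User-Agent": user_agent,
    }


# Placeholder values only; copy the real ones from the browser that passed the check.
headers = build_headers(
    "__cfduid=xxxx; tmoR=xxxx; tumangaonline_session=xxxx; XSRF-TOKEN=xxxx",
    "Mozilla/5.0 (placeholder user agent)",
)
print(headers["Cookie"])
```

The point of the sketch is that both headers come from the same browser session; sending the cookies with a different User-Agent is a likely reason for the workaround to fail.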

Tmp341 commented 5 years ago

@megumixd If the workaround I've explained works for you, please close this issue.

ZeroCool940711 commented 5 years ago

@Tmp341 Hi there, I'm having this problem on TuMangaOnline too. I tried the workaround you mentioned, but nothing happened. I double-checked everything and noticed that FMD can "see" the content on TuMangaOnline: it gets the number of chapters and creates the folders for them, but no content is being downloaded.

Tmp341 commented 5 years ago

I've just selected a random manga and downloaded. It looks like the workaround works.

ZeroCool940711 commented 5 years ago

@Tmp341 I have a question: could you add a small Python script to FMD to bypass Cloudflare on TuMangaOnline and other sites? I've used cfscrape in Python to bypass Cloudflare successfully many times. The problem we have with FMD is that the websites automatically block FMD, and sometimes our IP, from accessing the site. I'm not sure if you know this, but the session you reuse via the cookies copied from the browser in your workaround only lasts a few minutes. I'm not sure exactly how long, but it isn't permanent. With cfscrape you could send a simple request to the website to get the Cloudflare session and pass those values to the rest of FMD automatically; if the previous session has expired, you just send a new request to get a new one.

The script I propose to use will look like this:

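import cfscrape

# create_scraper() returns a CloudflareScraper instance, usable like a requests.Session
scraper = cfscrape.create_scraper()
# get_tokens() solves the challenge and returns the Cloudflare cookies and the User-Agent used
tokens, user_agent = cfscrape.get_tokens('http://tmofans.com/')
# print the Cloudflare tokens and user agent
print(tokens, user_agent)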

That will print something like this:

{'cf_clearance': '', '__cfduid': 'df630feed2d850352ad842d0ec62e9a3b1554955364'} Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/65.0.3325.181 Chrome/65.0.3325.181 Safari/537.36

ZeroCool940711 commented 5 years ago

Also, cfscrape can return the whole content of the page if needed.
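
The "request a new session when the old one expires" idea from the proposal above could be sketched roughly like this. The class name `TokenCache`, the 300-second lifetime, and the stand-in solver are assumptions for illustration; the thread only says the cookies last "a few minutes".

```python
import time


class TokenCache:
    """Cache Cloudflare tokens and fetch new ones once they are considered stale."""

    def __init__(self, solver, max_age_seconds, clock=time.monotonic):
        self._solver = solver            # callable: url -> (tokens, user_agent)
        self._max_age = max_age_seconds  # assumed lifetime, not a Cloudflare guarantee
        self._clock = clock
        self._entries = {}               # url -> (fetched_at, tokens, user_agent)

    def get(self, url):
        entry = self._entries.get(url)
        now = self._clock()
        if entry is None or now - entry[0] >= self._max_age:
            tokens, user_agent = self._solver(url)
            entry = (now, tokens, user_agent)
            self._entries[url] = entry
        return entry[1], entry[2]

    def invalidate(self, url):
        """Drop cached tokens, e.g. after the site answers with a new challenge."""
        self._entries.pop(url, None)


# Stand-in solver for illustration; in practice this could be cfscrape.get_tokens.
def fake_solver(url):
    fake_solver.calls += 1
    return {"cf_clearance": "token-%d" % fake_solver.calls}, "Mozilla/5.0 (placeholder)"


fake_solver.calls = 0

cache = TokenCache(fake_solver, max_age_seconds=300)
first = cache.get("http://tmofans.com/")
second = cache.get("http://tmofans.com/")  # served from the cache, no new solve
cache.invalidate("http://tmofans.com/")
third = cache.get("http://tmofans.com/")   # solver runs again
```

Passing the solver in as a callable keeps the cache independent of whichever bypass library currently works.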

Tmp341 commented 5 years ago

I don't know anything about coding. Let's ask @SDXC

ralvarador commented 5 years ago

Pretty interesting... I applied this workaround to MangaDex, and it seems to have worked. I copied all the cookies generated by the website (including Cloudflare's), but I was using a different User-Agent string (🤦‍♂️). Once I changed the User-Agent string to the one from the web browser used to get the cookies, it was able to "see" the chapters. BTW, I've set "MaxConnectionLimit" to 1 for MangaDex.

SDXC commented 5 years ago

@ZeroCool940711 I'm currently trying to implement it directly in Pascal, but I need a bit of time. Using an external solver would be the backup plan.

SDXC commented 5 years ago

@ZeroCool940711 Are you fluent in Python, by chance? Would it be possible to prepare the slimmest possible Python package (embeddable, no installer or bloat, with all unneeded scripts/libraries/tools removed), including the scraper script? And an example command-line call for the script.

ZeroCool940711 commented 5 years ago

@SDXC I wouldn't say I'm a Python expert, but I've been using it for quite some time, so I know my way around it. What do you need? An executable that you can call from the command line to get the CF cookies easily?

I can do that. I can use py2exe, PyInstaller, or some other library to create a single-file exe with everything you need to run the script.

ZeroCool940711 commented 5 years ago

@SDXC I can make it so you can use the same script for any site or URL, for example, it can take the URL as a command line argument and give you back the CF information for the site so you can use it on the FMD app as you like.

ZeroCool940711 commented 5 years ago

@SDXC Here is the script. I created a repo which has 5 files if we count the Readme.md:

The script receives a single URL as a console argument. If the URL has no http or https prefix (in case you forgot it), the script adds one. It then creates a cfscrape session, checks the URL to get the CF token and user agent, and prints them to the console.
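
A command-line wrapper of the kind described above might look roughly like the following. This is a sketch, not the script from the repo: `normalize_url`, `main`, and the choice of `http://` as the default scheme are assumptions, and the solver is injectable so the logic can be tried without cfscrape installed.

```python
import argparse


def normalize_url(url):
    """Prepend http:// when the URL has no http:// or https:// scheme."""
    if not url.lower().startswith(("http://", "https://")):
        return "http://" + url  # assumed default scheme
    return url


def main(argv=None, solver=None):
    parser = argparse.ArgumentParser(description="Print Cloudflare tokens for a URL")
    parser.add_argument("url", help="site to query, with or without http(s)://")
    args = parser.parse_args(argv)
    url = normalize_url(args.url)
    if solver is None:
        # Assumption: cfscrape (or a compatible fork) is installed.
        import cfscrape
        solver = cfscrape.get_tokens
    tokens, user_agent = solver(url)
    print(tokens, user_agent)
    return tokens, user_agent
```

To use it as a script, call `main()` from an `if __name__ == "__main__":` guard and run it with the site as the only argument.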

SDXC commented 5 years ago

@ZeroCool940711 Hey, thanks. I really appreciate your effort. I will test it this evening. If it works, I will start implementing it in FMD.

It seems that CF changed their challenge this month and many bypass scripts/tools won't work anymore and some of them aren't maintained anymore as well >_> Hope cfscrape is one of the working scripts ^^

SDXC commented 5 years ago

@ZeroCool940711 I've tested it now. The original script is broken, but someone made a working fork: https://github.com/lukele/cloudflare-scrape/blob/update-challenge-solver/cfscrape/__init__.py

Could you recompile your exe with the fix?

ZeroCool940711 commented 5 years ago

Working on an update for the script, give me a few minutes and I will have the changes on the repository.

ZeroCool940711 commented 5 years ago

@SDXC Done, I've updated the code with a fork of cfscrape that works. Keep in mind that TMOFans.com (tumangaonline.com) sometimes doesn't use Cloudflare. When that happens, the script just uses requests to get the content instead of cloudscraper (the new cfscrape fork I'm using), and the cf_clearance field will be empty. Test it and tell me if I need to update anything else.
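
The fallback described above needs some way to tell whether a response is a Cloudflare challenge. One possible heuristic is sketched below; the function names are hypothetical, and the markers (HTTP 503, a `Server` header starting with "cloudflare", and the `jschl_vc`/`jschl_answer` form fields) reflect the JavaScript challenge of that period as far as I know, so treat them as assumptions that may no longer hold.

```python
def looks_like_cloudflare_challenge(status_code, headers, body):
    """Heuristic check for a Cloudflare JavaScript challenge page."""
    # Note: a plain dict is case-sensitive; requests' headers object is not.
    server = headers.get("Server", "").lower()
    return (
        status_code == 503
        and server.startswith("cloudflare")
        and b"jschl_vc" in body
        and b"jschl_answer" in body
    )


def choose_fetcher(status_code, headers, body):
    """Return the name of the client to use for the rest of the session."""
    if looks_like_cloudflare_challenge(status_code, headers, body):
        return "cloudscraper"
    return "requests"
```

When the check is negative there is no challenge to solve, which is consistent with the empty cf_clearance field mentioned above.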

SDXC commented 5 years ago

@ZeroCool940711 Doesn't seem to work:

Traceback (most recent call last):
  File "cfscraper.py", line 5, in <module>
  File "site-packages\cloudscraper\__init__.py", line 230, in create_scraper
  File "site-packages\cloudscraper\__init__.py", line 71, in __init__
  File "site-packages\cloudscraper\user_agent\__init__.py", line 17, in __init__
  File "site-packages\cloudscraper\user_agent\__init__.py", line 25, in loadUserAgent
IOError: [Errno 2] No such file or directory: 'C:\Users\MICHEL~1\AppData\Local\Temp\_MEI51~1\cloudscraper\user_agent\browsers.json'
[4712] Failed to execute script cfscraper

ZeroCool940711 commented 5 years ago

That's weird, it worked for me, let me double check.

SDXC commented 5 years ago

I just put the exe in a folder together with node.exe, opened cmd, changed to the directory where the exe is located, and tried this:

cfscraper.exe https://mangadex.org

That worked at least a bit better with the old build...

ZeroCool940711 commented 5 years ago

It seems the problem is with the exe; the script itself works without any problem. From what I've seen, the exe can't find the user_agent list.
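
The `_MEI...` temp folder and the "Failed to execute script" message in the traceback suggest a PyInstaller one-file build whose bundle is missing `browsers.json`. As a diagnostic sketch (the helper name `resource_path` is hypothetical), the following resolves where a bundled data file would be expected, using the `sys._MEIPASS` attribute that PyInstaller sets in frozen executables:

```python
import os
import sys


def resource_path(relative_path):
    """Resolve a data file both inside a frozen exe and when run as a plain script."""
    # PyInstaller sets sys._MEIPASS to the folder where bundled files are unpacked;
    # outside a frozen build, fall back to the current working directory.
    base_dir = getattr(sys, "_MEIPASS", os.path.abspath("."))
    return os.path.join(base_dir, relative_path)


# Relative path taken from the traceback above.
browsers_json = resource_path(os.path.join("cloudscraper", "user_agent", "browsers.json"))
print(browsers_json, os.path.exists(browsers_json))
```

If the file is reported as missing inside the exe, the data file probably has to be added to the bundle at build time, for example through PyInstaller's `--add-data` option.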

SDXC commented 5 years ago

I used https://github.com/lukele/cloudflare-scrape/blob/update-challenge-solver/cfscrape/__init__.py for now, which works fine in most cases. I compiled it with your script, and the build succeeded. Thanks again.

Now I will implement a mechanism to use the bypass script in FMD at a very basic level: if FMD encounters a CF page, it will run the script, store the cookies in a variable, and pass them to the routines that follow. I'd say in most cases that will already be enough. For the rest, it will still be necessary to use the workaround (psychoplay, for example).
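
The flow described above (run the helper, capture the cookies, pass them on) can be sketched in Python as follows. FMD itself is written in Pascal, so this only illustrates the idea; `parse_helper_output` and `run_helper` are hypothetical names, the output format is assumed to match the sample printed earlier in the thread (a flat cookie dict followed by the User-Agent), and the sample cookie value is shortened.

```python
import ast
import subprocess


def parse_helper_output(line):
    """Split "{...cookies...} user agent" into a cookie dict and a User-Agent string."""
    end = line.index("}") + 1  # assumes a flat dict with no '}' inside the values
    cookies = ast.literal_eval(line[:end])
    user_agent = line[end:].strip()
    return cookies, user_agent


def run_helper(exe_path, url, timeout=60):
    """Run the bypass helper executable and return (cookies, user_agent)."""
    result = subprocess.run(
        [exe_path, url], capture_output=True, text=True, timeout=timeout, check=True
    )
    return parse_helper_output(result.stdout.strip())


sample = "{'cf_clearance': '', '__cfduid': 'df630fee'} Mozilla/5.0 (X11; Linux x86_64)"
cookies, user_agent = parse_helper_output(sample)
# Skip empty cookies, such as cf_clearance when the site is not behind Cloudflare.
cookie_header = "; ".join("%s=%s" % (k, v) for k, v in cookies.items() if v)
```

The later requests would then send `cookie_header` together with the returned `user_agent`, since the cookies are tied to the User-Agent that solved the challenge.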

ralvarador commented 5 years ago

Some time ago I was using the CrunchyManga project (https://github.com/7ouma/CrunchyManga). Its author also had problems with py2exe and cfscrape; because of that, it was necessary to install Python and Node.js in order to run the script instead of an .exe. The project is not maintained anymore.

SDXC commented 5 years ago

cfscrape (in our case it will be called cf_bypass) will also be an exe built from Python, and node.exe will be needed as well. But only these two exe files are needed. For now it works pretty well.

ChocolateOtaku commented 5 years ago

You don't need python, you may use node and a corresponding script based on e.g. cloudscraper to get this done. I'm pretty sure someone could figure out how to tie things together ...