bitbybyte / fantiadl

Download posts and media from Fantia
MIT License

Stopped working at some point over the past few days - Closes immediately #103

Closed Aaeeschylus closed 1 year ago

Aaeeschylus commented 1 year ago

I would like to start by saying I have been using this for quite some time and it has been a great help, so thanks!

However, at some point over the past couple of days, it appears to have stopped working. For months I had been running the following: fantiadl_v1.8.1.exe -c cookies.txt -p -t -r -m. Whenever the cookies expire, I get new ones via the extension in Firefox. But when I updated the cookies today and ran the above, it finished immediately: no errors, no messages, it just closes.

I tried updating to 1.8.2 and the same thing happened. I tried deleting the cookies file, clearing cookies in Firefox, logging back into Fantia in Firefox, and getting a fresh cookies file. This time it waited roughly 30 seconds and dumped a timeout error, but on every run since it is back to closing immediately. I have only seen it time out that once and haven't been able to reproduce it (otherwise I would have pasted it here).

I am not sure if this is being caused by something on my end, or if fantia has changed something recently, but any assistance would be greatly appreciated.

bitbybyte commented 1 year ago

Still works here on a fresh login and since you mentioned this has been ongoing for the past few days, I would have expected to see way more reports if this were widespread.

I would try logging out and in again, using the Python package instead of the executable, and also verifying your connection (VPN, proxies, firewall, similar). tracert may help here. A timeout would suggest you can't reach Fantia at all.

Aaeeschylus commented 1 year ago

Sadly, I have already tried refreshing the cookies with re-logins. Because of the timeout I also thought it might be a routing issue; however, I can access Fantia without issue from any browser and any device in my house, so I doubt that is the problem. The timeout also only occurred once, and apparently there was some maintenance going on at my ISP yesterday, so that could have been the cause.

I also did what you suggested and tried the Python package, and now I am even more confused. Running it with a cookies file kept failing with "invalid session", even though it is a brand new cookies file from my currently logged-in session. So I tried using just the session ID manually instead, and that only confused me more.

Using the session ID I copied, the script actually ran; however, it resulted in the following: Collecting paid fanclubs... Collected 0 fanclubs. I am definitely paying for a membership, so this does not make sense. I then compared the session ID to the one in the cookies file and they are exactly the same. So, using the session ID on its own, I am apparently not in a fanclub, while using the cookies file with the exact same session ID is reported as invalid. I did notice that the session ID entry in the cookies file has a #HttpOnly_ at the start.

#HttpOnly_fantia.jp FALSE / TRUE ......................

Removing the #HttpOnly_ prefix makes the cookies file work again; however, it results in the same "0 fanclubs" issue.
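For context, the #HttpOnly_ prefix is how some exporters mark HTTP-only cookies in Netscape-format files, and Python's http.cookiejar.MozillaCookieJar treats any line starting with # as a comment, silently dropping those entries. A minimal sketch of a loader that strips the prefix before parsing (load_cookies is a hypothetical helper, not fantiadl's own loader):

```python
import http.cookiejar
import tempfile


def load_cookies(path):
    """Load a Netscape cookies.txt, keeping #HttpOnly_ entries.

    MozillaCookieJar skips any line beginning with '#', so cookies
    exported with the #HttpOnly_ prefix are silently dropped unless
    the prefix is removed first.
    """
    with open(path, encoding="utf-8") as fh:
        text = fh.read().replace("#HttpOnly_", "")
    # Write the cleaned text to a temp file, since MozillaCookieJar
    # only loads from a file path.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write(text)
        tmp_path = tmp.name
    jar = http.cookiejar.MozillaCookieJar(tmp_path)
    jar.load(ignore_discard=True, ignore_expires=True)
    return jar
```

This would explain why the file "works" once the prefix is removed: the entry becomes visible to the parser at all.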

Do you have any thoughts on what could be going on here?

bitbybyte commented 1 year ago

What method do you use to log in to Fantia (third-party sign-on, email) and from what browser? https://www.whatismybrowser.com/detect/what-is-my-user-agent/

What do you see from this page? Do you see HTTPS in the resolved URL? https://fantia.jp/mypage/users/plans?type=not_free&page=1

Also, rather than exporting the cookies file, if you can show me what your cookies look like from the browser storage that would probably help (redact the session value):

Mozilla Firefox

    On https://fantia.jp/, press Ctrl + Shift + I to open Developer Tools.
    Select the Storage tab at the top. In the sidebar, select https://fantia.jp/ under the Cookies heading.
    Locate the _session_id cookie name. Click on the value to copy it.

Google Chrome

    On https://fantia.jp/, press Ctrl + Shift + I to open DevTools.
    Select the Application tab at the top. In the sidebar, expand Cookies under the Storage heading and select https://fantia.jp/.
    Locate the _session_id cookie name. Click on the value to copy it.
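Once copied, the value can be sanity-checked outside the browser. A minimal sketch with requests, assuming only the _session_id cookie name and the plans URL from this thread; looks_logged_in and check_session are hypothetical helpers, and treating a redirect away from /mypage/ as an invalid session is a heuristic, not documented Fantia behavior:

```python
PLANS_URL = "https://fantia.jp/mypage/users/plans?type=not_free&page=1"


def looks_logged_in(status_code, final_url):
    """Heuristic: a valid session stays on the HTTPS plans page;
    an invalid one gets redirected (login page, recaptcha page)."""
    return (status_code == 200
            and final_url.startswith("https://fantia.jp/mypage/")
            and "/recaptcha" not in final_url)


def check_session(session_id):
    """Fetch the plans page using only the copied session value."""
    import requests  # imported lazily so the helper above stays standalone
    session = requests.Session()
    session.cookies.set("_session_id", session_id, domain="fantia.jp")
    response = session.get(PLANS_URL, headers={"User-Agent": "Mozilla/5.0"})
    return looks_logged_in(response.status_code, response.url)
```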
Aaeeschylus commented 1 year ago

I use email to log in via Firefox, which is what I have been using since I found fantiadl. Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/109.0

HTTPS is also in the resolved URL (which makes the HttpOnly in the cookies file make even less sense).

(screenshot of the cookies attached) Here are the cookies. When I mentioned previously that I tried the session ID instead of the cookies file, I copied the session ID from Firefox, not from the cookies file.

bitbybyte commented 1 year ago

But on https://fantia.jp/mypage/users/plans?type=not_free&page=1 do you see plans listed?

Also, are you able to download individual posts or a profile specified manually (rather than using -p)?

Aaeeschylus commented 1 year ago

When I go to that link, the plans are indeed listed there.

When I try to download the latest post from the one account I am subbed to, I get the following:

```
C:\Temp\Fantia>fantiadl.py -c cookies.txt https://fantia.jp/posts/postID
Downloading post {POST ID}...
Traceback (most recent call last):
  File "C:\Temp\Fantia\fantiadl.py", line 111, in <module>
    downloader.download_post(url_groups[1])
  File "C:\Temp\Fantia\models.py", line 415, in download_post
    response.raise_for_status()
  File "C:\Users\username\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\models.py", line 960, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://fantia.jp/posts/postID
```

But if I try to view any of the content of that exact same post myself on Fantia, it works perfectly fine.
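To tell a plain 403 apart from a CAPTCHA interstitial, the failing response can be inspected before raise_for_status is called. A rough sketch (describe_http_error is a hypothetical diagnostic helper, not part of fantiadl):

```python
import requests


def describe_http_error(response):
    """Summarize a failed response to distinguish an ordinary 403
    from a CAPTCHA/interstitial page served in place of the post."""
    hints = []
    if response.status_code == 403:
        hints.append("403 Forbidden: session rejected or IP flagged")
    if "recaptcha" in response.text.lower():
        hints.append("body mentions reCAPTCHA: interstitial served instead of the post")
    return "; ".join(hints) or "HTTP {}".format(response.status_code)
```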

bitbybyte commented 1 year ago

This is likely related to #104. I'm closing this as a dupe.

Aaeeschylus commented 1 year ago

> This is likely related to #104. I'm closing this as a dupe.

I would just like to bring up that direct links to posts are working now with the new models.py; however, it still does what I described above and reports 0 paid fanclubs found.

bitbybyte commented 1 year ago

fanclub_links = response_page.select("div.mb-5-children > div:nth-of-type(1) a[href^=\"/fanclubs\"]")

Does this selector find anything on the plans page I listed above? Open DevTools, verify the page hierarchy, and try searching for that selector.
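The same check can be scripted against a saved copy of the page (Ctrl+S in the browser, or the printed response_page). A minimal sketch using BeautifulSoup with the selector quoted above; find_fanclub_links is a hypothetical helper:

```python
from bs4 import BeautifulSoup

# The selector fantiadl runs against the plans page, as quoted above.
SELECTOR = 'div.mb-5-children > div:nth-of-type(1) a[href^="/fanclubs"]'


def find_fanclub_links(html):
    """Return the fanclub hrefs the selector matches in the given HTML."""
    page = BeautifulSoup(html, "html.parser")
    return [a["href"] for a in page.select(SELECTOR)]
```

If this returns links for the browser-saved page but nothing for the page fantiadl receives, the two responses differ, which points at the server treating the two clients differently.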

Aaeeschylus commented 1 year ago

(screenshot attached) It is definitely there in Firefox.

I printed out the response_page from models.py and there was no reference to either /fanclubs/ID or mb-5-children at all.

bitbybyte commented 1 year ago

We'd need the response output to compare against what's expected and confirm it actually returns the plans page.

Aaeeschylus commented 1 year ago

responsePageOutput.txt Here is the output of response_page. I was going to paste it, but it's too long.

bitbybyte commented 1 year ago

<meta content="https://fantia.jp/recaptcha" property="og:url"/> You're bumping up against some kind of CAPTCHA, which explains why it's specific to you. But I'm not sure what can be done or what would trigger it. It would seem your account/IP was flagged, since you are hitting it on that page consistently.

This plans page is just a simple HTML request; we don't even touch the API here. Maybe we can find a way to bring up a browser to extract a CAPTCHA token.
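Since the interstitial announces itself in its og:url meta tag, the redirect can be detected before scraping. A minimal sketch (is_recaptcha_page is a hypothetical helper, not part of fantiadl):

```python
from bs4 import BeautifulSoup


def is_recaptcha_page(html):
    """Detect Fantia's CAPTCHA interstitial by its og:url meta tag,
    so the scraper can fail loudly instead of finding 0 fanclubs."""
    page = BeautifulSoup(html, "html.parser")
    meta = page.find("meta", property="og:url")
    return bool(meta) and meta.get("content", "").endswith("/recaptcha")
```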

Aaeeschylus commented 1 year ago

> <meta content="https://fantia.jp/recaptcha" property="og:url"/> You're bumping up against some kind of CAPTCHA, which explains why it's specific to you. But I'm not sure what can be done or what would trigger it. It would seem your account/IP was flagged, since you are hitting it on that page consistently.
>
> This plans page is just a simple HTML request; we don't even touch the API here. Maybe we can find a way to bring up a browser to extract a CAPTCHA token.

Any thoughts on how to potentially resolve this? I held off running fantiadl for two weeks to see if the CAPTCHA requirement would clear itself, but it hasn't. I tried a VPN too, and that did nothing.