prof79 / fansly-downloader-ng

Easy-to-use fansly.com content downloading tool. Written in Python and available as a standalone Windows Executable. Enjoy your Fansly content offline anytime, anywhere in the highest possible content resolution! Fully customizable to download in bulk or single: photos, videos & audio from timeline, messages, collection & single posts.
GNU General Public License v3.0

Fansly countermeasures - auth issues/no media/banning #34

Open lordoffools opened 3 months ago

lordoffools commented 3 months ago

Bug Description

Since being forced to upgrade today to v0.8.19, I am now getting the following error:

 Info | 13:55 || Inspecting most recent Timeline cursor ... [CID: xxx]
 ERROR | 13:55 || Unexpected error during Timeline download: Traceback (most recent call last):
  File "/fansly-ng/download/timeline.py", line 64, in download_timeline
    all_media_ids = get_unique_media_ids(timeline)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/fansly-ng/download/common.py", line 45, in get_unique_media_ids
    account_media_ids = [
                        ^
  File "/fansly-ng/download/common.py", line 46, in <listcomp>
    media['id']
    ~^^^^^^
TypeError: 'NoneType' object is not subscriptable

Press <ENTER> to attempt to continue ...

Environment Information

User Research

I have done the following:

lordoffools commented 3 months ago

As a sidenote, can we disable the force upgrade check at the start?

lordoffools commented 3 months ago

For context, I was not able to reproduce this error in 0.8.17.

lordoffools commented 3 months ago

As a sidenote, can we disable the force upgrade check at the start?

Answered my own question by returning True here: https://github.com/prof79/fansly-downloader-ng/blob/dcd5f2930f9bfcbac849d4b7b0065384e1f1ea8b/updater/utils.py#L267

lordoffools commented 3 months ago

I can confirm that going back to v0.8.17 (with a fresh install) works and I cannot reproduce this error in that older version.

lordoffools commented 3 months ago

I can confirm that going back to v0.8.17 (with a fresh install) works and I cannot reproduce this error in that older version.

I take it back. It now reproduces consistently regardless of version.

FYI, I was able to perform multiple full scrapes/downloads earlier this morning, and late last night, without encountering this issue (across multiple creators). It only started to occur once I upgraded.

ZincStoat commented 3 months ago

Same error here, nothing to add that @lordoffools hasn't already mentioned except that it's not just them having the issue.

Oh, I'm on Windows not macOS, otherwise the same.

prof79 commented 3 months ago

Such a crap, can't I just have a nice day once

And btw, I will rephrase that to "recommended". A while ago I disarmed any interference caused by version deviations, so the upgrade notice is just a message and the downloader continues to work.


prof79 commented 3 months ago

And from all evidence, this must be another ingenious API change on Fansly's part since 0.8.17 exhibits this error as well. The contributions did not touch anything regarding the downloads themselves, and after the v0.8.19 builds I did a test run myself, downloading all deltas of all my creators.

ZincStoat commented 3 months ago

And from all evidence, this must be another ingenious API change on Fansly's part

Again. Fantastic, great, what was broken that they had to "fix" it? Hope you can come up with a solution. I'll leave it with you.

lordoffools commented 3 months ago

Such a crap, can't I just have a nice day once

And btw, I will rephrase that to "recommended". A while ago I disarmed any interference caused by version deviations, so the upgrade notice is just a message and the downloader continues to work.


I hear you. :(

Thanks so much for jumping on this.

prof79 commented 3 months ago

First thing I see is that messages are not broken. But timeline requests report success while coming back totally empty 🤦


Logging in and grabbing a fresh token and user agent changes nothing, but from within the browser their own code/API calls work.

Maybe it is some anti-downloader measure again, like the rate-limiting last year. There are also a lot of fansly-* headers; they have existed for quite some time, but I could never deduce how to generate them.

prof79 commented 3 months ago

Seems some of those extra header values are mandatory now, suspect some kind of WAF or whatever.

Without fansly-session-id and fansly-client-ts you get a response with all stuff nulled. Figuring them out is the grand prize ...
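
A nulled response like this would explain the TypeError in the original report: the list is still there, but its entries are null, so indexing one fails. A minimal defensive sketch, assuming the media entries arrive as a list of dicts that may contain None; collect_media_ids is a hypothetical helper for illustration, not the project's get_unique_media_ids:

```python
from typing import Any, Iterable, Optional


def collect_media_ids(
    media_items: Optional[Iterable[Optional[dict[str, Any]]]],
) -> list[str]:
    """Return unique media IDs in first-seen order, skipping nulled entries."""
    seen: set[str] = set()
    ids: list[str] = []
    for media in media_items or []:
        if not media:
            # Entry was nulled (or empty) by the server; nothing to collect.
            continue
        media_id = media.get('id')
        if media_id is None or media_id in seen:
            continue
        seen.add(media_id)
        ids.append(media_id)
    return ids
```

This only avoids the crash; an empty result would still mean that the request was not accepted as a genuine one.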

ZincStoat commented 3 months ago

Seems some of those extra header values are mandatory now, suspect some kind of WAF or whatever.

Oh yeah, when I was trying to analyse an earlier issue which fortunately you got fixed before I could figure it out, I noticed a lot of Cloudflare style stuff going on. Apparently that wasn't the issue then, but it is now. Hope you, and by extension we, win that prize.

prof79 commented 3 months ago

I hope I've tackled many of the problems, but I couldn't figure out yet how to properly authenticate to the WebSocket server to get the session ID ... and now I'm exhausted ...

e2489 commented 3 months ago

I hope I've tackled many of the problems, but I couldn't figure out yet how to properly authenticate to the WebSocket server to get the session ID ... and now I'm exhausted ...

Really appreciate your work on this!

prof79 commented 3 months ago

Thanks all! Hope it turns out a win - now I got temporarily banned from Fansly's device service 😂

vajdao commented 3 months ago

Let us know if we can help with anything in the meantime :'D

TheMissingPort commented 3 months ago

What's going on?

prof79 commented 3 months ago

@Avnsx thanks a lot and kudos! Gladly I could figure that out at the weekend from the obfuscated JavaScript code as well. Porting the mathematical/digest funcs (cyrb53) was harder than it looks at first glance because JavaScript and Python work differently: >>> in JS does something different from >> in Python, so I had to make sure the math was restricted to signed 32-bit integers, which Python with its unbounded integers natively doesn't care about. There is a StackOverflow post with test values from the inventor of this digest. In the Postmaster session some days ago fs-client-check did not seem mandatory as of now, but I guess it will become so.

Also I had to hunt through the JS and DevTools more because two major questions remained: where to get the device ID from and where the session ID, the session ID being the most elusive. The device ID was easier to spot because there is an API endpoint for it, but you should cache it due to rate-limiting. For session ID it is necessary to introduce a new beast of WebSocket calls to retrieve the session ID based on the auth token. But you may not call this blindly or you get rejected. The original Fansly code does even more, like a ping-pong messaging scheme with the WebSocket server, which I did not care to implement for now.

The source code for v0.9.0 is already up; building is giving some trouble but is hopefully fixed before I go to bed. Maybe I need to re-tag as v0.9.1 or something.
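
For readers wondering what the 32-bit issue looks like in practice, here is a minimal sketch of the two JavaScript primitives such a port has to emulate: the unsigned right shift (>>>) and Math.imul. The helper names to_int32, urshift and imul are made up for illustration and are not taken from the downloader's code; the digest itself is deliberately left out.

```python
MASK32 = 0xFFFFFFFF


def to_int32(value: int) -> int:
    """Wrap an unbounded Python int to a signed 32-bit integer, as JS does."""
    value &= MASK32
    return value - 0x100000000 if value & 0x80000000 else value


def urshift(value: int, bits: int) -> int:
    """Emulate JavaScript's unsigned right shift: value >>> bits."""
    # JS first converts the operand to an unsigned 32-bit integer and
    # only uses the low five bits of the shift count.
    return (value & MASK32) >> (bits & 31)


def imul(a: int, b: int) -> int:
    """Emulate Math.imul: 32-bit multiplication with a signed 32-bit result."""
    return to_int32((a & MASK32) * (b & MASK32))
```

In plain Python, -1 >> 28 stays -1, whereas urshift(-1, 28) gives 15, which is what -1 >>> 28 yields in JavaScript. Likewise imul(0xffffffff, 5) gives -5 instead of the unbounded product.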

Avnsx commented 3 months ago

For session ID it is necessary to introduce a new beast of WebSocket calls to retrieve the session ID based on the auth token

I am kinda confused by this statement and also I haven't looked through your whole codebase for fansly-downloader-ng.

I see that you still have plyvel-ci as a requirement in requirements.txt and use it in /config/browser.py. But then why are you trying to receive the session ID from a websocket and the device ID from another request, when you could literally just gather them the same way that browser.py (using plyvel-ci) is able to get session_active_session?

If you're not dependent on plyvel-ci anymore, maybe you should remove the requirement, because builds are only available up to Python 3.11 and not for 3.12 yet: https://github.com/liviaerxin/plyvel/releases

Also I think it's kinda funny how a huge company like fansly, has developers which just google hashing functions and then copy paste them into their own code, without even giving any credit / source to the original author lol. But this also means that most likely they're not going to be editing those hashing functions at all and it's literally going to just stay pasted. So maybe they'll only be changing the checkKey_ variable, from time to time

prof79 commented 3 months ago

Have fun ^^

https://github.com/prof79/fansly-downloader-ng/releases/tag/v0.9.1

prof79 commented 3 months ago

@Avnsx tbh I have never really concerned myself with plyvel-ci and its capabilities, I took it from your existing code base. Good to know. What I tried to mimic is the web conversation flow seen from DevTools and JS, which is browser-independent. BTW I assume session IDs will change frequently, and if I understand correctly plyvel-ci needs full (offline) access to the files, so the browser must be closed. That's nice for reading a long-term token (unless you log out), but killing the browser session on every downloader start is probably not; also consider headless/scheduled usage.

ZincStoat commented 3 months ago

Have fun ^^

https://github.com/prof79/fansly-downloader-ng/releases/tag/v0.9.1

Success! Okay mate you can take a break now, you earned it.

Avnsx commented 3 months ago

Whereas checkKey_ is currently always a static pre-defined value called Qindoj-mitci1-fevtev, which is most likely going to keep being changed in the future, to bother people that are trying to externally use the API, every time it's changed you would need to update your code to reflect the same new value for the key.

Surprise surprise @prof79, they changed the checkKey_ to negwij-zyZnek-wavje1 🤣

They're declaring war, this is kinda funny hahaha

If I was you I would start vibe checking them.

Tl;dr because main.js is a script that's forcefully required to load on every page you open on fansly (because it handles 80% of how the website's backend works client-side in your browser), it can be loaded entirely without an authentication / session token, so I would simply start with hard-setting a request to https://fansly.com/main.b9be8b0a95953668.js and just regexing the most recent value for checkKey_.

Most likely they'll soon end up changing the name of the main.js file, which would evidently mean that the URL path on their website also changes, in an effort to counter your request & regex to it. In this case you could use any of the many options you have to imitate a JavaScript backend in Python, which would allow you to always figure out the newest version and location of the main.js script and forever be able to regex the newest key for checkKey_
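
A rough sketch of the two regex steps described above, kept free of network calls so it can be tried on saved page source. Both patterns and both function names are assumptions made for illustration; the real markup and the minified variable names can change at any time, which is exactly the weakness discussed in this thread.

```python
import re
from typing import Optional

# Hypothetical patterns, based only on the file name and key format quoted above.
MAIN_JS_PATTERN = re.compile(r'\bmain\.[0-9a-f]+\.js\b')
CHECK_KEY_PATTERN = re.compile(r'''this\.checkKey_\s*=\s*["']([^"']+)["']''')


def find_main_js_name(html: str) -> Optional[str]:
    """Return the first main.<hash>.js file name referenced in the HTML."""
    match = MAIN_JS_PATTERN.search(html)
    return match.group(0) if match else None


def extract_check_key(js_source: str) -> Optional[str]:
    """Return the literal assigned to this.checkKey_, or None if not found."""
    match = CHECK_KEY_PATTERN.search(js_source)
    return match.group(1) if match else None
```

Returning None instead of raising lets a caller fall back to a configured key when either pattern stops matching.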

The thing is, both sides (fansly devs vs you as a developer of a open source scraper) can spiral forever in this in many many ways, which is the exact reason I didn't bother to continue maintaining my own version of the scraper, simply don't have time for stuff like this and right now I'm just amazed to see how fansly is doing and it's like a little riddle solving game to me, which is why I am still commenting here 😊

session IDs will change frequently

And regarding your comment about plyvel-ci, if I remember correctly the session ID is assigned to a user after the first authenticated request to the fansly server, which basically hard-binds the session ID to an authentication token; if you use the same authentication token, it's very likely for the session ID to also stay the same. But don't quote me on this, you'll have to test these things for yourself.

Regardless, since fansly seems to be bothered that your version of the scraper still exists, I would still fall back on the plyvel-ci method, as that's a completely uncounterable / fool-proof strategy for obtaining the session ID & device ID (which I still think are both static, lasting values) every time, as it takes them directly from the browser storage, and fansly NEEDS to reveal the tokens to the browser, else none of its users would be able to access the website, or at least they would have to log back in every time they start browsing the website again.

Then again the way you're saying these things, suggests you've already managed to avoid the usage of plyvel-ci, but like just keep it in the back of your head, that instead of struggling with listening to websockets & doing another request for device-id, you could've literally just in /config/browser.py done something like:

session_id = json.loads(session_active_session).get('id')
device_id = json.loads(device_device_id)

and you would've solved your so called two major questions within like two lines of code :)

Anyways, I'm just giving my supportive 5 cents here, have a great day

xMohawk commented 3 months ago

Just registered an account specially for you guys @prof79 @Avnsx, thank you for your teamwork and hard work on this! Right after the new release 0.9.1 I was able to download all the media for about 10 minutes. After 10 minutes the well-known "no media" message occurred. I just saw the new reply of @Avnsx about the new check key, so that should be a relatively easy solution. Thanks @Avnsx & @prof79 for your research and for maintaining the only working Fansly downloader at the moment!

bigxd123 commented 3 months ago

@Avnsx @prof79 I actually love your dedication and would love to chime in as well.

But it really seems like Fansly is specifically targeting this scraper as well now. I received warning emails about using 3rd party tools as well as others that already reported getting banned.

It seems like they break things very fast now which (if they are doing it right) is 5 minutes of work for them and requires a lot more time to figure out what changed again for the maintainers.

Your solution of scraping the source is a great idea, however since they are targeting repos they will very quickly break the regex and just rename their variable names which is equally as fast as changing that key. Not to mention the url you linked already has a checksum in there which likely changes automatically every frontend update anyways due to the service worker doing partial updates.

I think the main thing to consider is that if fansly is actively breaking scrapers, they are guaranteed to also be tracking who uses them and flagging / banning users, which imo is the biggest concern and the one to put the most time into, since that is somewhat of a responsibility as well.

But ye some tips, if you do regex (I guarantee they will try to break this exact regex lol), find the service worker manifest and actually read the correct and current checksum of the src, so it does not immediately break again when they do an update.

Other than that I salute you guys spending so much time, especially while being directly monitored by them, lets hope they don't have a lot of resources to counter this and you guys will find their changes faster than it takes them to break it. I will also try and help but it really seems like they break things immediately, maybe just a coincidence tho.

Bum211 commented 3 months ago

Thank you both for doing this! Was wondering there will be a new release soon? Cheers!

prof79 commented 3 months ago

Since they seem to now also terminate accounts using the downloader maybe I should just give up on this. My work is tempting enough and I do not have the time nor fancy to play daily cat-and-mouse with the Fansly security team in my spare time and I also do not have the resources to reverse and code an exact replica of the original web app. I'm a paying customer for some creators and I am not a pirate, just want to use/browse my lib on VR goggles - I guess they totally lack imagination of use cases. They'll certainly lose me as a paying customer.

I might do a version where people can plug in their own check key retrieved from the sources by whatever means.

Thanks for all the input @Avnsx, however regarding plyvel-ci I have to object - you're sometimes too narrow-minded. There is a French person for example who wrote a Dockerfile and runs the downloader in a container on their NAS. How would you poll/operate a browser when a browser isn't even installed? I've been in IT for almost 24 years, shortcuts sooner or later always come back to haunt you.

Avnsx commented 3 months ago

you're sometimes too narrow-minded.

Bold assumption of you @prof79 . The principle I follow while writing my code (at least for code that I don't necessarily get paid for) is putting in minimum effort, minimum time, and creating code that'll work for a broader mainstream audience. While what you're saying is definitely true, I could not care less if someone can not execute my code on their NAS because they can't run a browser there; my target is the other 98% of average consumers, and every adaptation for any special case beyond that is kind of a waste of time in my book. I think you got to see this first hand, from having to re-write my original scraper code, no? 🤣

prof79 commented 3 months ago

Btw please try v0.9.3 people.

ZincStoat commented 3 months ago

Btw please try v0.9.3 people.

Before I try this is there anything we need to do beforehand? I see references in the changes about ensuring checkKey is correct? Is that something the app will do for you?

prof79 commented 3 months ago

@ZincStoat The app will prompt you in interactive mode to verify the key. As of two hours ago or so what Avnsx found was still correct and the app uses the newer key as a default but that may age so it asks every time to be sure and you can configure it in config.ini then. I'm currently unable to implement a scraping for a dynamically named JS file where the obfuscation and variable naming is prone to change in the near future. So I considered this the best option for now without having to deal in the code and re-compile every time.
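
To illustrate the config.ini route, here is a small sketch of reading such a key with a fallback default, using the standard configparser module. The section name MyAccount, the sample user agent and the function read_check_key are assumptions for the example; only the option name check_key and the key value come from this thread.

```python
import configparser

# Sample contents; the section name is an assumption, adjust it to your file.
SAMPLE_CONFIG = """
[MyAccount]
user_agent = ExampleAgent/1.0
check_key = negwij-zyZnek-wavje1
"""


def read_check_key(config_text: str, default: str) -> str:
    """Return check_key from the config text, or the default if absent/empty."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    value = parser.get('MyAccount', 'check_key', fallback=default)
    return value.strip() or default
```

Disabling interpolation keeps any % characters in a user agent or key from being interpreted by configparser.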

ZincStoat commented 3 months ago

@ZincStoat The app will prompt you in interactive mode to verify the key. As of two hours ago or so what Avnsx found was still correct and the app uses the newer key as a default but that may age so it asks every time to be sure and you can configure it in config.ini then.

Gotcha, okay I can see where to find that. Simple option is to view page source, find a link to main.whatever.js, click on that then find the value of checkKey. I think I can live with that. Thanks again for your perseverance.

Joly0 commented 3 months ago

Hm, i am running headless with this command python3.11 fansly_downloader_ng.py -ni -npox and my config.ini has this line check_key = negwij-zyZnek-wavje1 under the "myaccount" section underneath "user_agent" but i get this error every time i try to run it:

WARNING | 02:17 || Make sure, checking the main.js sources of the Fansly homepage, 
                    that the `this.checkKey_` value is identical to this 
                    (text within the single quotes only): `negwij-zyZnek-wavje1`

Press <ENTER> to attempt to continue ...
 ERROR | 02:17 || An unexpected error occurred: EOF when reading a line
Traceback (most recent call last):
  File "/fansly-downloader-ng/fansly_downloader_ng.py", line 198, in <module>
    exit_code = main(config)
                ^^^^^^^^^^^^
  File "/fansly-downloader-ng/fansly_downloader_ng.py", line 108, in main
    validate_adjust_config(config)
  File "/fansly-downloader-ng/config/validation.py", line 427, in validate_adjust_config
    validate_adjust_check_key(config)
  File "/fansly-downloader-ng/config/validation.py", line 340, in validate_adjust_check_key
    input_enter_continue()
  File "/fansly-downloader-ng/textio/textio.py", line 98, in input_enter_continue
    input('\nPress <ENTER> to attempt to continue ...')
EOFError: EOF when reading a line

Exiting in 15 seconds ...

ZincStoat commented 3 months ago

Gotcha, okay I can see where to find that. Simple option is to view page source, find a link to main.whatever.js, click on that then find the value of checkKey. I think I can live with that. Thanks again for your perseverance.

They nearly got me this morning, the key had changed by just one letter, zyZnek to zyZnak. I can see I'm going to need to check before each run. I only use it when I know there's stuff to download, usually once a day or so, so not that big a chore, just need to be vigilant.

ZincStoat commented 3 months ago

@ZincStoat The app will prompt you in interactive mode to verify the key. As of two hours ago or so what Avnsx found was still correct and the app uses the newer key as a default but that may age so it asks every time to be sure and you can configure it in config.ini then. I'm currently unable to implement a scraping for a dynamically named JS file where the obfuscation and variable naming is prone to change in the near future. So I considered this the best option for now without having to deal in the code and re-compile every time.

Because I'm lazy (?) I just spent the whole evening working out how to code a C# solution to fetch the value of checkKey_ and optionally update config.ini, but the program keeps putting the default value back. Why would that be? The only difference I can see between the file my updater writes back, and what fansly-downloader rewrites is check_key itself.

prof79 commented 3 months ago

@Joly0 sorry my bad with the caution message see #43
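
The EOFError in the headless run above comes from calling input() when nobody can answer. A minimal sketch of one way to guard such a prompt; this is not necessarily how #43 was fixed, and the interactive parameter is a hypothetical stand-in for whatever the non-interactive flag sets:

```python
import sys


def input_enter_continue(interactive: bool) -> bool:
    """Pause for ENTER only when a user can answer; return True if we paused."""
    if not interactive or not sys.stdin.isatty():
        # Headless, piped or scheduled run: do not block on a prompt.
        return False
    try:
        input('\nPress <ENTER> to attempt to continue ...')
    except EOFError:
        # stdin was closed while waiting; treat it like a non-interactive run.
        return False
    return True
```

With a guard like this a container or cron run would log the warning and carry on instead of aborting.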

prof79 commented 3 months ago

@ZincStoat The app will prompt you in interactive mode to verify the key. As of two hours ago or so what Avnsx found was still correct and the app uses the newer key as a default but that may age so it asks every time to be sure and you can configure it in config.ini then. I'm currently unable to implement a scraping for a dynamically named JS file where the obfuscation and variable naming is prone to change in the near future. So I considered this the best option for now without having to deal in the code and re-compile every time.

Because I'm lazy (?) I just spent the whole evening working out how to code a C# solution to fetch the value of checkKey_ and optionally update config.ini, but the program keeps putting the default value back. Why would that be? The only difference I can see between the file my updater writes back, and what fansly-downloader rewrites is check_key itself.

Because I screwed it up 😖 I should code that in; with some distance it should be pretty easy, I had an epiphany. Them renaming the variable is the catch, but I also have an idea regarding that.

ZincStoat commented 3 months ago

Because I screwed it up 😖 I should code that in; with some distance it should be pretty easy, I had an epiphany. Them renaming the variable is the catch, but I also have an idea regarding that.

Fair enough. Things got a bit exciting there for a while. The program's working at least, just with some quirks. I'll leave it with you.

prof79 commented 3 months ago

@Joly0 @ZincStoat few mins and v0.9.6 fixing headless and including automatic check key retrieval 😊 Also, loading the key from config.ini is fixed but will be overwritten from the web if found there. User only gets asked to check if Internet retrieval failed. Patterns to locate JS file in HTML and key in JS are configurable outside the program.
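
The precedence described here (web value wins, config.ini next, user only asked when retrieval failed) could be sketched as below. resolve_check_key and its parameters are hypothetical names for illustration, not the v0.9.6 implementation:

```python
from typing import Optional, Tuple


def resolve_check_key(
    web_key: Optional[str],
    config_key: Optional[str],
    default_key: str,
) -> Tuple[str, bool]:
    """Return (key, needs_user_check) following web > config > default."""
    if web_key:
        # Retrieval from the website worked; no need to bother the user.
        return web_key, False
    # Retrieval failed: use what we have and ask the user to verify it.
    return (config_key or default_key), True
```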

prof79 commented 3 months ago

https://github.com/prof79/fansly-downloader-ng/releases/tag/v0.9.6

fl4shforward commented 3 months ago

https://github.com/prof79/fansly-downloader-ng/releases/tag/v0.9.6

Testing it rn, for your info, there seems to be an issue with the version check in headless too. I forgot to build a new docker image lol.

  ███████╗ █████╗ ███╗   ██╗███████╗██╗  ██╗   ██╗    ███╗   ██╗███████╗     █████╗ ██████╗ ██████╗ 
  ██╔════╝██╔══██╗████╗  ██║██╔════╝██║  ╚██╗ ██╔╝    ████╗  ██║██╔════╝    ██╔══██╗██╔══██╗██╔══██╗
  █████╗  ███████║██╔██╗ ██║███████╗██║   ╚████╔╝     ██╔██╗ ██║██║ ███╗    ███████║██████╔╝██████╔╝
  ██╔══╝  ██╔══██║██║╚██╗██║╚════██║██║    ╚██╔╝      ██║╚██╗██║██║  ██║    ██╔══██║██╔═══╝ ██╔═══╝ 
  ██║     ██║  ██║██║ ╚████║███████║███████╗██║       ██║ ╚████║███████║    ██║  ██║██║     ██║     
  ╚═╝     ╚═╝  ╚═╝╚═╝  ╚═══╝╚══════╝╚══════╝╚═╝       ╚═╝  ╚═══╝╚══════╝    ╚═╝  ╚═╝╚═╝     ╚═╝     
                        developed on github.com/prof79/fansly-downloader-ng
                                               v0.9.3
 Info | 20:41 || Reading config.ini file ...
 WARNING | 20:41 || A new version of Fansly Downloader NG has been found on GitHub - update recommended.
 ERROR | 20:41 || An unexpected error occurred: 'release_version'
Traceback (most recent call last):
  File "/usr/src/fansly-ng/fansly_downloader_ng.py", line 198, in <module>
    exit_code = main(config)
                ^^^^^^^^^^^^
  File "/usr/src/fansly-ng/fansly_downloader_ng.py", line 106, in main
    self_update(config)
  File "/usr/src/fansly-ng/updater/__init__.py", line 23, in self_update
    check_for_update(config)
  File "/usr/src/fansly-ng/updater/utils.py", line 268, in check_for_update
    return perform_update(config.program_version, new_release)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/fansly-ng/updater/utils.py", line 107, in perform_update
    f"\n{16*' '} Version: {release_info['release_version']}"
                           ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
KeyError: 'release_version'

[EDIT]

Seems to work fine on my side. Will have to check if I get banned in the following days.

prof79 commented 3 months ago

@fl4shforward need to look but careful - heat is now on, they have changed their code mins (!!!!) after v0.9.6 and made it dynamic!!!!

Heat is on now also @ZincStoat @Joly0 et al

heClear_=0,this.checkKey_=["fyszis","qybZy9"].reverse().join("-")+"-bybxyf",th

Now the check key is a JavaScript expression dynamically evaluated!
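
For this one specific shape of expression, the value can still be worked out without a JavaScript engine. The sketch below handles only the pattern shown above (an array of string literals, reversed, joined, plus a suffix) and would break as soon as the expression changes form; the names are made up for illustration:

```python
import re
from typing import Optional

# Matches only: this.checkKey_=[...].reverse().join("<sep>")+"<suffix>"
EXPR_PATTERN = re.compile(
    r'''this\.checkKey_\s*=\s*\[(?P<items>[^\]]*)\]'''
    r'''\.reverse\(\)\.join\(\s*["'](?P<sep>[^"']*)["']\s*\)'''
    r'''\s*\+\s*["'](?P<suffix>[^"']*)["']'''
)
STRING_PATTERN = re.compile(r'''["']([^"']*)["']''')


def evaluate_check_key(js_source: str) -> Optional[str]:
    """Evaluate the reverse/join/concat expression, or return None."""
    match = EXPR_PATTERN.search(js_source)
    if match is None:
        return None
    parts = STRING_PATTERN.findall(match.group('items'))
    parts.reverse()  # mirrors Array.prototype.reverse()
    return match.group('sep').join(parts) + match.group('suffix')
```

Applied to the snippet above, this yields qybZy9-fyszis-bybxyf.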

bigxd123 commented 3 months ago

@fl4shforward need to look but careful - heat is now on, they have changed their code mins (!!!!) after v0.9.6 and made it dynamic!!!!

Heat is on now also @ZincStoat @Joly0 et al

heClear_=0,this.checkKey_=["fyszis","qybZy9"].reverse().join("-")+"-bybxyf",th

Now the check key is a JavaScript expression dynamically evaluated!

You think they actually need this key to detect tools or is that just to keep you and other maintainers busy?

prof79 commented 3 months ago

@bigxd123 well this was definitely a measure to stop automatic key retrieval from their homepage. I'll certainly not even try to tinker with some sandboxed JavaScript evaluator/parser that, if even possible, would extract this in the clear. For a human with basic coding knowledge it is relatively easy, but regular people probably won't know what to do and how.

The key in itself is a means to produce a hash over some essential parts like the URL path so they can tell valid requests from within their web app from invalid/forged ones.

bigxd123 commented 3 months ago

@prof79 I see, idk if it helps but for some reason I now get logged out randomly after using it. Hopefully not connected to it though.

I will wait until it blows over again. I kinda get why they don't like scrapers because of all the leakers, but it also hurts legit users. Probably also a challenge for them.

prof79 commented 3 months ago

@bigxd123 Are you using the new key? Or has it changed now on a minute-basis? (-> v0.9.7)

Btw that underlines the reason why I had not wanted to do the key scrape in the first place - I'm sure they can do even more in terms of obfuscation or other techniques. At least I brushed up my knowledge of the Python re module ;-)

prof79 commented 3 months ago

And, btw, @bigxd123 and others, seems like they have changed something else - doesn't seem to work for me any longer

bigxd123 commented 3 months ago

And, btw, @bigxd123 and others, seems like they have changed something else - doesn't seem to work for me any longer

@prof79 are you sure they didn't just log you out as well? Mine also didn't work on the actual website until I relogged.

bigxd123 commented 3 months ago

I had to copy my authorisation token again, then it worked again. It still works but I'm scared to continue, at least today.