Voldrix / onlyfans-dl-2

OnlyFans content downloader v2
GNU General Public License v3.0

Messages with count over 200 do not save #25

Closed Starwalker98 closed 3 years ago

Starwalker98 commented 3 years ago

Hi, I have encountered something interesting. In a chat where I have over 200 messages, the script says that only 200 were found, and then 0 new were downloaded. I can provide more info if needed.

Voldrix commented 3 years ago

I can't reproduce the error, as I don't have that many messages, but I'll see if I can figure it out. Just a couple of diagnostic questions... Are you using the latest commit? Was the script able to download messages for this profile previously? Are you using the content type subfolders option (where messages would have their own folder)? Are you able to download messages from other profiles? Out of those 200+ messages, how many of them have media attached? Only the media gets downloaded.

Starwalker98 commented 3 years ago

Hi,

Yep, latest commit. Yes, up until it said "200 messages found", it was working fine. Yes, for me messages have their own folder.

Other profiles are fine, though I don't have close to 200 messages anywhere else.

I counted this roughly via gallery, there should be about 180 photos/videos.

However, even though I receive new messages daily (with photos as well), the script still says 200 found.

Hope this helps

ckanders2020 commented 3 years ago

I am having this same issue with multiple profiles. It reports 200 messages found but doesn't download any of the new pictures or videos.

ZZ1235 commented 3 years ago

I can confirm this issue exists, even with the latest commit.

blmatthews commented 3 years ago

I'm pretty sure I'm seeing this as well. While I don't know how many messages I have, I'm sure it's well over 200, and when I download, it says "Found 200 messages", and it's only downloading 6 photos from messages, while I have a lot more than that, as well as a number of videos. It would be nice to be able to grab everything.

blmatthews commented 3 years ago

Ok, found the problem and came up with a fix (maybe not the best fix, but it works.) The problem is the second condition on line 99:

            if (apiType == 'messages' and list_extend['hasMore'] == False) or len(list_extend) < posts_limit:

For messages, the data structure returned by requests.get(...).json() is a two-element dictionary, with the 'list' element containing the actual messages. That means len(list_extend) counts the dictionary's keys, so len(list_extend) < posts_limit will always be true (as long as posts_limit is > 2), and the code will break out of the while 1 on line 92 after the first extended "page" (messages 100-200) is retrieved. I rearranged the code a bit so messages is handled specially:
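To make the failure concrete, here is a minimal sketch with a made-up response. The 'list' and 'hasMore' keys come from the script; the message contents and posts_limit = 100 are assumptions for illustration only.

```python
# Hypothetical messages response: a dict with two keys, as described above.
posts_limit = 100  # assumed page size
list_extend = {
    "list": [{"id": i} for i in range(posts_limit)],  # one full page of messages
    "hasMore": True,
}

# len() of a dict counts its keys, not the messages inside 'list'.
keys_counted = len(list_extend)               # 2
messages_counted = len(list_extend["list"])   # 100

# The old test sees "2 < 100", treats the full page as a short one, and stops.
old_check_breaks = (not list_extend["hasMore"]) or keys_counted < posts_limit
# Counting the 'list' element keeps the loop going while more pages exist.
new_check_breaks = (not list_extend["hasMore"]) or messages_counted < posts_limit

print(old_check_breaks, new_check_breaks)  # True False
```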

$ diff -c onlyfans-dl.py.orig onlyfans-dl.py
*** onlyfans-dl.py.orig 2021-08-28 15:53:51.000000000 -0700
--- onlyfans-dl.py  2021-08-28 16:12:04.000000000 -0700
***************
*** 94,102 ****
            list_extend = requests.get(API_URL + endpoint, headers=API_HEADER, params=getParams).json()
            if apiType == 'messages':
                list_base['list'].extend(list_extend['list'])
!           else:
!               list_base.extend(list_extend) # Merge with previous posts
!           if (apiType == 'messages' and list_extend['hasMore'] == False) or len(list_extend) < posts_limit:
                break
            if apiType == 'purchased' or apiType == 'subscriptions':
                getParams['offset'] = str(int(getParams['offset']) + posts_limit)
--- 94,105 ----
            list_extend = requests.get(API_URL + endpoint, headers=API_HEADER, params=getParams).json()
            if apiType == 'messages':
                list_base['list'].extend(list_extend['list'])
!               if list_extend['hasMore'] == False or len(list_extend['list']) < posts_limit:
!                   break
!               getParams['offset'] = str(len(list_base['list']))
!               continue
!           list_base.extend(list_extend) # Merge with previous posts
!           if len(list_extend) < posts_limit:
                break
            if apiType == 'purchased' or apiType == 'subscriptions':
                getParams['offset'] = str(int(getParams['offset']) + posts_limit)

I didn't do extensive testing. For one thing, I only subscribe to one profile. But I did have it download everything again, and the only difference was the addition of photos and videos it had missed before because it was only looking in the first 200 messages. Also, I sent a new message, then stepped through the script in a debugger and verified that the new message was present and was the last message in the list (since the list is sorted ascending by date). As it was the 572nd, it wouldn't have shown up without the changes.
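For anyone who wants to check the paging logic without an account, here is a rough sketch of the patched loop run against a fake endpoint. fake_get_messages and fetch_all_messages are hypothetical names, not part of the script, and the fake only mimics the 'list'/'hasMore' shape described above; the real API may behave differently.

```python
def fake_get_messages(offset, limit, total=572):
    """Hypothetical stand-in for the messages endpoint (not the real API)."""
    page = list(range(offset, min(offset + limit, total)))
    return {"list": page, "hasMore": offset + limit < total}


def fetch_all_messages(limit=100, total=572):
    """Collect every message by paging until the API reports no more."""
    list_base = fake_get_messages(0, limit, total)
    if not list_base["hasMore"]:
        return list_base["list"]
    while True:
        # Offset is the number of messages collected so far.
        offset = len(list_base["list"])
        list_extend = fake_get_messages(offset, limit, total)
        list_base["list"].extend(list_extend["list"])
        # Stop on the last page: count the 'list' element, not the dict.
        if not list_extend["hasMore"] or len(list_extend["list"]) < limit:
            break
    return list_base["list"]


messages = fetch_all_messages()
print(len(messages), messages[-1])
```

With 572 fake messages the loop should walk through all six pages and end on the short one, instead of stopping at 200.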

One thing I just noticed while writing this, I used:

                getParams['offset'] = str(len(list_base['list']))

while later in the code it uses:

                getParams['offset'] = str(int(getParams['offset']) + posts_limit)

I'd argue mine is more correct, in that it uses the actual count of items retrieved so far. In practice the two give the same value, because the code bails out of the while 1 loop as soon as len(list_extend) < posts_limit. That does mean the second way of setting getParams['offset'] depends on that test, while the first doesn't. Regardless, they should probably be the same in the actual committed code, if only to avoid confusing people.
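A small sketch of the difference, assuming a page size of 100 and a starting offset of 0 (the offsets helper is made up for this comparison):

```python
posts_limit = 100  # assumed page size


def offsets(page_sizes):
    """Return (counted, assumed) offsets after each page of the given sizes."""
    counted, assumed, history = 0, 0, []
    for size in page_sizes:
        counted += size          # like str(len(list_base['list']))
        assumed += posts_limit   # like str(int(getParams['offset']) + posts_limit)
        history.append((counted, assumed))
    return history


full_pages = offsets([100, 100, 100])  # every page is full: both methods agree
short_page = offsets([100, 100, 72])   # last page is short: the methods diverge

print(full_pages)
print(short_page)
```

The two only disagree after a short page, which is exactly the point where the loop has already exited, so the choice matters for readability more than for behavior.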

Voldrix commented 3 years ago

Thanks for identifying the issue. I can't test this myself as I don't have that many messages, but you are right about needing to count the list section of the array. I implemented your change, so if anyone still encounters this issue, let me know. As for consistency between the two ways of computing the offset, I think the reason I initially used this method was that counting the really large arrays took a lot more time than just adding the assumed value. I don't remember how big the difference was, but I think I'll leave it the quicker way unless it becomes an issue.

blmatthews commented 3 years ago

Thanks for applying the change. I'm kind of surprised len(array) isn't just constant time, but I don't know anything about the internals of Python, so there's probably some reason I'm not thinking of. Anyways, if OF behaves, offset = offset + limit should work fine. Thanks again!
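A quick way to check the len() question is to time it on lists of very different sizes. As far as I know, CPython lists keep track of their own size, so the timings would be expected to come out roughly equal, but treat this as a sketch rather than a benchmark. time_len is a made-up helper, and the sizes and repeat count are arbitrary.

```python
import timeit


def time_len(size, repeats=100_000):
    """Time `repeats` calls of len() on a list holding `size` items."""
    data = list(range(size))
    return timeit.timeit(lambda: len(data), number=repeats)


# Compare a small list against one a thousand times larger.
timings = {size: time_len(size) for size in (1_000, 1_000_000)}
for size, seconds in timings.items():
    print(f"len() on {size:>9,} items: {seconds:.4f}s for 100,000 calls")
```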