scrazzz / redgifs

Simple Python API wrapper for the RedGIFs API
https://redgifs.rtfd.io
MIT License

`data.total` is not always correct while using `Order.best` #25

Closed acctfpn closed 1 year ago

acctfpn commented 1 year ago

Summary

data.total can be bigger than the actual number of gifs.

Reproduction Steps

Has only failed me so far with the user and sorting from the script below.

Uncomment one of the Test 1/2/3 lines and run

from redgifs import API, Order

api = API()
api.login()

# Search for the user
USERNAME = 'jenovakitty1'

# Test 1 (Works)
# data = api.search_creator(USERNAME, order=Order.new, count=100)

# Test 2 (Fails)
# Output:
# Downloaded 37 out of 157
# total_gifs: 37
# data.total: 157
# total_pages: 2
data = api.search_creator(USERNAME, order=Order.best, count=100)

# Test 3 (Fails)
# Output: (only 31 gifs this time, this user has 6 images so must be related to that)
# Downloaded 31 out of 157
# total_gifs: 31
# data.total: 157
# total_pages: 2
#data = api.search_creator(USERNAME, order=Order.best, count=80)

total_pages = data.pages
current_page = data.page

# This is the total gifs in the `current_page`
total_gifs = data.gifs

while True:
    for i, gifs in enumerate(total_gifs, start=1):
        try:
            # We do the downloading here.
            # Make sure you have a folder called "downloads" in the current directory or else make a new one.
            # api.download(gifs.urls.hd, f'downloads/{i}.mp4')

            # Print a message to keep track of the downloads
            print(f'Downloaded {i} out of {data.total}')

        except Exception as e:
            raise Exception(f'An error occurred while downloading:\n{e}') from e

    # Update the current page number
    current_page += 1
    if current_page <= total_pages:
        print(f'total_gifs: {str(len(total_gifs))}')
        print(f'data.total: {str(data.total)}')
        print(f'total_pages: {str(total_pages)}')
        # Clear the old gifs from the previous page
        total_gifs.clear()
        # Make a new API call to get the gifs from the next page
        data = api.search_creator(USERNAME, page=current_page)
        # Update `total_gifs` with the new gifs
        total_gifs.extend(data.gifs)
    else:
        break

print('Completed!')

Expected Result

All tests detect 37 gifs in one page and finish correctly.

Actual Result

Test 2 and 3 detect 157 instead of 37 total gifs and fail as a result.

System Information

Checklist

Additional Information

No response

scrazzz commented 1 year ago

Good catch, I haven't documented what data.total is so I don't know what that value actually represents.

Will check on this and update you soon™.

scrazzz commented 1 year ago

Okay, so after looking up the user in your example, I see there are only 37 posts (31 GIFs and 6 images) on that user's profile.

There are only four orders for sorting posts on the website: Latest, Oldest, Trending, and Top.

  1. The default order is Latest, which shows all 37 posts.
  2. Switching the order to Oldest shows the 37 posts with the oldest post first (i.e., ordered oldest -> latest).
  3. Switching the order to Trending shows no posts (I assume there's a minimum score/quota for a post to be considered trending).
  4. Switching the order to Top shows 36 posts (31 GIFs and 5 images). I'm not sure why the remaining post isn't counted as Top; I assume there's a quota for being considered a top post too.

Now, coming back to your issue: the data.total value may not represent the total number of posts every time, since it depends on which Order you use.

If you want to get all the GIFs, you'll have to iterate through total_pages and download each GIF (total_gifs) until you reach the last page. Since you're on 1.7.2, I'm not sure which Order would work best for you. If you upgrade to 1.8.0, you can use Order.latest and get the exact number of posts.

I've decided to remove the other Order enums in the next version and keep only the ones available on the website. I'll also be adding a new example that shows how to download all the GIFs from a user's profile.
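As a rough illustration of "iterate until the last page" without depending on data.total, here is a minimal sketch. It is not the library's official example: `iter_all_gifs` and `fetch_page` are hypothetical names, and the only assumption is that each result exposes `.gifs` and `.pages`, as the script in the report uses them.

```python
from typing import Any, Callable, Iterator


def iter_all_gifs(fetch_page: Callable[[int], Any]) -> Iterator[Any]:
    """Yield every item from every page, using `pages` as the stop condition.

    `fetch_page` is a hypothetical callable: it takes a 1-based page number
    and returns an object with `.gifs` (a list) and `.pages` (an int).
    The reported `total` is deliberately never consulted.
    """
    page = 1
    while True:
        result = fetch_page(page)
        yield from result.gifs
        # Stop on the last reported page, or early if a page comes back empty.
        if page >= result.pages or not result.gifs:
            break
        page += 1


# Possible usage with the wrapper (assumes `api` and USERNAME are set up as
# in the script from the report):
#
# gifs = list(iter_all_gifs(
#     lambda p: api.search_creator(USERNAME, page=p, order=Order.new, count=100)
# ))
# print(f'Found {len(gifs)} gifs')
```

Note that the lambda passes the same order and count on every call. The script in the report passes them only for the first page, so later pages fall back to the defaults.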

acctfpn commented 1 year ago

> Switching order to Top shows 36 posts

I think this is incorrect. It shows 37 for me on the webpage, but I have not tested it with the API. My understanding is that Top works as it does on Reddit: it shows every post, even low-rated ones, sorted from top to bottom.

Which sorting in the API corresponds to Top on the website? I assumed it was Best, but there's also Top28, which I haven't tried.

acctfpn commented 1 year ago

> Now coming back to your issue. The data.total value may not represent the total posts every time since it depends on what Order you use.

Is data.total calculated, or just returned as-is from the RedGIFs API? If the latter, I guess I should not rely on it being accurate and/or should catch any errors.

I wonder why it gives such a high number when the user does not even have that many posts. The order should not affect that, imo; if anything, it should decrease the number of posts (as with Trending), not increase it.

In any case, I'll probably wait for the 1.8.0 release and, if this still happens with any sorting, implement error catching.

scrazzz commented 1 year ago

> Is data.total calculated or just returned as is from the redgifs API?

It's the data from the API. I don't do any calculations with it.

> I wonder why it gives such a high number when the user does not even have that amount of posts.

It's RedGIFs's fault. They are known to make changes like this in production, causing issues. But then again, I really can't blame them, because they don't like third-party libraries using their internal APIs. Anyhow, for the time being you shouldn't rely on it.
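One way to avoid relying on the reported value is to count what was actually received and only warn when the two disagree. This is just a sketch under that assumption; `reconcile_total` is a hypothetical helper and is not part of the redgifs library.

```python
import warnings
from typing import Optional, Sequence


def reconcile_total(items: Sequence, reported_total: Optional[int]) -> int:
    """Return the number of items actually received.

    Emits a RuntimeWarning when the API-reported total disagrees with the
    real count, instead of trusting the reported value or failing outright.
    """
    actual = len(items)
    if reported_total is not None and reported_total != actual:
        warnings.warn(
            f"API reported total={reported_total}, but {actual} items were received",
            RuntimeWarning,
            stacklevel=2,
        )
    return actual
```

With the numbers from the report, `reconcile_total(all_gifs, data.total)` would return 37 and warn about the reported 157, so progress messages can use the returned count instead of data.total.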

scrazzz commented 1 year ago

> I think this is incorrect. It shows 37 for me on the webpage but I have not tested it with the API.

Maybe the post had gained enough score by the time you checked, which got it into the Top category...?

> Which sorting in the api corresponds to Top in the web? I assumed it was Best, but there's also Top28 which I haven't tried.

The Top order is new; it used to be called Top28 in the API. I'll have to check and update the enums again before releasing 1.8.0.

acctfpn commented 1 year ago

I should implement error catching then. Can't complain if it's an internal API, but to be fair, I wouldn't be here if their webpage worked minimally well on mobile.

> Maybe the post got enough score later on when you checked which made it into the Top category...?

Didn't think of that; maybe you're right.

> The Top order is new, it used to be called Top28 in the API.

I see, thanks for the help.

scrazzz commented 1 year ago

Whoops, closed the wrong issue

scrazzz commented 1 year ago

There really isn't much to do here on my side tbh. It's the API's fault for returning false data. Use the first order (Order.new) to download all the GIFs since it works for you.

I'm still debating whether to drop support for existing orders like top28, best, etc., since they work through the API but aren't publicly available through the website UI...